Gajula, Mohan (2025) The evolution of data engineering: From ETL to real-time, AI-driven pipelines. World Journal of Advanced Research and Reviews, 26 (2). pp. 3273-3280. ISSN 2581-9615
WJARR-2025-1824.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
Data engineering has changed dramatically, moving from traditional Extract, Transform, Load (ETL) processes toward real-time, AI-enhanced data pipelines. This article examines that transition, beginning with an assessment of the limitations of conventional ETL before exploring the impact of streaming technologies such as Apache Kafka and Apache Flink. It then turns to cloud-native architectures, which have reshaped data infrastructure through platforms like Snowflake and Databricks, and highlights the growing importance of advanced observability frameworks. The article further investigates how artificial intelligence and automation are altering data engineering practices through self-healing pipelines and intelligent workload management. For organizations navigating this evolving landscape, the analysis offers strategic insights into emerging trends and practical guidance on preparing for the increasingly AI-driven future of data systems.
Item Type: | Article |
---|---|
Official URL: | https://doi.org/10.30574/wjarr.2025.26.2.1824 |
Uncontrolled Keywords: | Data Pipeline Modernization; Real-Time Streaming Architecture; Cloud-Native Data Infrastructure; AI-Driven Automation; Data Observability |
Depositing User: | Editor WJARR |
Date Deposited: | 20 Aug 2025 11:34 |
URI: | https://eprint.scholarsrepository.com/id/eprint/3405 |