AI-powered data engineering: How machine learning is revolutionizing ETL and data pipelines

Tandon, Yaman (2025) AI-powered data engineering: How machine learning is revolutionizing ETL and data pipelines. World Journal of Advanced Engineering Technology and Sciences, 15 (3). pp. 118-125. ISSN 2582-8266

Article PDF
WJAETS-2025-0858.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

The integration of artificial intelligence into data engineering processes represents a paradigm shift in how organizations manage, process, and derive value from their data assets. This comprehensive technical review examines the transformative impact of machine learning on traditional Extract, Transform, Load (ETL) workflows and data pipelines. From intelligent data extraction that leverages natural language processing and computer vision, through adaptive transformation logic, to smart loading optimization, AI enhances every stage of the data engineering lifecycle. Advanced anomaly detection and automated quality control mechanisms enable proactive identification of issues before they impact downstream systems. Reinforcement learning algorithms optimize resource allocation, while self-tuning pipelines continuously refine operational parameters without human intervention. Despite significant benefits, organizations face substantial implementation challenges, including explainability limitations, skills gaps, legacy system integration, and governance considerations. The emerging landscape features knowledge graphs for semantic understanding, generative AI for pipeline creation, and cross-organizational data fabrics with embedded intelligence. Together, these innovations blur the traditional boundaries between the data engineering and data science disciplines.

Item Type: Article
Official URL: https://doi.org/10.30574/wjaets.2025.15.3.0858
Uncontrolled Keywords: AI-Powered Data Engineering; Intelligent ETL Automation; Anomaly Detection; Self-Tuning Pipelines; Explainable AI
Depositing User: Editor Engineering Section
Date Deposited: 16 Aug 2025 12:46
URI: https://eprint.scholarsrepository.com/id/eprint/4371