Sonone, Pranjali Subhash and Khamborkar, Abhay Kamlakar (2025) Integrating statistical and machine learning models for advancing air quality forecasting: A comparative time series analysis. Open Access Research Journal of Science and Technology, 14 (2). 018-028. ISSN 2782-9960
Abstract
Air pollution is a global problem known to cause health effects including skin irritation, respiratory illness, and lung disease. Poor air quality contaminates the environment, and breathing polluted air harms human health. Forecasting air quality is therefore essential so that action can be taken to protect both people and the environment. In this study, air quality data from the Nagpur region were collected and analysed. Several statistical techniques were used to estimate an air quality index (AQI) from the air pollution data, and the resulting models were compared in terms of accuracy and precision. Traditional models (ARIMA, SARIMA, and exponential smoothing) and machine learning techniques (SVM, LSTM, and Prophet) were employed to forecast AQI over time. R-squared, root mean squared error (RMSE), and mean absolute error (MAE) were then used to compare the prediction performance of each model. The results demonstrate that, while conventional models can still capture linear trends and seasonal patterns, machine learning models outperform them in handling non-linear interactions and long-term dependencies. LSTM achieved the highest R-squared value and the lowest RMSE of all the models, giving the best predictive performance. It was also observed that precision can be increased further by considering meteorological factors such as solar radiation, wind direction, wind speed, and relative humidity.
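The three comparison metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the AQI values are made up purely for illustration, and the formulas are the standard textbook definitions of RMSE, MAE, and R-squared.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more heavily."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error: average size of the forecast error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up AQI observations and two hypothetical forecasts (illustration only,
# not data from the study).
actual = [100.0, 120.0, 140.0, 160.0]
forecasts = {
    "model_a": [110.0, 110.0, 150.0, 150.0],
    "model_b": [102.0, 118.0, 142.0, 158.0],
}

for name, predicted in forecasts.items():
    print(name,
          round(rmse(actual, predicted), 3),
          round(mae(actual, predicted), 3),
          round(r_squared(actual, predicted), 3))
```

On these toy numbers the second forecast has the lower RMSE and MAE and the higher R-squared, which is the pattern the abstract reports for the best-performing model.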
Item Type: | Article |
---|---|
Official URL: | https://doi.org/10.53022/oarjst.2025.14.2.0090 |
Uncontrolled Keywords: | Predictive Modelling; Time Series; Machine Learning; Random Forest; Air Quality Index |
Date Deposited: | 01 Sep 2025 14:01 |
URI: | https://eprint.scholarsrepository.com/id/eprint/5395 |