Prediction of PM2.5 and PM10 Concentrations Using XGBoost and LightGBM Algorithms: A Case Study in Lima, Peru
DOI:
https://doi.org/10.26439/interfases2024.n020.7417
Keywords:
air pollution, air quality, meteorological data, machine learning, XGBoost, LightGBM
Abstract
Air pollution is a major problem that affects both human health and the environment, causing millions of premature deaths worldwide each year and severely degrading the state of the planet. Fine particulate matter is especially hazardous because the particles penetrate deep into the lungs and cause serious health problems, including a reduction in life expectancy of more than two years. It is therefore crucial to identify effective ways to monitor the levels of these pollutants in our daily surroundings. This article presents a case study conducted in the district of San Borja, Lima, Peru, in which prediction models for PM2.5 and PM10 were implemented using the XGBoost and LightGBM algorithms. Using data from the SENAMHI portal and a correlation analysis of the variables, two scenarios were developed for training the models. In scenario 1, the PM2.5 and PM10 models were trained using all available meteorological and pollution variables. In scenario 2, the PM2.5 model was trained without the PM10 variable, and the PM10 model without the PM2.5 variable. The results showed that both algorithms achieved high accuracy, as measured by the coefficient of determination, with no statistically significant difference indicating the superiority of either one. Furthermore, the analysis of the proposed scenarios revealed that excluding key variables can result in significantly less accurate predictions, potentially undermining the effectiveness of environmental management strategies.
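The two training scenarios and the accuracy metric described in the abstract can be illustrated with a minimal sketch. The column names below are hypothetical placeholders, since the abstract does not list the actual SENAMHI variables, and the regressor itself (XGBoost or LightGBM) is left out: the sketch only shows which predictors each scenario would keep and how the coefficient of determination is computed from predictions.

```python
# Hypothetical column names: the abstract does not list the actual
# SENAMHI variables, so these are placeholders for illustration only.
METEO_COLUMNS = ["temperature", "humidity", "wind_speed"]
POLLUTANT_COLUMNS = ["PM2.5", "PM10", "NO2"]
TARGETS = ("PM2.5", "PM10")


def scenario_features(target, scenario):
    """Return the predictor columns for a target under scenario 1 or 2.

    Scenario 1 keeps every available variable except the target itself.
    Scenario 2 additionally drops the other particulate variable
    (PM10 when predicting PM2.5, and vice versa).
    """
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target!r}")
    if scenario not in (1, 2):
        raise ValueError(f"unknown scenario: {scenario!r}")

    features = [c for c in METEO_COLUMNS + POLLUTANT_COLUMNS if c != target]
    if scenario == 2:
        other = "PM10" if target == "PM2.5" else "PM2.5"
        features = [c for c in features if c != other]
    return features


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

In a full pipeline, the columns returned by `scenario_features` would select the inputs to a gradient-boosting regressor, and `r_squared` would be applied to its predictions on held-out data; comparing the scores from scenario 1 and scenario 2 shows the effect of excluding the other particulate variable.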
License
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
Last updated 03/05/21
