Forecasting the Spread of Dengue Outbreaks with a Synthesis of Machine Learning Models Utilizing Exogenous Variables
DOI:
https://doi.org/10.37266/ISER.2025v12i1.pp13-28
Keywords:
Dengue Fever, Forecasting, Exogenous Variables, Stagnant Water, Machine Learning
Abstract
Dengue fever, a viral mosquito-borne disease, affects four billion people worldwide, posing economic and health burdens. No antiviral drugs exist to treat dengue infections, so patients must rely solely on palliative treatment. Forecasting dengue cases ahead of future epidemics can help public health officials implement mitigation efforts. The purpose of this study was to develop a machine learning model that forecasts the incidence of dengue outbreaks temporally and geographically using eco-climatic and socioeconomic factors. Monthly dengue case, precipitation, temperature, and socioeconomic datasets from seven countries (2014 to 2023) were preprocessed, and a principal component analysis was then performed. A novel topographical feature applied to the model was stagnant water, a critical breeding ground for mosquitoes. Ridge regression was used to manage multicollinearity within the data before the data were applied to the seasonal autoregressive integrated moving average with exogenous variables (SARIMAX) model, which accounts for the seasonality of the variables being examined. Overall, the forecasting algorithm accurately predicted dengue incidence at least six months in advance, with a mean absolute error of 2.420e-6. When the stagnant water feature was removed from the datasets, prediction accuracy over the same six-month horizon decreased significantly, demonstrating the feature's importance for forecasting dengue. This algorithm can therefore assist public health officials in planning proactive measures that reduce economic stress and dengue transmission, thus improving the quality of life in dengue-endemic countries.
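The feature-ablation comparison described in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: a plain closed-form ridge regression stands in for the PCA, ridge, and SARIMAX sequence, and every series below is synthetic and generated by hypothetical formulas chosen only so the example is deterministic. The names `ridge_fit`, `holdout_mae`, and `make_row`, the coefficients, and the penalty `alpha=0.01` are all assumptions made for illustration. The sketch holds out the final six months, fits once with and once without a stagnant-water column, and compares the mean absolute error of the two fits.

```python
import math


def ridge_fit(X, y, alpha):
    """Solve the ridge normal equations (X^T X + alpha*I) w = X^T y.

    For brevity the penalty also applies to the intercept column.
    """
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * target for row, target in zip(X, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(A[i][j] * w[j] for j in range(i + 1, p))
        w[i] = (b[i] - tail) / A[i][i]
    return w


def predict(X, w):
    return [sum(x * coef for x, coef in zip(row, w)) for row in X]


def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


MONTHS = 120   # ten years of monthly points, mirroring the 2014-2023 span
HORIZON = 6    # forecast horizon in months, as in the abstract


def make_row(t):
    """Synthetic (hypothetical) standardized features for month t."""
    rain = math.sin(2 * math.pi * t / 12)                     # seasonal rainfall
    temp = 0.9 * rain + 0.1 * math.cos(2 * math.pi * t / 12)  # collinear with rain
    water = ((7 * t) % 10) / 10                               # stagnant-water index
    return [1.0, rain, temp, water]                           # leading 1.0 = intercept


rows = [make_row(t) for t in range(MONTHS)]
noise = [0.01 * (((13 * t) % 7) - 3) for t in range(MONTHS)]  # small deterministic noise
# Synthetic incidence index; the coefficients are arbitrary illustration values.
cases = [1.0 + 0.8 * r[1] + 0.5 * r[2] + 2.0 * r[3] + e
         for r, e in zip(rows, noise)]


def holdout_mae(feature_rows, targets, alpha=0.01, horizon=HORIZON):
    """Fit on all but the last `horizon` months and score on the held-out months."""
    train_X, test_X = feature_rows[:-horizon], feature_rows[-horizon:]
    train_y, test_y = targets[:-horizon], targets[-horizon:]
    w = ridge_fit(train_X, train_y, alpha)
    return mean_absolute_error(test_y, predict(test_X, w))


mae_full = holdout_mae(rows, cases)                        # with stagnant water
mae_ablated = holdout_mae([r[:3] for r in rows], cases)    # stagnant water removed

print(f"MAE with stagnant water:    {mae_full:.4f}")
print(f"MAE without stagnant water: {mae_ablated:.4f}")
```

Because the synthetic incidence series is constructed to depend on the stagnant-water column, the ablated fit scores worse here by design; the numbers say nothing about the study's data. In a setting closer to the abstract, the ridge step would be followed by a seasonal time-series model with exogenous regressors rather than used as the forecaster itself.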