Volume 10, Issue 2, pp. 20-31, 2025 | Full Length Article
Noor Razzaq Abbas 1,*, Ghassan AL-Thabhawee 2, Isam Bahaa Aldallal 3, Mostafa Abotaleb 4, Klodian Dhoska 5
DOI: https://doi.org/10.54216/JAIM.100202
This study applies hierarchical clustering to global COVID-19 data through January 2025 in order to partition countries into principal clusters based on epidemiological criteria. The variables examined were total cases, deaths, recoveries, active cases, tests, population, and per-million rates; these were standardized and then analyzed using agglomerative hierarchical clustering with Ward linkage. The evaluation yielded an average silhouette score of 0.385 (38.5%), a Davies–Bouldin index of 0.87, and a Calinski–Harabasz index of 77.6, indicating that the clusters are validly separated. Dendrograms and PCA projections were used to visualize the results and identified four clusters, reflecting differences in the severity of COVID-19 impacts and responses. The analysis showed that the high-burden clusters accounted for almost 45% of global deaths, while the low-burden clusters covered more than 40% of nations, which had fewer than 100,000 cumulative cases. The results demonstrate the use of hierarchical clustering as an unsupervised learning approach for analyzing epidemiological data and provide quantitative estimates to support comparative public health interventions across communities.
Hierarchical Clustering, COVID-19, Agglomerative Clustering, Silhouette Score, PCA, Epidemiology, Machine Learning, Unsupervised Learning, Cluster Analysis, Global Health
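The pipeline summarized in the abstract (standardization, Ward-linkage agglomerative clustering, three validity indices, and a PCA projection) can be sketched roughly as follows. This is a minimal illustration in Python with scikit-learn, not the authors' implementation: the function name `cluster_countries`, the synthetic feature matrix, and the seven-column layout are assumptions made only so the example runs on its own, and the scores it prints will not match the values reported in the article.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler


def cluster_countries(features, n_clusters=4):
    """Standardize, cluster with Ward linkage, and score the partition.

    `features` is a (countries x indicators) array. The function name and
    signature are hypothetical, chosen for this sketch only.
    """
    # Put every indicator on a zero-mean, unit-variance scale so that
    # large-magnitude columns (e.g. population) do not dominate distances.
    scaled = StandardScaler().fit_transform(features)

    # Agglomerative clustering with Ward linkage, cut at n_clusters groups.
    model = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    labels = model.fit_predict(scaled)

    # The three internal validity indices named in the abstract.
    scores = {
        "silhouette": silhouette_score(scaled, labels),
        "davies_bouldin": davies_bouldin_score(scaled, labels),
        "calinski_harabasz": calinski_harabasz_score(scaled, labels),
    }

    # Two-dimensional PCA coordinates for plotting the clusters.
    coords = PCA(n_components=2).fit_transform(scaled)
    return labels, scores, coords


# Synthetic stand-in data (an assumption): 120 "countries", 7 indicators,
# drawn around 4 artificial centers. Replace with the real dataset.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(4, 7))
features = np.vstack([c + rng.normal(size=(30, 7)) for c in centers])

labels, scores, coords = cluster_countries(features)
print(scores)
```

Higher silhouette and Calinski–Harabasz values and lower Davies–Bouldin values indicate better-separated clusters, so the three indices are typically read together when choosing the number of clusters.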