Volume 19, Issue 2, pp. 341-366, 2025 | Full Length Article
Sudhirvarma Sagiraju 1, Jnyana Ranjan Mohanty 2, Anima Naik 3*
Doi: https://doi.org/10.54216/FPA.190225
Accurate disease prediction is essential for enabling preventive healthcare and reducing the burden of chronic illnesses. This study introduces a multi-disease prediction framework that combines machine learning with optimization techniques to address the limited precision and scope of prior research. Specifically, we predict five major diseases (diabetes, heart disease, kidney disease, liver disease, and breast cancer) by using the Social Group Optimization (SGO) algorithm to tune the hyperparameters of a Random Forest (RF) classifier. The proposed SGO-optimized RF model tunes seven parameters simultaneously: n_estimators, max_depth, min_samples_split, min_samples_leaf, max_features, bootstrap, and criterion. Applied to the five disease datasets, the model achieves accuracies of 98.25% for breast cancer, 84.62% for liver disease, 93.44% for heart disease, 82.47% for diabetes, and 100% for chronic kidney disease. On average, the SGO-optimized RF outperforms existing methods with a 2.3% accuracy improvement across datasets. These results highlight the potential of SGO-based optimization to improve the accuracy and reliability of predictive models, and they indicate that the framework is applicable in clinical scenarios, where precise and actionable predictions support early diagnosis and improve patient outcomes.
SGO, Random forest, Accuracy, Hyperparameters, Healthcare, Chronic disease prediction
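The abstract names the seven Random Forest hyperparameters that SGO tunes together but gives no search ranges or algorithm settings. The following is a minimal sketch, not the authors' implementation, of how such a search could be organised. It assumes a basic SGO loop as commonly described (an improving phase followed by an acquiring phase, with greedy acceptance). The search ranges, the self-introspection factor `c = 0.2`, the population size, and the helper names `decode`, `pick`, `clip`, and `sgo` are all illustrative assumptions.

```python
import random

# Hypothetical search ranges: the abstract lists the seven parameters
# but does not state their bounds.
INT_RANGES = {
    "n_estimators": (10, 500),
    "max_depth": (2, 30),
    "min_samples_split": (2, 20),
    "min_samples_leaf": (1, 10),
}
MAX_FEATURES = ["sqrt", "log2", None]
CRITERIA = ["gini", "entropy"]
DIM = 7  # one coordinate per hyperparameter


def pick(options, u):
    """Map u in [0, 1] to one of the options."""
    return options[min(int(u * len(options)), len(options) - 1)]


def decode(x):
    """Turn a point in [0, 1]^7 into a Random Forest hyperparameter dict."""
    params = {}
    for u, (name, (lo, hi)) in zip(x, INT_RANGES.items()):
        params[name] = lo + round(u * (hi - lo))
    params["max_features"] = pick(MAX_FEATURES, x[4])
    params["bootstrap"] = x[5] >= 0.5
    params["criterion"] = pick(CRITERIA, x[6])
    return params


def clip(x):
    """Keep every coordinate inside [0, 1]."""
    return [min(1.0, max(0.0, v)) for v in x]


def sgo(fitness, pop_size=10, iterations=20, c=0.2, seed=0):
    """Maximise fitness(params) with a basic SGO loop; returns (params, score)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(DIM)] for _ in range(pop_size)]
    scores = [fitness(decode(x)) for x in pop]

    def try_replace(i, candidate):
        candidate = clip(candidate)
        s = fitness(decode(candidate))
        if s > scores[i]:  # greedy acceptance: keep only improvements
            pop[i], scores[i] = candidate, s

    for _ in range(iterations):
        # Improving phase: each person moves toward the current best person.
        best = list(pop[scores.index(max(scores))])
        for i in range(pop_size):
            x = pop[i]
            try_replace(i, [c * x[j] + rng.random() * (best[j] - x[j])
                            for j in range(DIM)])
        # Acquiring phase: each person interacts with a random peer and the best.
        best = list(pop[scores.index(max(scores))])
        for i in range(pop_size):
            k = rng.choice([p for p in range(pop_size) if p != i])
            x, y = pop[i], pop[k]
            # Move away from a worse peer, toward a better one.
            sign = 1.0 if scores[i] > scores[k] else -1.0
            try_replace(i, [
                x[j] + sign * rng.random() * (x[j] - y[j])
                + rng.random() * (best[j] - x[j])
                for j in range(DIM)
            ])
    top = scores.index(max(scores))
    return decode(pop[top]), scores[top]
```

In practice, `fitness` would train and score a classifier, for example the mean of scikit-learn's `cross_val_score` for `RandomForestClassifier(**params)` on one disease dataset. It is kept as a plain callable here so the sketch runs without external dependencies.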