Full Length Article
Journal of Intelligent Systems and Internet of Things
Volume 12, Issue 1, PP: 08-19, 2024

Title

Outlier Management and its Impact on Diabetes Prediction: A Voting Ensemble Study

S. Phani Praveen 1, Kotte Sandeep 2, N. Raghavendra Sai 3, Aditi Sharma 4 *, Jitendra Pandey 5, Vikas Chouhan 6

1  Department of CSE, PVP Siddhartha Institute of Technology, Vijayawada, A.P, India
    (phani.0713@gmail.com)

2  Department of IT, Dhanekula Institute of Engineering & Technology, Vijayawada, A.P, India
    (kottesandeep@gmail.com)

3  Department of CSE, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India
    (nallagatlaraghavendra@gmail.com)

4  Department of Computer Science and Engineering, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India; IEEE Senior Member, Symbiosis International (Deemed University), Pune, India
    (aditi.sharma@ieee.org)

5  Middle East College, Knowledge Oasis Muscat, Oman
    (jitendra@mec.edu.om)

6  Canadian Institute for Cybersecurity, University of New Brunswick, Canada
    (vikas.chouhan@unb.ca)


DOI: https://doi.org/10.54216/JISIoT.120101

Received: August 17, 2023 Revised: November 11, 2023 Accepted: February 11, 2024

Abstract :

Diabetes mellitus is a chronic metabolic disorder characterized by hyperglycemia, and it poses a significant threat to health worldwide. It is broadly classified into two types, Type 1 and Type 2, each with distinct causes and treatment approaches. Early detection and accurate prediction are essential for effective disease management, and machine learning and data mining are increasingly important tools for this purpose. This study examines the accuracy of machine learning models, specifically Voting Ensembles, for predicting diabetes. The Voting Ensemble, which combines LightGBM, XGBoost, and AdaBoost, is tuned using GridSearchCV and evaluated both with and without Interquartile Range (IQR) pre-processing for outlier management. A comparative performance analysis illustrates the benefits of outlier management. According to the findings, the Voting Ensemble with IQR pre-processing achieves higher accuracy, precision, and AUC score, making it better suited to diabetes prediction. Nevertheless, the approach without IQR remains a workable and reasonable alternative. The study highlights both the significance of outlier management in healthcare analytics and the effect of data preparation on the accuracy of prediction models.
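For readers unfamiliar with IQR-based outlier handling, the idea can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the abstract does not state the fence multiplier or whether outlying records are removed or capped, so the conventional 1.5 multiplier, row removal, the function names `iqr_bounds` and `drop_outlier_rows`, and the toy records are all assumptions.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional choice and an assumption here.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_rows(rows, columns, k=1.5):
    """Keep rows whose value lies inside the fences for every listed column."""
    bounds = {c: iqr_bounds([r[c] for r in rows], k) for c in columns}
    return [
        r for r in rows
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]


# Hypothetical toy records; the column names are illustrative only.
records = [
    {"glucose": 85, "bmi": 22.0},
    {"glucose": 90, "bmi": 24.5},
    {"glucose": 95, "bmi": 26.0},
    {"glucose": 100, "bmi": 27.5},
    {"glucose": 105, "bmi": 29.0},
    {"glucose": 400, "bmi": 28.0},  # extreme glucose reading
]

cleaned = drop_outlier_rows(records, ["glucose", "bmi"])
print(len(records), "->", len(cleaned))
```

In a comparison like the one described above, the same classifier would be trained once on the original records and once on the filtered ones, so that any difference in accuracy, precision, or AUC can be attributed to the outlier step.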

Keywords :

GridSearchCV; Interquartile Range; Voting Ensemble model; LightGBM; XGBoost; AdaBoost.

