Journal of Intelligent Systems and Internet of Things

Journal DOI: https://doi.org/10.54216/JISIoT

ISSN (Online): 2690-6791 | ISSN (Print): 2769-786X

Volume 13, Issue 1, pp. 177-195, 2024 | Full Length Article

An Ensemble Machine Learning Method for Analyzing Various Medical Datasets

Chhaya Gupta 1 , Nasib Singh Gill 2 , Priti Maheshwary 3 , Shraddha V. Pandit 4 , Preeti Gulia 5 , Piyush Kumar Pareek 6 *

  • 1 Department of Computer Science and Applications, Maharshi Dayanand University, Rohtak, Haryana, India - (chhaya.rs.dcsa@mdurohtak.ac.in)
  • 2 Department of Computer Science and Applications, Maharshi Dayanand University, Rohtak, Haryana, India - (Nasib.gill@mdurohtak.ac.in)
  • 3 Rabindranath Tagore University, Bhopal, India - (pritimaheshwary@gmail.com)
  • 4 Department of Artificial Intelligence and Data Science, PES Modern College of Engineering, Shivajinagar, Pune, India - (shraddha.pandit@moderncoe.edu.in)
  • 5 Department of Computer Science and Applications, Maharshi Dayanand University, Rohtak, Haryana, India - (preeti@mdurohtak.ac.in)
  • 6 Professor and Head, Department of AIML and IPR Cell, Nitte Meenakshi Institute of Technology, Bengaluru, India - (piyush.kumar@nmit.ac.in)
  • Doi: https://doi.org/10.54216/JISIoT.130114

    Received: September 22, 2023 Revised: January 19, 2024 Accepted: June 13, 2024
    Abstract

    In recent years, machine learning (ML) has had a significant impact on complicated problems across application domains including healthcare, economics, ecology, stock markets, surveillance, and commerce. ML techniques can handle a wide range of data, uncover meaningful relationships, offer insights, and spot trends, and they can improve the accuracy, predictability, performance, and reliability of disease diagnosis. This paper reviews machine learning techniques applied to different medical datasets and proposes an ensemble method to support the early diagnosis of different diseases. The study compares existing machine learning techniques with the proposed ensemble method. The ensemble method uses the AdaBoost algorithm to combine the strengths of decision trees, random forests, and support vector machines. Three feature selection techniques, Fisher’s score, information gain, and a genetic algorithm, are used to select appropriate dataset features. The ensemble method is validated with K-fold cross-validation (k = 15). SMOTE was employed to balance some of the datasets because they were highly imbalanced. All methods in this study are evaluated on accuracy, area under the curve (AUC), recall, precision, and F1-score. The paper uses medical datasets from the University of California, Irvine (UCI) repository and Kaggle to compare the machine learning models with the proposed ensemble method. The results show that the ensemble method outperforms the existing machine learning techniques. The paper also analyzes how machine learning is used in the medical industry, covering established technologies and their impact on medical diagnosis. Because early diagnosis is needed to protect people from deadly diseases, this study proposes an ensemble method that may be used to diagnose different diseases early.
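
Three of the building blocks named in the abstract (Fisher's score for feature ranking, SMOTE-style oversampling, and a 15-fold split) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `fisher_scores`, `smote_like`, and `kfold_indices`, the neighbour count `k_neighbours`, and the fixed seeds are all assumptions made for illustration, and the oversampler is a simplified version of SMOTE rather than a reference implementation.

```python
import math
import random
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter / within-class scatter.

    Hypothetical helper; X is a list of numeric rows, y the class labels.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        overall_mean = fmean([row[j] for row in X])
        between = within = 0.0
        for rows in by_class.values():
            col = [row[j] for row in rows]
            between += len(col) * (fmean(col) - overall_mean) ** 2
            within += len(col) * pvariance(col)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(math.inf if between > 0 else 0.0)
    return scores


def smote_like(minority, n_new, k_neighbours=2, seed=0):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its nearest minority neighbours (assumes at least two minority rows)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        nearest = sorted(others, key=lambda p: math.dist(base, p))[:k_neighbours]
        neighbour = rng.choice(nearest)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def kfold_indices(n_samples, k=15, seed=0):
    """Shuffle sample indices and deal them into k disjoint test folds."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]
```

In a pipeline of this kind, oversampling is normally applied to the training folds only, so that synthetic points never leak into the held-out fold; whether the paper follows that order is not stated in the abstract.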

    Keywords:

    Decision Tree Classifier, Ensemble Classifier, KNN Classifier, Naïve Bayes Classifier, Random Forest Classifier, Synthetic Minority Oversampling Technique


    Cite This Article As :
    Gupta, Chhaya, Gill, Nasib Singh, Maheshwary, Priti, Pandit, Shraddha V., Gulia, Preeti, and Pareek, Piyush Kumar. An Ensemble Machine Learning Method for Analyzing Various Medical Datasets. Journal of Intelligent Systems and Internet of Things, vol. 13, no. 1, 2024, pp. 177-195. DOI: https://doi.org/10.54216/JISIoT.130114