Full Length Article
Volume 1, Issue 1, pp. 21-29, 2021


Data Mining Algorithms for Kidney Disease Stages Prediction

Author Names : Abdelrahim Koura 1*, Hany S. Elnashar 2

1 Affiliation : Computer Science Dept., Faculty of Computers and Artificial Intelligence, Beni-Suef University, Egypt

    Email :  

2 Affiliation : Faculty of Computers and Artificial Intelligence, Beni-Suef University, Egypt

    Email :  hsoliman@fcis.bsu.edu.eg

DOI : 10.5281/zenodo.3686791

Abstract :

Chronic kidney disease is one of the most common health problems associated with serious complications. Early detection and treatment can keep it from progressing. Machine learning is one tool that uses historical data to improve future decisions, in this case the prediction of chronic kidney disease. The aim of this work is to compare the performance of six different models based on accuracy, sensitivity, specificity, precision, and recall. In this study, the experiments were conducted on 158 records downloaded from the UCI repository. Six algorithms (K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, Logistic Regression, Decision Tree, and Random Forest) were applied to the data after a preprocessing stage. In the evaluation, Naïve Bayes and Random Forest each achieved 100% accuracy, 100% sensitivity, 100% specificity, 100% precision, and 100% recall. It is concluded that Naïve Bayes and Random Forest perform better than the other models.
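The evaluation measures named in the abstract can all be derived from the four cells of a binary confusion matrix. The sketch below is illustrative only: the function name `binary_metrics` and the toy labels are assumptions made for this example, not the authors' code or data. It also makes explicit that sensitivity and recall are the same quantity, TP / (TP + FN).

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute common evaluation measures for a binary classifier.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "recall": sensitivity,  # recall is another name for sensitivity
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Toy labels (1 = disease, 0 = no disease), made up for this example.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A model that scores 100% on all five measures, as reported for Naïve Bayes and Random Forest, corresponds to a confusion matrix with no false positives and no false negatives on the evaluated records.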

Keywords :

Data mining; Kidney Disease (KD); Feed Forward Neural Network; Levenberg-Marquardt; Multi-Layer Perceptron; Particle Swarm Optimization.
