Volume 5, Issue 1, pp. 28–36, 2026 | Full Length Article
Ahmed Abd El-Badie Abd Allah Kamel 1 *
Doi: https://doi.org/10.54216/IJAIET.050103
Identifying students who are at risk of academic failure or course withdrawal at an early stage of their enrolment remains one of the most pressing challenges in higher and distance education. This study assesses the performance of seven machine learning classifiers (Logistic Regression, Decision Tree, Random Forest, Gradient Boosting Decision Tree (GBDT), AdaBoost, Naive Bayes, and Multilayer Perceptron) for early prediction of student risk, using a behavioural and demographic dataset derived from the Open University Learning Analytics Dataset (OULAD). The dataset contains 7895 student records from a single module, described by eight demographic factors and eight Virtual Learning Environment (VLE) usage patterns. All classifiers were evaluated through five-fold stratified cross-validation. The GBDT model achieved the best results, with an AUC-ROC of 0.782 (± 0.003), an accuracy of 0.708 (± 0.005), an F1 score of 0.729 (± 0.006), and a recall of 0.769 (± 0.006). Feature importance analysis showed that late submission count (I = 0.304), total VLE clicks (I = 0.150), and first assessment score (I = 0.135) are the three most valuable predictive indicators, because they capture student engagement patterns that become evident in the VLE traces institutions collect during a student's first module. The results suggest that institutions can apply ensemble methods to data already held in their learning management systems to support timely teaching interventions, without the expense of collecting additional data. The article also presents design considerations for building early warning systems and for the ethical use of predictive analytics in education.
Learning analytics, Student at-risk prediction, Gradient boosting, Ensemble machine learning, Virtual learning environment, Educational data mining, Early warning systems
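The evaluation protocol summarised in the abstract, five-fold stratified cross-validation with each metric reported as a mean with a ± spread, can be sketched in plain Python as follows. This is a minimal illustration and not the authors' pipeline: the synthetic data, the one-rule threshold "model", and the function names `stratified_k_fold`, `binary_metrics`, and `summarise` are all assumptions introduced here. The sketch also assumes the ± values are standard deviations across folds, which the abstract does not state.

```python
import random
import statistics
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror the full set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Return accuracy, recall and F1 for the positive class (1 = at risk)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


def summarise(fold_scores):
    """Collapse per-fold metric dicts into {metric: (mean, std across folds)}."""
    summary = {}
    for name in fold_scores[0]:
        values = [scores[name] for scores in fold_scores]
        summary[name] = (statistics.mean(values), statistics.stdev(values))
    return summary


# Synthetic stand-in for real student records (NOT OULAD data): 1 = at risk.
data_rng = random.Random(42)
labels = [1] * 60 + [0] * 40
late_submissions = [
    data_rng.randint(2, 6) if y == 1 else data_rng.randint(0, 3) for y in labels
]

fold_scores = []
for train_idx, test_idx in stratified_k_fold(labels, k=5, seed=0):
    # Hypothetical one-rule "model": the threshold is fitted on the training
    # fold only, then applied to the held-out fold.
    threshold = statistics.mean([late_submissions[i] for i in train_idx])
    predictions = [1 if late_submissions[i] > threshold else 0 for i in test_idx]
    truth = [labels[i] for i in test_idx]
    fold_scores.append(binary_metrics(truth, predictions))

for name, (mean, std) in summarise(fold_scores).items():
    print(f"{name}: {mean:.3f} (± {std:.3f})")
```

Stratifying the split keeps the at-risk proportion roughly constant in every fold, so per-fold recall and F1 are comparable. In practice a library such as scikit-learn would typically supply both the splitter and the classifiers; the hand-rolled version here only makes the mechanics visible.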