Volume 5 • Issue 1 • PP: 37–42 • 2026
A Systematic Review of AI-Powered Uzbek Short-Answer Grading Using NLP and Teacher-Annotated Datasets
Abstract
This paper presents a systematic literature review (SLR) of AI-powered automated short-answer grading, with a particular focus on low-resource languages such as Uzbek. The review follows the PRISMA 2020 guidelines to ensure transparency and methodological rigor. Peer-reviewed studies published between 2018 and 2025 were identified across multiple academic databases, then screened and analyzed. In total, 33 studies were included in the final synthesis. The reviewed literature indicates that transformer-based models, including mBERT and XLM-R, generally achieve stronger performance than traditional machine learning approaches, while recent large language models show potential in few-shot and zero-shot grading scenarios. The findings also highlight that the limited availability of teacher-annotated datasets remains a major challenge for developing reliable automated grading systems in low-resource educational contexts.