Volume 14, Issue 2, pp. 260-273, 2025 | Full Length Article
Yahia Menassel 1*, Rashiq Rafiq Marie 2, Faycel Abbas 3, Abdeljalil Gattal 4, Mohammed Al-Sarem 5
DOI: https://doi.org/10.54216/JISIoT.140220
Script identification is crucial for document analysis and optical character recognition (OCR). This study proposes YafNet, a novel convolutional neural network (CNN) architecture, developed from scratch, to tackle the challenges of script identification in both handwritten and printed word images. YafNet dynamically weights features, enabling it to learn and combine multimodal features for accurate script identification. To evaluate its efficacy, we use the imbalanced ICDAR 2021 Script Identification in the Wild (SIW 2021) competition dataset. Experimental results demonstrate that YafNet outperforms conventional approaches, particularly when trained on mixed handwritten and printed data. It achieves high classification accuracy, balanced accuracy, and ROC AUC scores, indicating its robustness and generalizability. The incorporation of data augmentation and external data further enhances performance, underscoring the model's potential for real-world applications.
Script identification, YafNet, CNN, Imbalanced dataset
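The abstract reports balanced accuracy alongside plain accuracy because the evaluation dataset is imbalanced. The following is a minimal sketch of why the two metrics can diverge, assuming the common definition of balanced accuracy as the mean of per-class recall. The labels are made-up toy data, not drawn from SIW 2021, and the function names are illustrative rather than part of the authors' code.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    hits = defaultdict(int)    # correctly classified samples per true class
    totals = defaultdict(int)  # total samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 8 samples of a majority class, 2 of a minority class.
y_true = ["majority"] * 8 + ["minority"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["majority"] * 10

print(accuracy(y_true, y_pred))           # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

In this toy case the classifier never recognizes the minority class, yet plain accuracy still looks respectable. Balanced accuracy exposes the failure, which is the usual reason for reporting it on imbalanced data.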