Volume 1, Issue 2, pp. 72-79, 2020 | Full Length Article
Ibrahim El-Henawy, Marwa Abo-Elazm
Arabic is a phonetically complex language, and building an accurate speech recognition system for it is a challenging task. The phonetic dictionary is an essential component of an automatic speech recognition (ASR) system. Pronunciation variation in Arabic is substantial and has been widely investigated using data-driven or knowledge-based approaches. Phonological rules are used to derive the pronunciation of each word accurately, reducing the mismatch between the actual phoneme representation of the spoken words and the ASR dictionary. Several studies on Arabic ASR systems have used different numbers of phonological rules. In this paper, we focus on the rules that handle within-word and cross-word pronunciation variation. The experimental results indicate that handling within-word pronunciation variation with phonological rules does not enhance recognition performance, whereas using these rules to handle cross-word variation provides good performance.
Speech Recognition Systems, Arabic Language, Phonetic Dictionary, Pronunciation Variations