A Hybrid Speech Recognition System Using Deep Learning Methods

Hadeel Luhaib Fouad; Husam Ali Abdulmohsin

doi:https://doi.org/10.54216/JISIoT.150109

Hadeel Luhaib Fouad ¹*, Husam Ali Abdulmohsin ²

1 Computer Science Department, College of Science, University of Baghdad, Iraq (Hadeel.Fouad2301@sc.uobaghdad.edu.iq)

2 Computer Science Department, College of Science, University of Baghdad, Iraq (Husam.a@sc.uobaghdad.edu.iq)

Received: June 30, 2024 Revised: September 26, 2024 Accepted: December 25, 2024

Abstract

Speech-to-text conversion is a form of speech recognition that takes audio content as input and transcribes it into written words. As technology advances and large data corpora become available, speech recognition has grown in importance. It is now widely used to operate devices, issue commands, or write without a keyboard, mouse, or buttons, since speaking is easier and more convenient for most people than working by hand. This paper presents a system that converts audio files to text. The proposed system comprises a set of algorithms for processing audio files, in which the Mel-frequency cepstral coefficient (MFCC) algorithm combined with the standard deviation is used to extract the features of an audio file and convert them into an image. The features are stored as images because deep learning algorithms can be trained on images better than on CSV files. The second part of the proposed system is a deep learning model that combines two algorithms, a Convolutional Neural Network (CNN) and a Deep Neural Network (DNN), to predict words. The model consists of a set of layers that extract features from the images and select the best ones, which are then used for training and classification by the proposed DNN model. Three types of datasets (Arabic, English, and Real) were used to test the proposed system on speech prediction, and its accuracy exceeded 95%.

Keywords:

Speech Recognition, Convolutional Neural Network, Deep Neural Network, MFCC

Cite This Article As:

Luhaib, Hadeel. , Ali, Husam. A Hybrid Speech Recognition System Using Deep Learning Methods. Journal of Intelligent Systems and Internet of Things, vol. , no. , 2025, pp. 105-121. DOI: https://doi.org/10.54216/JISIoT.150109
