Volume 7, Issue 2, pp. 100-109, 2022 | Full Length Article
Praveen Singh 1*, Preeti Nagrath 2
DOI: https://doi.org/10.54216/FPA.070204
Understanding human emotions is a major factor in personal development and growth, and it therefore plays an important role in imitating human intelligence. Vocal and sentiment analysis are major focus areas for advancement in Artificial Intelligence (AI). Sentiment analysis helps data analysts at large enterprises measure public opinion, conduct market research, understand customer experience, and monitor brand and product reputation. Emotion recognition provides an opportunity to grasp the general public's sentiments about social events, marketing strategies, political views, and product preferences. The ever-growing need for sentiment analysis coincides with the expansion of social media, such as discussion forums and social networks like Facebook, Twitter, Instagram, and many other similar platforms. In this paper, we apply several AI models to a variety of audio datasets to recognise and analyse the sentiments of the speaker. Our dataset includes audio recordings of songs performed by singers and audio clips of a few actors. We trained CNN and LSTM models on this dataset and evaluated their accuracy.
Vocal Analysis, Sentiment Discernment, Artificial Intelligence, Personal Development