Journal of Cognitive Human-Computer Interaction

Journal DOI

https://doi.org/10.54216/JCHCI

ISSN (Online): 2771-1463 | ISSN (Print): 2771-1471
Full Length Article

Volume 3, Issue 1, PP: 36-41, 2022

An Approach for Devising Stenography Application Using Cross Modal Attention

Shanthalakshmi M 1*, Susmita Mishra 2, LincyJemina S 3, Raashmi P 4, Mannuru Shalin 5, Jananeee.v 6

  • 1 Rajalakshmi Engineering College, Panimalar Institute of Technology, India - (shanthalakshmi.m@rajalakshmi.edu.in)
  • 2 Rajalakshmi Engineering College, Panimalar Institute of Technology, India - (susmitamishra12@gmail.com)
  • 3 Rajalakshmi Engineering College, Panimalar Institute of Technology, India - (lincypit@gmail.com)
  • 4 Rajalakshmi Engineering College, Panimalar Institute of Technology, India - (raashmi.p.2018.cse@rajalakshmi.edu.in)
  • 5 Rajalakshmi Engineering College, Panimalar Institute of Technology, India - (mannuru.shalini.2018.cse@rajalakshmi.edu.in)
  • 6 Rajalakshmi Engineering College, Panimalar Institute of Technology, India - (jananee.v@rajalakshmi.edu.in)
  • DOI: https://doi.org/10.54216/JCHCI.030105

    Received: January 15, 2022 Accepted: May 26, 2022
    Abstract

    This paper presents a solution for converting speech directly to shorthand. Shorthand is widely used for writing quick transcripts but is understood by few, so a product is developed that converts speech to its corresponding Gregg shorthand. A website serving as the front end uses a speech-to-text API to capture speech in real time. The converted text is then fed into a text-to-image retrieval model that retrieves the Gregg shorthand corresponding to the text. The result is then displayed to the user in real time. In this way, the model reduces the dependence on stenographers for transcribing scripts. The resulting model achieves good results.
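
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes the speech-to-text stage has already produced a transcript, and it stands in for the trained cross-modal model with hand-made toy vectors in a shared embedding space. The embedding values, the image file names, and the function names `retrieve_shorthand` and `transcript_to_shorthand` are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy embeddings in a shared 3-dimensional space. A trained
# cross-modal model would produce these; the values here are made up.
TEXT_EMBEDDINGS = {
    "can": [0.9, 0.1, 0.0],
    "go": [0.1, 0.9, 0.1],
    "will": [0.0, 0.2, 0.9],
}

# Hypothetical image embeddings keyed by placeholder file names.
IMAGE_EMBEDDINGS = {
    "gregg_can.png": [0.8, 0.2, 0.1],
    "gregg_go.png": [0.2, 0.8, 0.0],
    "gregg_will.png": [0.1, 0.1, 0.95],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_shorthand(word):
    """Return the image name most similar to the word, or None if unknown."""
    query = TEXT_EMBEDDINGS.get(word.lower())
    if query is None:
        return None
    return max(
        IMAGE_EMBEDDINGS,
        key=lambda name: cosine(query, IMAGE_EMBEDDINGS[name]),
    )


def transcript_to_shorthand(transcript):
    """Map each whitespace-separated word of a transcript to an image name."""
    return [retrieve_shorthand(word) for word in transcript.split()]
```

Calling `transcript_to_shorthand("Can go")` looks up each word independently. A real system would also need a policy for words with no embedding, which this sketch simply maps to `None`.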

    Keywords:

    Devising Stenography, Cross Modal Attention, speech shorthand, speech conversion

    Cite This Article As:
    Shanthalakshmi M, Susmita Mishra, LincyJemina S, Raashmi P, Mannuru Shalin, Jananeee.v. "An Approach for Devising Stenography Application Using Cross Modal Attention." Journal of Cognitive Human-Computer Interaction, Vol. 3, No. 1, 2022, pp. 36-41 (DOI: https://doi.org/10.54216/JCHCI.030105)