Defect prediction in steel manufacturing is a critical challenge because it affects product quality and safety. This research therefore introduces an intelligent image-fusion approach for accurately predicting defect types and locations in steel materials. Using a U-Net architecture with pretrained ResNet18 encoder layers, our method fuses data from several imaging modalities, supporting precise localization as well as classification of defects. Extensive experimentation and visualization, covering our model's learning curves and comparisons of predicted segmentation masks with ground-truth images, show that the model captures subtle defects well. It exhibits robust performance with little sign of overfitting, accurately identifying flaws while generalizing to unseen data from other sources. These results suggest that our approach can contribute substantially to improving quality control and safety standards in steel production.
DOI: https://doi.org/10.54216/JCHCI.080201
Vol. 8 Issue. 2 PP. 08-15, (2024)
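To make the architecture in the abstract above concrete, here is a minimal sketch of a U-Net with a pretrained ResNet18 encoder. It assumes the segmentation_models_pytorch library and treats multimodal fusion as simple channel stacking; the channel and class counts are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only, not the authors' implementation. Assumes the
# segmentation_models_pytorch library; early fusion is modeled by stacking
# modality channels, and the 4 defect classes are an assumption.
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet18",     # pretrained ResNet18 encoder layers
    encoder_weights="imagenet",  # transfer-learned encoder weights
    in_channels=4,               # fused imaging modalities, stacked as channels
    classes=4,                   # one output mask per defect type (assumption)
)

x = torch.randn(1, 4, 256, 1600)   # dummy fused image; H and W divisible by 32
with torch.no_grad():
    logits = model(x)              # -> (1, 4, 256, 1600) per-class mask logits
print(logits.shape)
```

Channel stacking is the simplest fusion choice; the paper may instead fuse modality-specific encoder features, which this sketch does not attempt.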
Driver monitoring systems have improved over time as artificial intelligence and computer technology have advanced. Several experimental studies have collected real-world driver drowsiness data and used various artificial intelligence algorithms and feature combinations to dramatically improve the real-time effectiveness of these systems. This study presents an updated assessment of the driver drowsiness detection systems implemented over the last decade. In modern automobiles, assessing the driver's cognitive condition is an important aspect of passenger safety. The term "cognitive state" refers to a driver's mental and emotional state, which has a substantial impact on their ability to drive safely. A driver's cognitive state may be altered by factors such as fatigue, distraction, stress, or disability. Intelligent automotive technology may be able to adapt and aid the driver by identifying such conditions in real time, reducing the frequency of accidents. The face, being an integral component of the body, communicates a significant amount of information: facial expressions, such as blinking and yawning patterns, change when a driver is experiencing fatigue.
DOI: https://doi.org/10.54216/JCHCI.080202
Vol. 8 Issue. 2 PP. 16-24, (2024)
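The survey above points to blinking patterns as a primary fatigue cue. A standard building block in such systems is the eye aspect ratio (EAR), which drops toward zero as the eye closes. The sketch below computes it from six eye landmarks; the landmark coordinates and the 0.25 threshold are illustrative assumptions, and production systems obtain landmarks from a face-landmark detector such as dlib or MediaPipe.

```python
# Hypothetical sketch of eye-aspect-ratio (EAR) blink detection, a common
# cue in drowsiness detectors. Landmark values and the threshold are
# illustrative; real systems get landmarks from a face-landmark model.
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """eye: (6, 2) array of landmarks ordered around the eye contour."""
    v1 = np.linalg.norm(eye[1] - eye[5])   # first vertical distance
    v2 = np.linalg.norm(eye[2] - eye[4])   # second vertical distance
    h = np.linalg.norm(eye[0] - eye[3])    # horizontal distance
    return (v1 + v2) / (2.0 * h)

EAR_THRESHOLD = 0.25   # below this, the eye is treated as closed (assumption)

open_eye = np.array([[0, 2], [2, 4], [4, 4], [6, 2], [4, 0], [2, 0]], float)
closed_eye = np.array([[0, 2], [2, 2.4], [4, 2.4], [6, 2], [4, 1.6], [2, 1.6]], float)

for name, eye in [("open", open_eye), ("closed", closed_eye)]:
    ear = eye_aspect_ratio(eye)
    print(f"{name}: EAR={ear:.2f}, closed={'yes' if ear < EAR_THRESHOLD else 'no'}")
```

In a full system, sustained low EAR over many consecutive frames (rather than a single closure) would be the actual drowsiness trigger.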
This paper unveils an advanced chatbot engineered specifically for college-related inquiries. Harnessing the power of BARD and incorporating a wake-word activation system with automatic speech recognition, the chatbot offers an enhanced user experience marked by both linguistic sophistication and spoken-command initiation. The methodology encompasses pre-training on diverse corpora, fine-tuning to optimize responsiveness to college-specific queries, and the seamless integration of intent classification and entity recognition. These facets collectively empower the chatbot to understand and respond effectively to the intricacies of user inputs. A comprehensive knowledge base is curated to ensure accurate information retrieval and to foster deep contextual understanding. This project represents a pioneering step toward an innovative, user-friendly, and ethically driven solution for addressing college-related queries through natural language interactions. By showcasing practical advancements in chatbot technology tailored to the educational domain, this research contributes to the evolving landscape of intelligent virtual assistants.
DOI: https://doi.org/10.54216/JCHCI.080203
Vol. 8 Issue. 2 PP. 25-31, (2024)
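The front end of the pipeline described above (wake word, then speech recognition, then intent classification) can be sketched as a short loop. This sketch uses the SpeechRecognition library's Google Web Speech backend as a stand-in for the paper's ASR; the wake word "hey campus" and the keyword-based intent rules are invented placeholders, and the BARD integration is omitted.

```python
# Hedged sketch of a wake-word + ASR + intent-classification front end.
# The wake word and intent keywords are invented placeholders; the paper's
# BARD response generation would be called after intent classification.
import speech_recognition as sr

WAKE_WORD = "hey campus"   # hypothetical wake word

INTENTS = {                # toy keyword-based intent classifier
    "admission": ["admission", "apply", "enroll"],
    "exam": ["exam", "test", "schedule"],
}

def classify_intent(text: str) -> str:
    text = text.lower()
    for intent, keywords in INTENTS.items():
        if any(k in text for k in keywords):
            return intent
    return "fallback"

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)
    while True:
        audio = recognizer.listen(source)
        try:
            heard = recognizer.recognize_google(audio).lower()
        except sr.UnknownValueError:
            continue                      # unintelligible audio; keep listening
        if WAKE_WORD in heard:
            query = heard.split(WAKE_WORD, 1)[1].strip()
            print("intent:", classify_intent(query))
```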
Music holds significant sway in enriching the lives of individuals, serving as a vital source of entertainment for enthusiasts and listeners alike. Moreover, it transcends mere amusement, often adopting a therapeutic role in people's lives. In the ever-evolving landscape of music and technology, this project focuses on revolutionizing playlist creation. Leveraging technological advancements in music players, such as playback control and genre classification, we replace the laborious manual curation of playlists with automation based on users' emotional states, identified through real-time facial expression analysis via a camera. The human face, a rich source of mood indicators, becomes the key input to our system: by extracting emotional cues directly from facial expressions, the project swiftly deduces the user's emotional state and crafts a tailored playlist without time-consuming manual effort. Implemented through deep learning using the VGG16 model, the system performs fine-grained emotion recognition from image input. Python, OpenCV, and Keras provide seamless video processing and deep learning functionality, complemented by a music player library for smooth playback control. This combination of computer vision and deep learning delivers an interactive music player that dynamically selects tracks aligned with users' real-time emotional expressions, offering a personalized and immersive musical experience.
DOI: https://doi.org/10.54216/JCHCI.080204
Vol. 8 Issue. 2 PP. 32-45, (2024)
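A minimal Keras sketch of the VGG16-based emotion recognizer named in the abstract above: the convolutional base is frozen and a small classification head is trained on face crops. The seven-class emotion set and the 224x224 input size are common conventions from FER-style datasets, assumed here rather than confirmed by the paper.

```python
# Sketch of VGG16 transfer learning for facial-emotion recognition, matching
# the abstract's Keras/VGG16 setup. The 7 emotion classes and 224x224 input
# are common FER-style conventions, assumed rather than taken from the paper.
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                      # freeze the pretrained features

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(7, activation="softmax"),  # e.g. angry..neutral (assumption)
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

At inference time, the predicted emotion label would index into a per-mood playlist, with the chosen track handed off to the music-player library for playback.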
Brain-Computer Interface (BCI) technology stands as a groundbreaking innovation, revolutionizing the way individuals with severe motor disabilities interact with the world. The integration of Electroencephalogram (EEG) sensors within applications like the Brain Keyboard marks a pivotal stride forward. By capturing and interpreting brain signals triggered by simple actions such as eye blinking, these sensors empower users to control a virtual keyboard, transcending the limitations imposed by traditional motor pathways. This direct channel between the human brain and external devices offers an unprecedented avenue for communication, particularly invaluable for those grappling with conditions like paralysis or locked-in syndrome. The profound impact of BCIs extends far beyond facilitating textual communication; they represent a lifeline, a bridge toward autonomy and engagement for individuals facing profound physical challenges. Through these interfaces, users can articulate thoughts, express emotions, and actively participate in social interactions, fundamentally enhancing their quality of life. This technological marvel not only breaks down communication barriers but also holds promise in broader applications. As BCIs evolve, their potential encompasses enabling control over robotic prosthetics, granting users the ability to accomplish tasks once deemed impossible. Moreover, the implications of BCIs stretch into the realm of neuroscience, offering a unique window into understanding cognitive processes and neurological disorders. The ability to decode and interpret brain activity not only aids in facilitating communication but also paves the way for groundbreaking research and potential therapies. Challenges persist, such as enhancing signal accuracy and streamlining usability, yet the remarkable benefits that BCIs offer to individuals with motor disabilities continue to fuel ongoing innovation in this dynamic field. Ultimately, the fusion of EEG sensors, processing units, and user interfaces in BCIs heralds a new era of inclusivity and empowerment, where individuals previously marginalized by physical limitations find newfound avenues for expression, interaction, and independence. This transformative technology not only unlocks communication but also holds the key to reshaping our understanding of the human brain and its intricate workings, promising a future where disabilities no longer confine one's ability to engage with the world.
DOI: https://doi.org/10.54216/JCHCI.080205
Vol. 8 Issue. 2 PP. 46-54, (2024)
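The core mechanism in the abstract above, an eye blink producing a large EEG deflection that drives a virtual keyboard, can be illustrated as simple amplitude thresholding on a frontal channel. The synthetic signal, sampling rate, and threshold below are all invented for this sketch; real systems calibrate per user and filter out non-blink artifacts.

```python
# Toy illustration of blink detection in an EEG stream by amplitude
# thresholding: blinks appear as large frontal-channel deflections. The
# synthetic signal, sampling rate, and threshold are invented assumptions.
import numpy as np

FS = 250                       # sampling rate in Hz (assumption)
THRESH = 80.0                  # microvolts; calibrated per user in practice

rng = np.random.default_rng(0)
t = np.arange(5 * FS) / FS                      # 5 seconds of signal
eeg = rng.normal(0.0, 10.0, t.size)             # background EEG noise (uV)
for blink_at in (1.0, 3.2):                     # inject two synthetic blinks
    i = int(blink_at * FS)
    eeg[i:i + FS // 5] += 120.0 * np.hanning(FS // 5)

smooth = np.convolve(eeg, np.ones(10) / 10, mode="same")  # suppress jitter
above = smooth > THRESH
onsets = np.flatnonzero(above[1:] & ~above[:-1]) + 1      # upward crossings
print("blink onsets (s):", np.round(onsets / FS, 2))
# Each detected blink would advance the cursor or select the highlighted key
# on the virtual keyboard.
```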
This research develops a novel approach to mood-based YouTube video suggestion. Natural Language Processing (NLP) techniques, combined with sentiment analysis based on the FrameNet framework, are applied to users' written accounts of their everyday experiences and feelings to determine their current mood from the text. Content curation is then supported by extracting pertinent video metadata through the YouTube API. Integrating this video metadata with the textual mood extraction enables a highly engaging and personalized content recommendation system: by matching the mood of the recommended videos with the one deduced from the textual input, users are presented with content that resonates with their current emotional state, improving satisfaction and enriching their experience.
DOI: https://doi.org/10.54216/JCHCI.080206
Vol. 8 Issue. 2 PP. 55-62, (2024)
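The end-to-end flow in the abstract above (text to mood to video retrieval) condenses to a few calls. In the sketch below, a toy keyword lexicon stands in for the paper's FrameNet-based sentiment analysis, and YT_API_KEY is a placeholder; the search endpoint and parameters are those of the public YouTube Data API v3.

```python
# Condensed sketch of the text -> mood -> YouTube recommendation flow. The
# keyword lexicon is a stand-in for the paper's FrameNet-based sentiment
# analysis; YT_API_KEY is a placeholder for a real API key.
import requests

MOOD_LEXICON = {            # toy stand-in for FrameNet sentiment analysis
    "happy": ["great", "joy", "excited", "wonderful"],
    "sad": ["tired", "lonely", "down", "stressful"],
}
MOOD_QUERIES = {"happy": "upbeat music", "sad": "relaxing comfort videos"}

def infer_mood(text: str) -> str:
    text = text.lower()
    scores = {m: sum(w in text for w in ws) for m, ws in MOOD_LEXICON.items()}
    return max(scores, key=scores.get)

def recommend(text: str, api_key: str, n: int = 5) -> list[str]:
    mood = infer_mood(text)
    resp = requests.get(
        "https://www.googleapis.com/youtube/v3/search",  # YouTube Data API v3
        params={"part": "snippet", "q": MOOD_QUERIES[mood],
                "type": "video", "maxResults": n, "key": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    return [item["snippet"]["title"] for item in resp.json()["items"]]

# print(recommend("Work was stressful and I feel down", "YT_API_KEY"))
```

The returned snippet metadata (title, description, channel) is what a fuller system would score against the inferred mood before ranking the recommendations.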