Journal of Cybersecurity and Information Management
JCIM
2690-6775
2769-7851
10.54216/JCIM
https://www.americaspg.com/journals/show/3402
2019
2019
Analyzing the Effectiveness of Machine Learning Techniques in Detecting Attacks in a Big Data Environment
Electronic Computer Center, University of Fallujah, Anbar, Iraq
Osamah M.
Abduljabbar
Electronic Computer Center, University of Fallujah, Anbar, Iraq
Huda Mohammed
Lateef
Protecting big data has become a vital cybersecurity requirement, given the significant impact this data has on institutions and their clients. Such data underpins decision-making and policy guidance, so attacks that gain illicit access to it can cause serious losses by compromising its integrity, reliability, confidentiality, and availability. A second problem is the need to shorten the attack detection time, which is critical when classifying malicious and benign patterns. Structured Query Language Injection Attack (SQLIA) is among the common attacks targeting data and is the focus of the proposed model. This research aims to develop an approach for detecting and distinguishing the patterns of payloads sent by the user. The proposed method trains a model using the random forest algorithm, a machine learning (ML) technique, and takes advantage of the Spark ML library, which integrates effectively with big data frameworks. This is accompanied by a comprehensive analysis of the effectiveness of ML techniques in monitoring and detecting SQLIA. The study was conducted using the SQL dataset available on the Kaggle platform and showed promising results: the proposed method achieved an accuracy of 98.12% and took 0.046 seconds to determine the SQL type. These results suggest that using the Spark ML library with ML techniques contributes to higher accuracy and less time to identify the class of a submitted request, owing to its ability to distribute processing in memory.
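The random forest idea named in the abstract can be illustrated with a minimal, single-machine sketch. This is not the authors' Spark ML pipeline: it is a pure-Python stand-in that uses bootstrap sampling, a random feature subset at each split, and majority voting. The tokenizer, the function names (`tokens`, `train_forest`, `predict`, `evaluate`), the hyperparameters, the labeling convention (1 for injection, 0 for benign), and the toy queries are all assumptions made for illustration, not taken from the paper or the Kaggle dataset.

```python
import random
import re
import time

# Hypothetical tokenizer: words, numbers, and single punctuation symbols.
TOKEN_RE = re.compile(r"[a-z_]+|\d+|[^\sa-z\d_]")

def tokens(query):
    """Lowercase a payload and return its set of tokens."""
    return set(TOKEN_RE.findall(query.lower()))

def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(rows, vocab, rng, depth=0, max_depth=4, n_features=4):
    """Grow one tree; each split tests whether a token is present."""
    labels = [y for _, y in rows]
    majority = int(2 * sum(labels) >= len(labels))
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    # Random feature subset per split, as in a random forest.
    for tok in rng.sample(vocab, min(n_features, len(vocab))):
        has = [r for r in rows if tok in r[0]]
        lacks = [r for r in rows if tok not in r[0]]
        if not has or not lacks:
            continue
        score = (len(has) * gini([y for _, y in has])
                 + len(lacks) * gini([y for _, y in lacks])) / len(rows)
        if best is None or score < best[0]:
            best = (score, tok, has, lacks)
    if best is None:
        return majority
    _, tok, has, lacks = best
    return (tok,
            build_tree(has, vocab, rng, depth + 1, max_depth, n_features),
            build_tree(lacks, vocab, rng, depth + 1, max_depth, n_features))

def predict_tree(node, toks):
    """Walk one tree down to an integer leaf (0 or 1)."""
    while isinstance(node, tuple):
        tok, has, lacks = node
        node = has if tok in toks else lacks
    return node

def train_forest(data, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    rows = [(tokens(q), y) for q, y in data]
    vocab = sorted(set().union(*(t for t, _ in rows)))
    return [build_tree([rng.choice(rows) for _ in rows], vocab, rng)
            for _ in range(n_trees)]

def predict(forest, query):
    """Majority vote over all trees: 1 = injection, 0 = benign."""
    toks = tokens(query)
    votes = sum(predict_tree(tree, toks) for tree in forest)
    return int(2 * votes > len(forest))

def evaluate(forest, data):
    """Return (accuracy, mean seconds per classified query)."""
    start = time.perf_counter()
    correct = sum(predict(forest, q) == y for q, y in data)
    elapsed = time.perf_counter() - start
    return correct / len(data), elapsed / len(data)

# Invented toy queries, for illustration only.
TRAIN = [
    ("SELECT name FROM users WHERE id = 7", 0),
    ("SELECT * FROM orders WHERE total > 100", 0),
    ("UPDATE users SET email = 'a@b.com' WHERE id = 3", 0),
    ("INSERT INTO logs (msg) VALUES ('ok')", 0),
    ("' OR 1=1 --", 1),
    ("admin' OR '1'='1", 1),
    ("1; DROP TABLE users --", 1),
    ("' UNION SELECT password FROM users --", 1),
]

forest = train_forest(TRAIN, n_trees=25, seed=0)
accuracy, secs_per_query = evaluate(forest, TRAIN)
print(f"accuracy={accuracy:.2%}, mean time per query={secs_per_query:.6f}s")
```

The `evaluate` helper mirrors the two quantities the abstract reports, accuracy and per-request classification time, but the numbers it prints on this toy data say nothing about the published results. In a distributed setting the same steps would be expressed with a big data library such as Spark ML rather than hand-written trees.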
2025
2025
285
292
10.54216/JCIM.150221
https://www.americaspg.com/articleinfo/2/show/3402