Journal of Cybersecurity and Information Management
  JCIM
  2690-6775
  2769-7851
  
   10.54216/JCIM
   https://www.americaspg.com/journals/show/3590
  
 
 
  
   2019
  
  
   2019
  
 
 
  
   Implementing Comparative Analysis on Feature Engineering Techniques and Multi-Model Evaluation Framework for IDS
  
  
   Department of Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar, 125001, Haryana, India
   
    Neha
    Neha
   
   Department of Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar, 125001, Haryana, India
   
    Abhishek
    Kajal
   
  
  
   Most current intrusion detection methods for critical information infrastructure are tested on IDS datasets, yet they do not provide the desired protection against emerging cyber threats. Many machine learning and deep learning-based intrusion detection methods perform poorly on real networks because the IDS datasets they are trained on are highly imbalanced or noisy. This paper therefore implements a comprehensive framework that combines multiple machine learning (ML) and deep learning (DL) models with advanced feature engineering approaches. We examine how several feature engineering and dimensionality reduction techniques, namely PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), t-SNE (t-Distributed Stochastic Neighbor Embedding), and Autoencoders, affect model performance and execution time during training and testing on the CICIDS2017 dataset, with the aim of reducing time complexity and improving intrusion detection. The framework also addresses class imbalance with SMOTE (Synthetic Minority Oversampling Technique), which generates synthetic samples for classes with very few samples so that the classes are balanced and model performance improves. The comparative analysis uses metrics such as accuracy, training time, and memory usage for ML models such as Gradient Boosting, Logistic Regression (LR), and XGBoost, and for DL models. The DL model with LDA feature engineering achieved the highest test accuracy, 95.99%, and Gradient Boosting also performed strongly with a test accuracy of 90.8%. The DL model had higher memory usage, whereas the LR and XGBoost models were computationally efficient. Overall, LDA performed better with both ML and DL models than the other feature engineering techniques in improving intrusion detection efficiency.
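
   The SMOTE step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `balance`, the neighbour count `k`, and the choice to oversample every class up to the size of the largest class are assumptions made for illustration, and the sketch uses only the Python standard library rather than a dedicated package. It shows the core idea: each synthetic sample is a random point on the line segment between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built from minority-class feature vectors.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base point, excluding itself
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample each class up to the size of the largest class (assumed policy)."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < target:
            extra = smote(rows, target - len(rows), k=k, seed=seed)
            X_out.extend(extra)
            y_out.extend([label] * len(extra))
    return X_out, y_out
```

   In a pipeline of the kind the abstract describes, oversampling like this would normally be applied to the training split only, before dimensionality reduction and model fitting, so that synthetic samples do not leak into the test set.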
  
  
   2025
  
  
   2025
  
  
   53
   67
  
  
   10.54216/JCIM.160105
   https://www.americaspg.com/articleinfo/2/show/3590