<p>Hybrid Ensemble Learning for Flow-Level IoT Traffic Classification</p>
<p>Using ACI Dataset: Towards Scalable and Real-Time Threat Detection</p>
<p>El-Sayed M. El-Kenawy<sup>1,2</sup>, Sini Raj Pulari<sup>3,&lowast;</sup>, Shriram K Vasudevan<sup>4</sup></p>
<p><sup>1</sup>School of ICT, Faculty of Engineering, Design and Information and Communications Technology (EDICT), Bahrain Polytechnic, PO Box 33349, Isa Town, Bahrain</p>
<p><sup>2</sup>Applied Science Research Center, Applied Science Private University, Amman, Jordan</p>
<p><sup>3</sup>Dept. of CSE, Vignan&rsquo;s Foundation for Science, Technology and Research, Guntur, Andhra Pradesh, India</p>
<p><sup>4</sup>Intel India Pvt. Ltd., Bengaluru, India</p>
<p>Emails: skenawy@ieee.org; sinikishan@gmail.com; shriram.kris.vasudevan@intel.com</p>
<p>Abstract</p>
<p>Internet of Things (IoT) devices are now deployed across consumer, industrial, and critical infrastructure domains, and they generate large volumes of diverse, high-frequency network traffic. As IoT networks scale, securing these heterogeneous data flows becomes harder, which threatens system performance and manageability. Traditional traffic analysis based on signature identification and rule-based detection is ineffective against new traffic patterns and system behaviors. Intelligent, data-driven classification systems that can process IoT traffic quickly and at large operational scale are therefore essential. This paper presents a detailed flow-level machine learning framework for IoT traffic classification that uses features extracted from the Army Cyber Institute (ACI) IoT dataset. The dataset encompasses statistical, temporal, and protocol-specific attributes for benign and malicious network flows. Our methodology begins with a strict preprocessing stage that includes data cleaning, normalization, label encoding, and feature correlation analysis, so that the learning algorithms are trained on a clean and balanced dataset.</p>
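<p>The preprocessing steps named above (cleaning, normalization, label encoding, and feature correlation analysis) might look roughly like the following minimal sketch. It is an illustration only, not the authors' pipeline: the field names <code>duration</code>, <code>bytes</code>, and <code>label</code>, the choice of min-max scaling, and the toy records are all assumptions, and the real ACI feature set is not reproduced here.</p>

```python
from math import sqrt

def min_max_normalize(column):
    """Scale a numeric column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def encode_labels(labels):
    """Map each distinct label to an integer, assigned in sorted order."""
    mapping = {name: idx for idx, name in enumerate(sorted(set(labels)))}
    return [mapping[name] for name in labels], mapping

def pearson(x, y):
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def preprocess(rows, feature_names, label_key):
    """Drop rows with missing values, normalize features, encode labels."""
    required = feature_names + [label_key]
    clean = [r for r in rows if all(r.get(k) is not None for k in required)]
    columns = {
        k: min_max_normalize([float(r[k]) for r in clean]) for k in feature_names
    }
    y, mapping = encode_labels([r[label_key] for r in clean])
    return columns, y, mapping

# Toy flow records; the field names and values are hypothetical.
flows = [
    {"duration": 0.2, "bytes": 100, "label": "benign"},
    {"duration": 1.0, "bytes": 500, "label": "malicious"},
    {"duration": None, "bytes": 50, "label": "benign"},  # dropped: missing value
    {"duration": 0.6, "bytes": 300, "label": "benign"},
]
columns, y, mapping = preprocess(flows, ["duration", "bytes"], "label")

# Highly correlated feature pairs are candidates for removal before training.
corr = pearson(columns["duration"], columns["bytes"])
```

<p>In this toy data the two features are almost perfectly correlated, so a correlation analysis would flag one of them as redundant. The threshold for dropping a feature is a design choice that the abstract does not specify.</p>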
<p>Several baseline classification models were trained, including Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Naive Bayes, and SGD classifiers, alongside other statistical learners. Our proposed hybrid ensemble combines a deep neural network, a Random Forest model, and an XGBoost classifier through weighted voting to overcome the limitations of single classifiers. The ensemble is designed to make the system more resilient, lower bias, and better capture varied IoT traffic patterns. The models were evaluated with accuracy, precision, recall, F1-score, Hamming loss, Matthews correlation coefficient (MCC), Cohen&rsquo;s Kappa, balanced accuracy, and log loss. These metrics capture model performance from both global and per-class perspectives, which matters when classes are imbalanced and legitimate and malicious traffic show similar patterns. In the experiments, the ensemble outperforms the individual classifiers on all evaluation metrics. The results indicate that model fusion is effective in complex network environments, particularly for hard-to-classify traffic patterns. The ensemble generalizes well and is suited to real-time IoT deployments because it can adapt continuously while maintaining high accuracy. The proposed framework contributes to research on intelligent IoT traffic analysis and demonstrates how deep learning and traditional machine learning methods complement each other in ensemble systems. It offers a scalable and transparent quantitative solution for advanced network security systems and traffic monitoring applications across smart cities, industrial settings, and critical infrastructure.</p>
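<p>To make the weighted voting idea concrete, the following is a minimal sketch of one common variant, weighted soft voting, in which each model's class probabilities are averaged using per-model weights. The abstract does not say whether the vote is soft or hard, so soft voting is an assumption here, and the probability values and weights below are hypothetical placeholders rather than values from the paper.</p>

```python
def weighted_soft_vote(prob_sets, weights):
    """Combine per-model class probabilities with a weighted average.

    prob_sets: one entry per model, each a list of rows of class probabilities.
    weights:   one non-negative weight per model.
    Returns the fused probabilities and the argmax class index per sample.
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    fused = []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][c] for probs, w in zip(prob_sets, weights)) / total
            for c in range(n_classes)
        ]
        fused.append(row)

    # Predicted class is the index of the largest fused probability.
    predictions = [max(range(n_classes), key=row.__getitem__) for row in fused]
    return fused, predictions

# Hypothetical outputs for three flows over the classes [benign, malicious].
nn_probs = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4]]   # neural network
rf_probs = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]   # Random Forest
xgb_probs = [[0.7, 0.3], [0.6, 0.4], [0.3, 0.7]]  # XGBoost

weights = [0.4, 0.3, 0.3]  # placeholder weights, not taken from the paper
fused, predictions = weighted_soft_vote([nn_probs, rf_probs, xgb_probs], weights)
```

<p>For the third flow the neural network alone would predict benign, but the two tree-based models outweigh it and the fused prediction is malicious, which is the kind of disagreement a voting ensemble is meant to resolve. In practice the weights would typically be tuned on a validation split.</p>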
<p>Keywords: IoT Traffic Classification; Ensemble Learning; Deep Learning; Flow-Based Analysis</p>
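<p>Several of the evaluation metrics listed in the abstract are less familiar than accuracy. As a hedged illustration, the sketch below computes the Matthews correlation coefficient, Cohen&rsquo;s Kappa, and balanced accuracy for the binary case directly from confusion-matrix counts. The label vectors are made-up examples, and the binary (benign versus malicious) setting is an assumption; multi-class evaluation needs the generalized forms of these formulas.</p>

```python
from math import sqrt

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def cohen_kappa(tp, tn, fp, fn):
    """Agreement between predictions and labels, corrected for chance."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return 0.0 if expected == 1 else (observed - expected) / (1 - expected)

def balanced_accuracy(tp, tn, fp, fn):
    """Mean of per-class recall, which is robust to class imbalance."""
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2

# Made-up labels: 0 = benign, 1 = malicious.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
scores = {
    "mcc": mcc(*counts),
    "kappa": cohen_kappa(*counts),
    "balanced_accuracy": balanced_accuracy(*counts),
}
```

<p>Here plain accuracy is 0.8, while balanced accuracy is slightly lower because the minority (malicious) class has lower recall than the majority class. This gap is why chance-corrected and class-balanced metrics are useful on imbalanced traffic data.</p>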