Feature Selection for Multi-Class Traffic Classification and
Anomaly Detection in Heterogeneous Wireless Sensor Networks
Zainab Hussein Arif¹, Nureize bt Arbaiy²,⋆
¹College of Nursing, University of Al-Qadisiyah, Al-Qadisiyah Province, 58002, Iraq
²Fakulti Sains Komputer dan Teknologi Maklumat, Universiti Tun Hussein Onn Malaysia (UTHM), 86400 Batu Pahat, Johor,
Malaysia
Emails: zhussian94@gmail.com; nureize@uthm.edu.my
Received: January 26, 2026; Revised: April 01, 2026; Accepted: May 04, 2026
⋆ Corresponding author
ABSTRACT
Heterogeneous Internet-of-Things deployments expose wireless sensor networks to a diverse and continuously
evolving threat landscape encompassing distributed denial-of-service flooding, network reconnaissance scanning,
and brute-force credential attacks. Existing intrusion detection approaches predominantly adopt single-classifier
architectures and binary labelling, which are ill-suited to the multi-class, class-imbalanced traffic characteristic of
real-world IoT sensor deployments. This paper proposes WS-STACK, a Weighted Stacking ensemble that combines
five heterogeneous base learners (Random Forest, XGBoost, Support Vector Machine, K-Nearest Neighbours,
and Gradient Boosting) under an ℓ2-regularised Logistic Regression meta-learner trained on cross-validation-generated
probability features. A three-stage feature engineering pipeline comprising mutual information filtering,
variance inflation factor pruning, and correlation-based elimination reduces the 83-dimensional RT-IoT2022 feature
space to 20 informative features, and the Synthetic Minority Over-Sampling Technique corrects the six-fold class
imbalance prior to training. Evaluated on 83,000 labelled network flow records from the publicly available RT-IoT2022
benchmark spanning four benign traffic patterns and seven attack categories, WS-STACK achieves 99.61%
classification accuracy, a weighted F1-score of 0.9960, and an AUC-ROC of 0.9978, outperforming every individual
base classifier and five recently published state-of-the-art baselines. The false positive rate is reduced to 0.0006, and
ten-fold cross-validation yields a mean accuracy of μ_acc = 0.9959 (standard deviation σ = 0.0004). Ablation experiments identify SMOTE as the single
most critical preprocessing component, and noise-robustness tests confirm 98.81% accuracy under 20% Gaussian
feature perturbation. The framework is grounded through a formal variance-reduction proof and a channel-energy
anomaly model that establishes the physical motivation for packet-rate features as the dominant intrusion detection
signal in constrained wireless sensor networks.
Keywords: Wireless sensor networks; IoT security; Ensemble learning; Stacking classifier; RT-IoT2022 dataset;
Multi-class intrusion detection; Feature selection; SMOTE; Anomaly detection
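
Because the abstract identifies SMOTE as the most critical preprocessing component, a minimal sketch of its core interpolation step may help readers unfamiliar with the technique. The sketch is illustrative only and is not the authors' implementation: the function name `smote_oversample`, the toy feature vectors, and the choices of `k` and `seed` are assumptions introduced here, and a practical pipeline would more likely rely on an established library implementation.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class
    samples and their k nearest minority-class neighbours (core SMOTE step)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: four minority-class flows with two made-up features.
minority_flows = [[1.0, 2.0], [1.5, 1.8], [0.9, 2.4], [1.2, 2.1]]
new_flows = smote_oversample(minority_flows, n_synthetic=20, k=3, seed=42)
print(len(new_flows))
```

Each synthetic record lies on a line segment between two real minority records, so the oversampled class stays within the region already occupied by genuine samples rather than being padded with exact duplicates.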