Journal of Cybersecurity and Information Management

Journal DOI

https://doi.org/10.54216/JCIM

ISSN (Online): 2690-6775 | ISSN (Print): 2769-7851

Volume 17, Issue 1, pp. 10-20, 2026 | Full Length Article

Clustering and Classification of IoT-Based Environmental Data Using Machine Learning Techniques

Ali Subhi Alhumaima 1, Waleed Khalid Al-Zubaidi 2, El-Sayed M. El-Kenawy 3*, Marwa M. Eid 4

  • 1 Electronic Computer Centre, University of Diyala, Diyala, Iraq - (alhumaimaali@uodiyala.edu.iq)
  • 2 Electronic Computer Centre, University of Diyala, Diyala, Iraq - (waleed300@uodiyala.edu.iq)
  • 3 Department of Communications and Electronics, Delta Higher Institute of Engineering and Technology, Mansoura, 35111, Egypt; Applied Science Research Center, Applied Science Private University, Amman, Jordan - (skenawy@ieee.org)
  • 4 Faculty of Artificial Intelligence, Delta University for Science and Technology, Mansoura, Egypt; Jadara Research Center, Jadara University, Irbid 21110, Jordan - (mmm@ieee.org)
  • DOI: https://doi.org/10.54216/JCIM.170102

    Received: March 14, 2025 Revised: June 03, 2025 Accepted: July 10, 2025
    Abstract

    In this study, we present an integrated approach to IoT-based environmental data analysis that combines unsupervised and supervised learning. We first applied KMeans clustering to identify natural groupings in environmental and behavioral features such as air quality, noise level, temperature, stress level, sleeping hours, and mood score. We then trained a Decision Tree classifier to predict and interpret cluster membership from raw sensor readings. The dataset, comprising more than 30,000 observations from indoor school environments, exhibits multifaceted relationships between environmental factors and psychological well-being. KMeans consistently detected three environmental-behavioral states, and the Decision Tree classifier achieved 87% classification accuracy, indicating strong predictive power together with interpretability. The results indicated that sleep duration, air quality, and stress level were the main factors discriminating between clusters. The hybrid model shows potential for monitoring environmental and mental states in real time in smart-city applications. The approach is scalable, interpretable, and usable in IoT settings for proactive wellness management.
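
    The two-stage pipeline summarized above (cluster first, then fit a tree to the cluster labels) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the readings are synthetic placeholders, the column names in `FEATURES` are hypothetical, and the standardization step, tree depth, and train/test split are choices made here for the sketch. It will not reproduce the 87% accuracy reported for the real dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical column names; the real dataset's schema may differ.
FEATURES = ["air_quality", "noise_level", "temperature",
            "stress_level", "sleep_hours", "mood_score"]


def make_synthetic_readings(n_per_group=300, seed=0):
    """Generate placeholder readings: three loose groups over six features."""
    rng = np.random.default_rng(seed)
    # Assumed group centers and per-feature spreads, for illustration only.
    centers = np.array([
        [40.0, 45.0, 21.0, 2.0, 8.0, 8.0],
        [90.0, 60.0, 24.0, 5.0, 6.5, 5.5],
        [150.0, 75.0, 28.0, 8.0, 5.0, 3.0],
    ])
    spreads = np.array([15.0, 6.0, 1.5, 1.0, 0.7, 1.0])
    groups = [rng.normal(c, spreads, size=(n_per_group, len(FEATURES)))
              for c in centers]
    return np.vstack(groups)


def cluster_then_classify(X, n_clusters=3, seed=0):
    """Cluster with KMeans, then train a tree to predict the cluster labels."""
    # Standardize so that no single sensor scale dominates the distances
    # (an assumption of this sketch; the abstract does not state it).
    X_scaled = StandardScaler().fit_transform(X)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(X_scaled)

    # The tree sees the raw (unscaled) readings, so its split thresholds
    # stay in the original sensor units and remain easy to interpret.
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.25, random_state=seed, stratify=labels)
    tree = DecisionTreeClassifier(max_depth=4, random_state=seed)
    tree.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, tree.predict(X_test))
    importances = dict(zip(FEATURES, tree.feature_importances_))
    return labels, accuracy, importances


if __name__ == "__main__":
    X = make_synthetic_readings()
    labels, accuracy, importances = cluster_then_classify(X)
    print(f"clusters found: {len(set(labels.tolist()))}")
    print(f"held-out accuracy: {accuracy:.2f}")
    for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
        print(f"{name:>14}: {score:.3f}")
```

    The feature importances printed at the end are one simple way to see which readings drive cluster discrimination, in the spirit of the abstract's statement about sleep duration, air quality, and stress level. On synthetic data the ranking carries no real-world meaning.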

    Keywords:

    IoT Sensor Data, Environmental Monitoring, KMeans Clustering, Decision Tree Classification, Behavioral Analysis, Air Quality, Stress Prediction, Machine Learning, Data Mining

    Cite This Article As:
    Alhumaima, Ali Subhi, Waleed Khalid Al-Zubaidi, El-Sayed M. El-Kenawy, and Marwa M. Eid. Clustering and Classification of IoT-Based Environmental Data Using Machine Learning Techniques. Journal of Cybersecurity and Information Management, vol. 17, no. 1, 2026, pp. 10-20. DOI: https://doi.org/10.54216/JCIM.170102
    Alhumaima, A. S., Al-Zubaidi, W. K., El-Kenawy, E.-S. M., & Eid, M. M. (2026). Clustering and Classification of IoT-Based Environmental Data Using Machine Learning Techniques. Journal of Cybersecurity and Information Management, 17(1), 10-20. DOI: https://doi.org/10.54216/JCIM.170102
    Alhumaima, Ali Subhi, Waleed Khalid Al-Zubaidi, El-Sayed M. El-Kenawy, and Marwa M. Eid. Clustering and Classification of IoT-Based Environmental Data Using Machine Learning Techniques. Journal of Cybersecurity and Information Management 17, no. 1 (2026): 10-20. DOI: https://doi.org/10.54216/JCIM.170102
    Alhumaima, A. S., Al-Zubaidi, W. K., El-Kenawy, E.-S. M. and Eid, M. M. (2026). Clustering and Classification of IoT-Based Environmental Data Using Machine Learning Techniques. Journal of Cybersecurity and Information Management, 17(1), 10-20. DOI: https://doi.org/10.54216/JCIM.170102
    Alhumaima AS, Al-Zubaidi WK, El-Kenawy EM, Eid MM. [2026]. Clustering and Classification of IoT-Based Environmental Data Using Machine Learning Techniques. Journal of Cybersecurity and Information Management. 17(1): 10-20. DOI: https://doi.org/10.54216/JCIM.170102
    A. S. Alhumaima, W. K. Al-Zubaidi, E.-S. M. El-Kenawy, and M. M. Eid, "Clustering and Classification of IoT-Based Environmental Data Using Machine Learning Techniques," Journal of Cybersecurity and Information Management, vol. 17, no. 1, pp. 10-20, 2026. DOI: https://doi.org/10.54216/JCIM.170102