Volume 7, Issue 1, pp. 67-77, 2024 | Full Length Article
Mohamed Ziad Ali 1*, Abdulrahman Abdullah 2, Ahmed Mohamed Zaki 3*, Faris H. Rizk 4, Marwa M. Eid 5, Elsayed M. El-Kenway 6
Doi: https://doi.org/10.54216/JAIM.070105
This paper reviews the literature on feature selection in data analytics and highlights the growing number of data-dependent fields that rely on it. The review runs from the foundations of feature selection to current use cases in cybersecurity, healthcare, and finance. Feature selection is especially important in healthcare, where it reduces dimensionality and helps clarify complex causal links. The review then covers contemporary techniques, including optimization-based methods, swarm intelligence, and algorithms for diagnosing heart disease. It also presents newer techniques that have proven efficient in classification settings, such as hybrid Ant Colony Optimization and the Grey Wolf Optimizer. Among the swarm intelligence techniques surveyed, the ISSA algorithm stands out as the best performer. The conclusion draws on this practical assessment and identifies research gaps, providing a basis for a broader technological review. The paper concludes that feature selection is more than a preprocessing step: it is a vital part of machine learning and data science, and the review supports researchers in both retrospective analysis and future projects.
Feature Selection, Optimization Algorithms, Machine Learning, Artificial Intelligence, Feature Extraction