Fusion: Practice and Applications

Journal DOI: https://doi.org/10.54216/FPA


ISSN (Online): 2692-4048 | ISSN (Print): 2770-0070

Volume 18, Issue 2, pp. 200-214, 2025 | Full Length Article

An Efficient Learning Approach to Imbalanced Multinomial Classification

Ani Petkova 1, Borislava Toleva 2, Ivan Ivanov 3*

  • 1 Faculty of Economics and Business Administration, Sofia University St. Kl. Ohridski, Sofia 1113, Bulgaria - (ani.petkova.99@gmail.com)
  • 2 Faculty of Economics and Business Administration, Sofia University St. Kl. Ohridski, Sofia 1113, Bulgaria - (bvrigazova@gmail.com)
  • 3 Faculty of Economics and Business Administration, Sofia University St. Kl. Ohridski, Sofia 1113, Bulgaria - (i_ivanov@feb.uni-sofia.bg)
  • DOI: https://doi.org/10.54216/FPA.180215

    Received: July 14, 2024 | Revised: October 21, 2024 | Accepted: January 06, 2025
    Abstract

    The presented methodology provides an innovative way to answer a question that is rarely addressed in the academic literature: how can complex data issues such as multiclass imbalance be solved with the available models in a simple and efficient way? In this approach, observations are modeled without additional preprocessing. Several classification models, including Random Forest (RF), Support Vector Machines (SVM), and Decision Tree (DT), are used to conduct the classification analysis. The parameters of these models and the cross-validation function are adjusted to each individual set of observations. This approach has not been researched in depth; we test it on class imbalance in the target variable. Our results demonstrate the benefits of the proposed method. First, parameter tuning of ML models can be an effective strategy for handling class imbalance. Second, random shuffling prior to cross-validation can be key to resolving the bias introduced by multiclass imbalance. Another important finding is that the best results are achieved when random shuffling, cross-validation, and parameter tuning are combined. These findings are key to handling class imbalance in classification. This research therefore extends the opportunities to handle class imbalance in a simple, quick, and effective way without adding complexity to the model.
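    The abstract combines three ingredients: random shuffling before cross-validation, cross-validation itself, and per-model parameter tuning, with no resampling of the imbalanced target. The following minimal sketch illustrates that combination using scikit-learn; the dataset (a stand-in multiclass set), the parameter grids, and the macro-F1 scoring choice are illustrative assumptions, not the authors' exact configuration.

    # Minimal sketch (assumed scikit-learn setup, not the paper's exact pipeline):
    # shuffled stratified cross-validation plus parameter tuning for RF, SVM, and DT,
    # applied directly to an imbalanced multiclass target without resampling.
    from sklearn.datasets import load_wine            # stand-in multiclass dataset
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV, StratifiedKFold
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_wine(return_X_y=True)

    # Random shuffling is applied inside the CV splitter rather than by
    # altering or resampling the observations themselves.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

    # Illustrative parameter grids for the three classifiers named in the abstract.
    candidates = {
        "random_forest": (RandomForestClassifier(random_state=42),
                          {"n_estimators": [100, 300], "max_depth": [None, 10]}),
        "svm": (make_pipeline(StandardScaler(), SVC()),
                {"svc__C": [0.1, 1, 10], "svc__kernel": ["rbf", "linear"]}),
        "decision_tree": (DecisionTreeClassifier(random_state=42),
                          {"max_depth": [None, 5, 10], "min_samples_leaf": [1, 5]}),
    }

    for name, (model, grid) in candidates.items():
        # Macro-averaged F1 weights every class equally, which matters when
        # the target classes are imbalanced.
        search = GridSearchCV(model, grid, cv=cv, scoring="f1_macro")
        search.fit(X, y)
        print(f"{name}: best macro-F1 = {search.best_score_:.3f}, "
              f"params = {search.best_params_}")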

    Keywords:

    Multiclass data, Classification, Parameter adjustment, Model evaluation.


    Cite This Article As:
    Petkova, A., Toleva, B., and Ivanov, I., "An Efficient Learning Approach to Imbalanced Multinomial Classification," Fusion: Practice and Applications, vol. 18, no. 2, pp. 200-214, 2025. DOI: https://doi.org/10.54216/FPA.180215