American Scientific Publishing Group

Fusion: Practice and Applications

ISSN: Online 2692-4048 · Print 2770-0070
Frequency: Continuous publication
Publication Model: Open access · Articles freely available online · APC applies after acceptance

Full Length Article

Volume 18, Issue 2, pp. 200-214 • 2025

An Efficient Learning Approach to Imbalanced Multinomial Classification

Ani Petkova 1*, Borislava Toleva 1, Ivan Ivanov 1
1 Faculty of Economics and Business Administration, Sofia University St. Kl. Ohridski, Sofia 1113, Bulgaria
* Corresponding Author.
Received: July 14, 2024 Revised: October 21, 2024 Accepted: January 06, 2025

Abstract

The presented methodology offers an innovative way to answer a question that is rarely addressed in the academic literature: how can complex data issues such as multiclass imbalance be solved simply and efficiently with readily available models? In this approach, observations are modeled without additional preprocessing. Several classification models, including Random Forest (RF), Support Vector Machines (SVM), and Decision Tree (DT), are used for the classification analysis. The parameters of these models and of the cross-validation function are adjusted to each individual set of observations. This approach has not been researched in depth. We test it on data with class imbalance in the target variable. Our results demonstrate the benefits of the proposed method. First, parameter tuning of ML models can be an effective strategy for handling class imbalance. Second, random shuffling prior to cross-validation can be key to resolving the bias that arises from multiclass imbalance. Third, the best results are achieved when random shuffling, cross-validation, and parameter tuning are combined. These findings are key to handling class imbalance in classification. This research therefore extends the options for handling class imbalance in a simple, quick, and effective way, without adding complexity to the model.
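To illustrate why random shuffling prior to cross-validation can matter for imbalanced multiclass data, the following is a minimal sketch and not the authors' code. It assumes a hypothetical label vector of 100 observations stored in class order (60/30/10) and a plain 5-fold split; the helper names `kfold_indices`, `fold_class_counts`, and `classes_missing_from_training` are invented for this example. Parameter tuning of the RF, SVM, and DT models is not shown.

```python
import random
from collections import Counter


def kfold_indices(n_samples, n_splits, shuffle=False, seed=None):
    """Return (train_idx, test_idx) pairs for plain k-fold cross-validation.

    Assumes n_samples is divisible by n_splits (true for the toy data below).
    """
    indices = list(range(n_samples))
    if shuffle:
        # Shuffle once, before the data are cut into folds.
        random.Random(seed).shuffle(indices)
    fold_size = n_samples // n_splits
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(n_splits)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


def fold_class_counts(labels, splits):
    """Class frequencies inside each test fold."""
    return [dict(Counter(labels[j] for j in test_idx)) for _, test_idx in splits]


def classes_missing_from_training(labels, splits):
    """For each split, the classes that never appear in the training part."""
    all_classes = set(labels)
    return [sorted(all_classes - {labels[j] for j in train_idx})
            for train_idx, _ in splits]


# Hypothetical imbalanced target, stored in class order (an assumption).
labels = [0] * 60 + [1] * 30 + [2] * 10

ordered = kfold_indices(len(labels), n_splits=5, shuffle=False)
shuffled = kfold_indices(len(labels), n_splits=5, shuffle=True, seed=0)

# Without shuffling, most test folds hold a single class, and the last
# split trains on data that contain no example of class 2 at all.
print(fold_class_counts(labels, ordered))
# [{0: 20}, {0: 20}, {0: 20}, {1: 20}, {1: 10, 2: 10}]
print(classes_missing_from_training(labels, ordered))
# [[], [], [], [], [2]]

# With shuffling, the minority class is spread across folds by chance.
print(fold_class_counts(labels, shuffled))
print(classes_missing_from_training(labels, shuffled))
```

In practice the same effect is usually obtained with a library splitter, for example scikit-learn's `KFold(n_splits=5, shuffle=True, random_state=...)`. Shuffling spreads the classes by chance only; it does not guarantee equal class proportions in every fold.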

Keywords

Multiclass data; Classification; Parameters adjustment; Model evaluation

Cite This Article

Petkova, Ani, Toleva, Borislava, Ivanov, Ivan. "An Efficient Learning Approach to Imbalanced Multinomial Classification." Fusion: Practice and Applications, vol. 18, no. 2, 2025, pp. 200-214. DOI: https://doi.org/10.54216/FPA.180215
Petkova, A., Toleva, B., Ivanov, I. (2025). An Efficient Learning Approach to Imbalanced Multinomial Classification. Fusion: Practice and Applications, 18(2), 200-214. DOI: https://doi.org/10.54216/FPA.180215
Petkova, Ani, Toleva, Borislava, Ivanov, Ivan. "An Efficient Learning Approach to Imbalanced Multinomial Classification." Fusion: Practice and Applications 18, no. 2 (2025): 200-214. DOI: https://doi.org/10.54216/FPA.180215
Petkova, A., Toleva, B., Ivanov, I. (2025) 'An Efficient Learning Approach to Imbalanced Multinomial Classification', Fusion: Practice and Applications, 18(2), pp. 200-214. DOI: https://doi.org/10.54216/FPA.180215
Petkova A, Toleva B, Ivanov I. An Efficient Learning Approach to Imbalanced Multinomial Classification. Fusion: Practice and Applications. 2025;18(2):200-214. DOI: https://doi.org/10.54216/FPA.180215
A. Petkova, B. Toleva, I. Ivanov, "An Efficient Learning Approach to Imbalanced Multinomial Classification," Fusion: Practice and Applications, vol. 18, no. 2, pp. 200-214, 2025. DOI: https://doi.org/10.54216/FPA.180215