Supervised Machine Learning Algorithms for Equity Market Regime Classification: A Systematic Literature Review of Comparative Performance, Feature Engineering, and Generalizability (2015–2024)

Suvonkulov Abdulaziz¹,* · Eugene Q. Castro¹

¹ Department of Computer Science, Central Asian University, Tashkent, Uzbekistan

Emails: 220456@centralasian.uz · e.castro@centralasian.uz

Received: January 12, 2024 · Revised: March 01, 2024 · Accepted: June 28, 2024

* Corresponding author

ABSTRACT

The application of supervised machine learning (ML) algorithms for equity market regime classification has gained

significant attention in recent years. This systematic literature review (SLR) synthesizes findings from 16 peer-reviewed

studies published between 2015 and 2024 to address three research questions: (1) How do supervised

ML algorithms (XGBoost, Random Forest, SVM, neural networks, and ensemble methods) compare in accuracy,

robustness, and computational efficiency for market regime classification? (2) What feature engineering approaches

are most effective? (3) How generalizable are these models across different equity markets and time periods?

Following PRISMA 2020 guidelines, we searched IEEE Xplore, ScienceDirect, and Springer, identifying 2,953

records and including 16 studies after screening. Our findings indicate that ensemble methods (particularly Random

Forest and XGBoost) and deep learning approaches (LSTM, DNN) consistently outperform traditional classifiers.

Technical indicators remain the most common features, though novel approaches including event embeddings,

network centrality measures, and signal decomposition show promise. Generalizability remains a challenge, with

most studies focusing on developed markets. We identify gaps in cross-market validation and interpretability,

providing directions for future research.

Keywords: Systematic literature review · Machine learning · Stock market prediction · Regime classification · XGBoost · Random Forest · LSTM · Deep learning · Feature engineering

1. INTRODUCTION

Assessment is one of the most influential parts of education

because it shapes what students practice, what teachers prioritize,

and how learning progress is measured. In higher

education and online learning environments, assessment is

also a major operational burden: instructors must repeatedly

design quizzes, produce alternative versions, write rubrics,

grade responses, and provide feedback, often under time pressure

and with large class sizes. As a result, students may have

fewer opportunities to practice and may receive feedback late,

even though frequent low-stakes assessment is strongly

linked to improved learning outcomes [1].

To address these constraints, digital learning platforms have

increasingly adopted automated and semi-automated approaches

for quiz creation and evaluation. In recent years,

the rise of transformer-based models and large language models

(LLMs) has accelerated this trend by enabling systems

that can generate questions, produce distractors for multiple-