An Optimization Model for Stock Market Direction Prediction

 

Mingzhong Liu 1, N. Metawa 2

1 Hefei University of Technology, China

2 University of Sharjah, Sharjah, United Arab Emirates

Emails: zjy@htc.edu.cn; nmetawa@sharjah.ac.ae

 

Abstract

Predicting the direction of the stock market has become an essential task in the business sector, yet the inherent volatility of stock markets worldwide makes it difficult. Improving the accuracy of direction prediction helps investors reduce the risks involved in investing. This study designs a swallow swarm optimization (SSO) with fuzzy support vector machine (FSVM) model for stock market direction prediction. The proposed SSO-FSVM model comprises preprocessing, feature extraction, FSVM-based classification, and SSO-based parameter tuning. Using the SSO algorithm to fine-tune the FSVM parameters improves the overall predictive performance. To validate the SSO-FSVM model, a range of experiments was carried out on two benchmark datasets. The experimental results show that the SSO-FSVM model outperforms recent approaches on several evaluation metrics.

Keywords: Machine learning, Stock market, Prediction model, Fuzzy SVM, Classification, Feature extraction.

1.    Introduction

The stock market is one of the major fields in which investors are engaged; consequently, stock price trend prediction is a perennially popular topic for researchers from both the financial and technical domains [1]. In this study, our goal is to build a state-of-the-art model for price trend prediction that focuses on short-term trends. The high returns of the stock market have attracted many investors, making stock investment one of the most popular methods of investment and financial management [2-4]. To avoid the high risk that accompanies high returns, investors persistently seek accurate analysis and prediction of the stock market. The trading data generated by daily stock transactions is considered to reflect the actual situation of the market and is regularly used by investors to analyze and forecast it. However, because the stock market is influenced by many factors, such as market demand, exchange rates, company operating conditions, policy changes, and market interest rates, stock prices fluctuate, which makes their analysis and prediction [5] very difficult.

As a recognized complex dynamic system, the stock market has many influencing characteristics [6], such as nonstationarity, nonlinearity, high noise, and long memory. It is difficult to explain it simply through mathematical models. Consequently, the analysis and prediction of the stock market have long been a very challenging task. The random walk theory [7] assumes that the fluctuation of stock prices is random and follows no rule. However, many researchers have found certain regularities in the changes of stock prices, which indicates that the stock market has its own operating rules and lays the foundation for stock price prediction. Traditional stock market forecasting methods include fundamental analysis, technical analysis, multiple regression, and the autoregressive integrated moving average (ARIMA) method [8]. These methods are normally used to model stationary or linear time series. The amount of input data they can handle is small, and the data need preprocessing, such as differencing to smooth a nonstationary series. Therefore, traditional analysis and forecasting methods have certain limitations in stock market forecasting [8].

Since the year 2000, the rise of machines has brought a major change to the financial markets. Computers now dominate the trading activity that used to be carried out by humans. More specifically, traders use algorithms to acquire information and make trading decisions at very high speed. Traders use a variety of algorithms [9, 10]. Some adopt a market-maker strategy, offering liquidity to other traders by posting bids and offers in the market. Some predict price changes by learning from past price changes in the market. The attempt to generate excess returns has shifted from forming a portfolio using a particular strategy to predicting the future price path. Nowadays, machine learning (ML) techniques play an essential role as the core algorithms for predicting future stock prices.

In [11], a new method based on graph concepts is presented. It exploits spatio-temporal relationship information among different stocks by modelling the stock market as a complex network. The graph-based approach is combined with two models to create two hybrid approaches. Two different kinds of graph are created: one from the correlation of past stock values and the other a causation-based graph built from financial news. Parmar et al. [12] use regression and long short-term memory (LSTM) based ML methods to predict stock prices, considering the open, close, low, high, and volume factors. Chiong et al. [13] propose a sentiment analysis (SA) based approach for financial market prediction using news disclosures. Specifically, SA is performed in the preprocessing stage to extract sentiment-related features from financial news. Historical stock market data, viewed as a time series, is also included as input features. Using the extracted features, they build the predictive model with a support vector machine (SVM) whose parameters are optimized by particle swarm optimization (PSO).

In [14], SA is applied to tweets gathered through the Twitter API, together with the closing prices of several stocks, in order to build an architecture that predicts the stock values of different companies. Such a predictive model can help prospective stock investors make informed decisions that directly contribute to their returns. The study in [15] builds a model with recurrent neural network (RNN) and LSTM methods to predict future stock market prices. Its primary goal is to determine how accurately an ML method can forecast and how much additional epochs can improve the model. Pathak and Shetty [16] combine several models into a robust predictive method that can handle different situations in which investment could be profitable. Existing methods such as SA or neural network (NN) approaches are narrow and give wrong results under some conditions. By integrating these methods, the predictive model can offer flexible and accurate recommendations. Embedding technical indicators helps investors reduce risk.

Ampomah et al. [17] try to fill the gap by evaluating the performance of the Gaussian naïve Bayes (GNB) algorithm when combined with different feature extraction and feature scaling techniques for stock price prediction. The GNB model set-ups are ranked by Kendall's test of concordance over the various evaluation metrics. The results show that the prediction model combining GNB with linear discriminant analysis (LDA) outperforms all the other GNB models. Kumar and Acharya [18] consider two unsupervised learning methods and five supervised learning methods for stock price prediction and compare the performance of each method. Among the supervised learning algorithms, LSTM performed better than the other methods, whereas among the unsupervised learning methods, the restricted Boltzmann machine (RBM) achieved the best performance. Madeeh and Abdullah [19] aim to construct a robust algorithm for stock market prediction. The process comprises three phases: the first preprocesses the stock market dataset, the second applies two supervised ML methods, namely k-nearest neighbors (KNN) and random forest (RF), and the last assesses the efficiency and accuracy of the predictions of the two algorithms.

This study designs a swallow swarm optimization (SSO) with fuzzy support vector machine (FSVM) model for stock market direction prediction. The proposed SSO-FSVM model comprises preprocessing, feature extraction, FSVM-based classification, and SSO-based parameter tuning. Using the SSO algorithm to fine-tune the FSVM parameters improves the overall predictive performance. To validate the SSO-FSVM model, a range of experiments was carried out on two benchmark datasets.

2.    The Proposed SSO-FSVM model

In this study, an effective SSO-FSVM model is presented to forecast the direction of stock markets. The proposed model comprises preprocessing, feature extraction, FSVM-based classification, and SSO-based parameter tuning. Using the SSO algorithm to fine-tune the FSVM parameters improves the overall predictive performance. These processes are elaborated in the following subsections.

2.1 Preprocessing

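Exponential smoothing assigns larger weights to recent observations [20]. The smoothed statistic $S_t$ of a series $Y$ can be computed iteratively using Eq. (1):

$$S_0 = Y_0, \qquad S_t = \alpha Y_t + (1 - \alpha) S_{t-1} \quad \text{for } t > 0, \tag{1}$$

where $\alpha$ denotes the smoothing factor and $0 < \alpha < 1$. In addition, the target for the $i$-th day can be computed using Eq. (2):

$$target_i = \operatorname{sign}(close_{i+d} - close_i), \tag{2}$$

where $d$ denotes the number of days ahead for which the direction is predicted.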
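The preprocessing step can be illustrated with a minimal Python sketch, assuming the standard exponential smoothing recursion (the first smoothed value equals the first observation, and each later value is a weighted average of the current observation and the previous smoothed value) and a sign-based direction target over a horizon of d days. The function names, the toy prices, and the convention of labelling a non-rise as -1 are assumptions made for illustration, not details given in the paper.

```python
def exponential_smoothing(prices, alpha):
    """Return S with S[0] = Y[0] and S[t] = alpha*Y[t] + (1 - alpha)*S[t-1]."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    smoothed = [prices[0]]
    for y in prices[1:]:
        smoothed.append(alpha * y + (1.0 - alpha) * smoothed[-1])
    return smoothed


def direction_targets(closes, d):
    """Label day i with +1 if the price d days later is higher, else -1.

    Treating "no change" as -1 is an assumption of this sketch.
    """
    return [1 if closes[i + d] > closes[i] else -1
            for i in range(len(closes) - d)]


# Toy closing prices (hypothetical values, not taken from the datasets).
closes = [10.0, 12.0, 11.0, 13.0, 9.0]
smoothed = exponential_smoothing(closes, alpha=0.5)
targets = direction_targets(smoothed, d=2)
print(smoothed)
print(targets)
```

The last d days receive no label because their future price is not available, so the target list is d entries shorter than the price series.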

2.2 Feature extraction process

In this study, the stock's closing prices are processed and collected beforehand, so that the input data are indexed by date. From these data, several technical indicators are computed as features, as shown in Fig. 1.

 

Fig. 1. Extracted features 

 

 

 

2.3 FSVM based Classification

The extracted features are fed into the FSVM model, which determines and assigns the appropriate class label to each input sample [21]. For the classification task, consider a dataset $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^n$ is an $n$-dimensional input vector and $y_i \in \{-1, +1\}$ is the corresponding class label, and let $\phi$ be the mapping function into the feature space. The separating hyperplane can be expressed by:

$$\omega \cdot \phi(x) + b = 0, \tag{3}$$

where $\omega$ is the parameter (weight) vector and $b$ is a constant. The optimal separating hyperplane with the maximum class margin is obtained by solving the following optimization problem [20]:

$$\min_{\omega, b} \; \frac{1}{2}\|\omega\|^2 \quad \text{s.t.} \quad y_i(\omega \cdot \phi(x_i) + b) \ge 1, \quad i = 1, \dots, N. \tag{4}$$

Even in the higher-dimensional feature space induced by the kernel function, the dataset is rarely linearly separable. To solve this problem, slack variables are introduced into the initial optimization problem:

$$\min_{\omega, b, \xi} \; \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{N} s_i \xi_i \quad \text{s.t.} \quad y_i(\omega \cdot \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N, \tag{5}$$

where $\xi_i$ is the slack variable that measures the classification error, $C$ controls the trade-off between the margin of the separating hyperplane and the misclassification error, and $s_i$ is the fuzzy membership value that reflects the importance of correctly classifying the corresponding data point. A larger fuzzy membership value means that it is more important to classify that data point correctly. The smaller the fuzzy membership value, the smaller the effect of the data point on the optimal separating hyperplane. Fig. 2 illustrates the SVM hyperplane.

Here $C$ is the regularization parameter of the FSVM, so the misclassification cost of data point $x_i$ is $s_i C$. By assigning comparatively small importance to the correct classification of atypical data points such as noise and outliers, the FSVM obtains a robust hyperplane. The optimization problem in Eq. (5) cannot be solved directly; therefore, its Lagrangian dual problem is formulated as:

$$\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{N} y_i \alpha_i = 0, \quad 0 \le \alpha_i \le s_i C, \tag{6}$$

where $\alpha_i$ is the Lagrange multiplier and $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ is the kernel function, i.e., the inner product of the feature vectors $\phi(x_i)$ and $\phi(x_j)$ in the feature space. By solving the KKT conditions, the weight vector of the separating hyperplane is obtained as:

$$\omega = \sum_{i=1}^{N} \alpha_i y_i \phi(x_i). \tag{7}$$

Finally, the decision function of the FSVM technique is given by:

$$f(x) = \operatorname{sign}(\omega \cdot \phi(x) + b) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b\right). \tag{8}$$
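As a hedged illustration of how such a decision function is evaluated, the sketch below assumes the standard kernel expansion, in which the sign of the weighted sum of kernel values between the support vectors and the query point, plus a bias, gives the class. The RBF kernel, the function names, and the support vectors, multipliers and bias are hypothetical values chosen for illustration; they do not come from a trained SSO-FSVM model.

```python
import math


def rbf_kernel(u, v, gamma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption of this sketch."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def fsvm_decision(x, support_vectors, labels, alphas, bias, gamma):
    """Return sign(sum_i alpha_i * y_i * K(x_i, x) + b) as +1 or -1."""
    score = bias
    for sv, y, alpha in zip(support_vectors, labels, alphas):
        score += alpha * y * rbf_kernel(sv, x, gamma)
    return 1 if score >= 0 else -1


# Hypothetical support vectors and multipliers (they satisfy sum_i alpha_i*y_i = 0).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]

up = fsvm_decision((1.8, 1.9), support_vectors, labels, alphas, bias=0.0, gamma=0.5)
down = fsvm_decision((0.1, 0.2), support_vectors, labels, alphas, bias=0.0, gamma=0.5)
print(up, down)
```

Only the support vectors (points with a non-zero multiplier) contribute to the sum, which is why prediction cost depends on their number rather than on the size of the training set.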

Fig. 2. SVM Hyperplanes

 

In the fuzzy membership function, the misclassification cost $s_i C$ is critical to classification performance. To reduce the effect of class imbalance and to distinguish the within-class importance of different training data points, let $s_i^+$ and $s_i^-$ denote the fuzzy membership values of a minority-class data point $x_i^+$ and a majority-class data point $x_i^-$, respectively. In the FSVM-CIL technique, the fuzzy membership functions are defined as

$$s_i^+ = f(x_i^+)\, r^+, \tag{9}$$

$$s_i^- = f(x_i^-)\, r^-. \tag{10}$$

The function $f(x_i)$ takes values between zero and one and reflects the importance of $x_i$ within its own class. The values of $r^+$ and $r^-$ reduce the effect of class imbalance by setting $r^+ > r^-$. Here $r^+ = 1$ and $r^- = r$, where $r$ is the ratio of the number of minority data points to the number of majority data points, so that $r < 1$ for imbalanced datasets. Noise in the boundary region is more harmful than noise elsewhere: noise far from the boundary is easily identified and handled, whereas boundary noise can distort the classification hyperplane. The function $f(x_i)$ consists of two parts and is defined as $f(x_i) = g(d(x_i))$, where $d(x_i)$ is a distance measure and $g$ is a fuzzy function that maps the distance to a fuzzy value between zero and one. Accordingly, many variations with different fuzzy functions and distance measures are possible.
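The membership calculation can be sketched as follows. The paper does not state which distance measure or fuzzy function was used, so this example assumes one common choice: the Euclidean distance of each point to its own class centre, mapped linearly to a value between zero and one, then scaled by 1 for the minority class and by the minority-to-majority ratio for the majority class. The function names, the small constant `delta`, and the toy points are assumptions for illustration.

```python
import math


def class_center(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def fuzzy_memberships(points, labels, delta=1e-6):
    """Membership = class weight * (1 - d / (d_max + delta)).

    Assumes label +1 marks the minority class and -1 the majority class.
    """
    minority = [p for p, y in zip(points, labels) if y == 1]
    majority = [p for p, y in zip(points, labels) if y == -1]
    ratio = len(minority) / len(majority)          # r < 1 for imbalanced data
    centers = {1: class_center(minority), -1: class_center(majority)}
    dist = [math.dist(p, centers[y]) for p, y in zip(points, labels)]
    d_max = {c: max(d for d, y in zip(dist, labels) if y == c) for c in (1, -1)}
    weight = {1: 1.0, -1: ratio}
    return [weight[y] * (1.0 - d / (d_max[y] + delta))
            for d, y in zip(dist, labels)]


# Toy data: three minority points, six majority points, the last one an outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.2),
          (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0), (5.5, 5.5), (9.0, 9.0)]
labels = [1, 1, 1, -1, -1, -1, -1, -1, -1]
memberships = fuzzy_memberships(points, labels)
print([round(m, 3) for m in memberships])
```

The resulting values can act as per-sample weights on the penalty parameter; for example, the `sample_weight` argument of scikit-learn's `SVC.fit` rescales C per sample, so the outlier in the toy data would have almost no influence on the hyperplane.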

2.4 SSO based Parameter Optimization Process

The SSO algorithm is employed to optimally adjust the parameters of the FSVM model. SSO simulates the collective behaviour of swallows, in which the interaction among flock members leads to better outcomes. It is a metaheuristic algorithm based on particular characteristics of swallows, including intelligent social relations, fast flight, and hunting skill [22]. The algorithm resembles PSO, but it has distinctive features not found in similar methods, notably the use of three kinds of particles, each with a specific responsibility: leader particles, explorer particles, and aimless particles. The explorer particles are responsible for searching the problem space. They perform this search under the control of a number of parameters [24]. The explorer particles use the following equations to update and explore their path:

 

Eq. (11) gives the velocity vector in the direction of the global leader.

Eqs. (12) and (13) compute the acceleration coefficient that directly weights the individual experience of each particle.

 

Eqs. (14) and (15) compute the acceleration coefficient that directly weights the collective knowledge of each particle. In fact, these two acceleration coefficients are determined from the position of each particle relative to the global leader and from its individual experience. The aimless particles use Eq. (16) for random movement:

In the SSO method, there are two kinds of leaders: the global leader and the local leaders. The particles are divided into several groups, and the particles in each group are usually similar. The best particle in each group is chosen as the local leader, and the best particle among the local leaders is chosen as the global leader. The particles change direction and converge according to the positions of these leaders.
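The roles of the three particle kinds can be sketched in Python. This is a simplified illustration, not the exact SSO update rules of [22]: the adaptive acceleration coefficients are replaced by fixed, hypothetical constants (an inertia factor of 0.5 and attraction factors of 1.5), the local leaders are simply the best few particles, and the aimless particles are the worst few, which are re-sampled at random. The objective shown is a toy stand-in for the validation error of an FSVM as a function of two parameters.

```python
import random


def swallow_swarm_minimize(objective, bounds, n_particles=20, n_local_leaders=3,
                           n_aimless=3, iterations=100, seed=0):
    """Simplified swallow-swarm-style search; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_point():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    swarm = [random_point() for _ in range(n_particles)]
    velocity = [[0.0] * dim for _ in range(n_particles)]
    best_pos, best_val = None, float("inf")

    for _ in range(iterations):
        values = [objective(p) for p in swarm]
        order = sorted(range(n_particles), key=values.__getitem__)
        if values[order[0]] < best_val:
            best_pos, best_val = list(swarm[order[0]]), values[order[0]]

        leaders = [list(swarm[i]) for i in order[:n_local_leaders]]  # local leaders
        head = leaders[0]                                            # global leader
        aimless = set(order[n_particles - n_aimless:])               # worst particles

        for i in order[n_local_leaders:]:
            if i in aimless:
                # Aimless particles wander randomly through the search space.
                swarm[i] = random_point()
                velocity[i] = [0.0] * dim
                continue
            # Explorer particles are pulled towards the global leader and
            # towards their nearest local leader.
            local = min(leaders, key=lambda l: sum((a - b) ** 2
                                                   for a, b in zip(l, swarm[i])))
            for k, (lo, hi) in enumerate(bounds):
                velocity[i][k] = (0.5 * velocity[i][k]
                                  + 1.5 * rng.random() * (head[k] - swarm[i][k])
                                  + 1.5 * rng.random() * (local[k] - swarm[i][k]))
                swarm[i][k] = min(max(swarm[i][k] + velocity[i][k], lo), hi)

    for p in swarm:  # evaluate the positions reached in the last iteration
        v = objective(p)
        if v < best_val:
            best_pos, best_val = list(p), v
    return best_pos, best_val


def toy_error(params):
    """Hypothetical error surface with its minimum at (1, -2)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


best_pos, best_val = swallow_swarm_minimize(toy_error, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_pos, best_val)
```

In a parameter-tuning setting, `toy_error` would be replaced by a function that trains the classifier with the candidate parameters and returns its validation error, and `bounds` would hold the admissible range of each parameter.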

3.    Results and Discussion

To assess the performance of the SSO-FSVM model, a comprehensive set of simulations was carried out on two datasets, namely Facebook (FB) and Apple (AAPL). Table 1 summarizes the results of the SSO-FSVM model on the two test datasets under different trading windows (TW).

 

 

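Table 1 Overall classification results analysis of SSO-FSVM model

| Company Name | Trading Window | Accuracy (%) | Recall (%) | Precision (%) | Specificity (%) | F-Score (%) |
|---|---|---|---|---|---|---|
| AAPL Stock | 3 | 69.10 | 75.20 | 71.20 | 61.20 | 71.20 |
| AAPL Stock | 5 | 76.96 | 83.20 | 77.20 | 60.20 | 78.20 |
| AAPL Stock | 10 | 81.35 | 86.20 | 86.20 | 80.20 | 85.20 |
| AAPL Stock | 15 | 86.10 | 88.20 | 87.20 | 81.20 | 87.20 |
| AAPL Stock | 30 | 89.03 | 91.20 | 90.20 | 84.20 | 90.20 |
| AAPL Stock | 60 | 93.68 | 97.20 | 95.20 | 90.20 | 96.20 |
| AAPL Stock | 90 | 97.43 | 99.20 | 97.20 | 94.20 | 97.20 |
| FB Stock | 3 | 70.64 | 77.20 | 71.20 | 67.20 | 76.20 |
| FB Stock | 5 | 77.60 | 89.20 | 73.20 | 65.20 | 81.20 |
| FB Stock | 10 | 84.70 | 95.20 | 85.20 | 73.20 | 88.20 |
| FB Stock | 15 | 89.42 | 94.20 | 94.20 | 85.20 | 93.20 |
| FB Stock | 30 | 91.67 | 99.20 | 96.20 | 86.20 | 97.20 |
| FB Stock | 60 | 91.47 | 100.20 | 93.20 | 64.20 | 98.20 |
| FB Stock | 90 | 98.34 | 100.20 | 100.20 | 76.20 | 100.20 |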
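For reference, the five measures reported in Table 1 can be computed from the counts of a binary confusion matrix, as in the short sketch below. It uses the standard definitions, under which every measure lies between 0 and 100%; the function name and the example counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, recall, precision, specificity and F-score in percent.

    tp/fp/tn/fn are the true-positive, false-positive, true-negative and
    false-negative counts of a binary (up/down) classifier.
    """
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    recall = 100.0 * tp / (tp + fn)          # sensitivity
    precision = 100.0 * tp / (tp + fp)
    specificity = 100.0 * tn / (tn + fp)
    f_score = 2.0 * precision * recall / (precision + recall)
    return {
        "accuracy": round(accuracy, 2),
        "recall": round(recall, 2),
        "precision": round(precision, 2),
        "specificity": round(specificity, 2),
        "f_score": round(f_score, 2),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=30, fn=20)
print(metrics)
```

The F-score is the harmonic mean of precision and recall, so it always falls between those two values.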

 

 

Fig. 3. Accuracy analysis of SSO-FSVM model: a) AAPL Stock b) FB Stock

 

Fig. 3 provides a brief accuracy analysis of the SSO-FSVM model on the two datasets. The figure shows that the accuracy of the SSO-FSVM model generally increases with the TW on both datasets. For instance, with a TW of 3, the SSO-FSVM model offered an accuracy of 69.10% and 70.64% on the AAPL and FB test datasets, respectively. Likewise, with a TW of 10, the model attained an accuracy of 81.35% and 84.70%. Similarly, with a TW of 30, the model achieved an accuracy of 89.03% and 91.67%. Finally, with a TW of 90, the model reached an accuracy of 97.43% and 98.34% on the AAPL and FB test datasets, respectively.

Fig. 4 shows the precision and recall analysis of the SSO-FSVM model on the AAPL test dataset. The figure shows that the SSO-FSVM model achieved effective outcomes under varying TW. For instance, under TW-3, the model obtained a precision and recall of 71.20% and 75.20%. Under TW-5, it attained a precision and recall of 77.20% and 83.20%. Under TW-10, it reached a precision and recall of 86.20% and 86.20%. Lastly, under TW-15, it achieved a precision and recall of 87.20% and 88.20%.

Fig. 5 shows the specificity and F-score analysis of the SSO-FSVM model on the AAPL test dataset. The figure shows that the performance of the SSO-FSVM model varies with the TW. For instance, under TW-3, the model attained a specificity and F-score of 61.20% and 71.20%. Under TW-5, it obtained a specificity and F-score of 60.20% and 78.20%. Under TW-10, it reached a specificity and F-score of 80.20% and 85.20%. Lastly, under TW-15, it achieved a specificity and F-score of 81.20% and 87.20%.

 

Fig. 4. Precision and recall analysis of SSO-FSVM model on AAPL dataset

Fig. 5. Specificity and F-score analysis of SSO-FSVM model on AAPL dataset

Fig. 6. Precision and recall analysis of SSO-FSVM model on FB dataset

Fig. 7. Specificity and F-score analysis of SSO-FSVM model on FB dataset

 

Fig. 6 shows the precision and recall analysis of the SSO-FSVM model on the FB test dataset. The figure shows that the SSO-FSVM model achieved effective outcomes under varying TW. For instance, under TW-3, the model obtained a precision and recall of 71.20% and 77.20%. Under TW-5, it attained a precision and recall of 73.20% and 89.20%. Under TW-10, it reached a precision and recall of 85.20% and 95.20%. Lastly, under TW-15, it achieved a precision and recall of 94.20% and 94.20%.

Fig. 7 shows the specificity and F-score analysis of the SSO-FSVM model on the FB test dataset. The figure shows that the performance of the SSO-FSVM model varies with the TW. For instance, under TW-3, the model attained a specificity and F-score of 67.20% and 76.20%. Under TW-5, it obtained a specificity and F-score of 65.20% and 81.20%. Under TW-10, it reached a specificity and F-score of 73.20% and 88.20%. Lastly, under TW-15, it achieved a specificity and F-score of 85.20% and 93.20%.

Finally, a detailed comparative accuracy analysis of the SSO-FSVM model is given in Table 2 and Fig. 8. The results show that the LR and SVM models obtained the lowest accuracy of 0.550 and 0.580, respectively. The ANN and XGBoost models attained moderate accuracy of 0.720 and 0.830, respectively. The WWO-MKELM, BA-XGB, and RF techniques achieved reasonably high accuracy of 0.971, 0.964, and 0.920, respectively. However, the proposed SSO-FSVM model achieved the best result, with a maximum accuracy of 0.983. Therefore, the SSO-FSVM model is found to be an effective tool for forecasting stock market trends.

 

 

Table 2 Comparative accuracy analysis of SSO-FSVM model with recent methods [23]

| Methods | Accuracy |
|---|---|
| Proposed Method | 0.983 |
| WWO-MKELM | 0.971 |
| BA-XGB | 0.964 |
| XGBOOST | 0.830 |
| RF | 0.920 |
| LR | 0.550 |
| SVM | 0.580 |
| ANN | 0.720 |

 

Fig. 8. Accuracy analysis of SSO-FSVM with recent methods

 

4.    Conclusion

In this study, an effective SSO-FSVM model has been presented to forecast the direction of stock markets. The proposed model comprises preprocessing, feature extraction, FSVM-based classification, and SSO-based parameter tuning. Using the SSO algorithm to fine-tune the FSVM parameters improves the overall predictive performance. To validate the SSO-FSVM model, a range of experiments was carried out on two benchmark datasets. The experimental results show that the SSO-FSVM model outperforms recent approaches in terms of several evaluation metrics. In future work, the predictive efficiency of the SSO-FSVM model can be improved by using feature selection and feature reduction approaches.

References

[1]      Y. Yuniningsih, S. Widodo, and M. B. N. Wajdi, “An analysis of decision making in the stock investment,” Economic Times: Journal of Economic and Islamic Law, vol. 8, no. 2, pp. 122–128, 2017.

[2]      L. Kengatharan and N. Kengatharan, “The influence of behavioral factors in making investment decisions and performance: study on investors of Colombo Stock Exchange, Sri Lanka,” Asian Journal of Finance & Accounting, vol. 6, no. 1, p. 1, 2014.

[3]      R. Batra and S. M. Daudpota, “Integrating stocktwits with sentiment analysis for better prediction of stock price movement,” in Proceedings of the 2018 International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), pp. 1–5, IEEE, Sukkur, Pakistan, March 2018.

[4]      S. C. Nayak, B. B. Misra, and H. S. Behera, “ACFLN: artificial chemical functional link network for prediction of stock market index,” Evolving Systems, vol. 10, no. 4, pp. 567–592, 2019.

[5]      S. Rehman, I. U. Chhapra, M. Kashif, and R. Rehan, “Are stock prices a random walk? An empirical evidence of Asian stock markets,” ETIKONOMI, vol. 17, no. 2, pp. 237–252, 2018.

[6]      S. Carta, A. Ferreira, A. S. Podda, R. D. Reforgiato, and A. Sanna, “Multi-DQN: an ensemble of deep Q-learning agents for stock market forecasting,” Expert Systems with Applications, vol. 164, Article ID 113820, 2021.

[7]      R. Efendi, N. Arbaiy, and M. M. Deris, “A new procedure in stock market forecasting based on fuzzy random auto-regression time series model,” Information Sciences, vol. 441, pp. 113–132, 2018.

[8]      S. P. Chatzis, V. Siakoulis, A. Petropoulos, E. Stavroulakis, and N. Vlachogiannakis, “Forecasting stock market crisis events using deep and statistical machine learning techniques,” Expert Systems with Applications, vol. 112, pp. 353–371, 2018.

[9]      R. Xiong, E. P. Nichols, and Y. Shen, “Deep learning stock volatility with Google domestic trends,” 2015.

[10]   Y. Yan and D. Yang, “A stock trend forecast algorithm based on deep neural networks,” Scientific Programming, vol. 2021, 2021.

[11]   P. Patil, C. S. M. Wu, K. Potika, and M. Orang, “Stock market prediction using ensemble of graph theory, machine learning and deep learning models,” in Proceedings of the 3rd International Conference on Software Engineering and Information Management, pp. 85–92, January 2020.

[12]   I. Parmar, N. Agarwal, S. Saxena, R. Arora, S. Gupta, H. Dhiman, and L. Chouhan, “Stock market prediction using machine learning,” in Proceedings of the 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), pp. 574–576, IEEE, December 2018.

[13]   R. Chiong, Z. Fan, Z. Hu, M. T. Adam, B. Lutz, and D. Neumann, “A sentiment analysis-based machine learning approach for financial market prediction via news disclosures,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 278–279, July 2018.

[14]   T. Mankar, T. Hotchandani, M. Madhwani, A. Chidrawar, and C. S. Lifna, “Stock market prediction based on social sentiments using machine learning,” in Proceedings of the 2018 International Conference on Smart City and Emerging Technology (ICSCET), pp. 1–3, IEEE, January 2018.

[15]   A. Moghar and M. Hamiche, “Stock market prediction using LSTM recurrent neural network,” Procedia Computer Science, vol. 170, pp. 1168–1173, 2020.

[16]   A. Pathak and N. P. Shetty, “Indian stock market prediction using machine learning and sentiment analysis,” in Computational Intelligence in Data Mining, pp. 595–603, Springer, Singapore, 2019.

[17]   E. K. Ampomah, G. Nyame, Z. Qin, P. C. Addo, E. O. Gyamfi, and M. Gyan, “Stock market prediction with Gaussian naïve Bayes machine learning algorithm,” Informatica, vol. 45, no. 2, 2021.

[18]   S. Kumar and S. Acharya, “Application of machine learning algorithms in stock market prediction: a comparative analysis,” in Handbook of Research on Smart Technology Models for Business and Industry, pp. 153–180, IGI Global, 2020.

[19]   O. D. Madeeh and H. S. Abdullah, “An efficient prediction model based on machine learning techniques for prediction of the stock market,” Journal of Physics: Conference Series, vol. 1804, no. 1, p. 012008, IOP Publishing, February 2021.

[20]   M. Jeyakarthic and S. Punitha, “An effective stock market direction prediction model using water wave optimization with multi-kernel extreme learning machine,” IIOAB J, vol. 11, pp. 103–109, 2020.

[21]   M. Zhou, Q. Zhao, and Y. Chen, “Endpoint prediction of BOF by flame spectrum and furnace mouth image based on fuzzy support vector machine,” Optik, vol. 178, pp. 575–581, 2019.

[22]   M. Neshat, G. Sepidnam, and M. Sargolzaei, “Swallow swarm optimization algorithm: a new method to optimization,” Neural Computing and Applications, vol. 23, no. 2, pp. 429–454, 2013.

[23]   S. Basak, S. Kar, S. Saha, L. Khaidem, and S. R. Dey, “Predicting the direction of stock market prices using tree-based classifiers,” The North American Journal of Economics and Finance, vol. 47, pp. 552–567, 2019.