Stock market direction prediction has become an essential task in the business sector. The inherently volatile behavior of stock markets worldwide makes prediction difficult, and improving the accuracy of direction prediction helps to reduce the risks involved in investment. To this end, this study designs a swallow swarm optimization (SSO) with fuzzy support vector machine (FSVM) model for stock market direction prediction. The proposed SSO-FSVM model encompasses preprocessing, feature extraction, FSVM-based classification, and SSO-based parameter tuning. Using the SSO algorithm to fine-tune the parameters of the FSVM model helps to significantly improve the overall predictive performance. To validate the SSO-FSVM model, a wide range of experiments were carried out on two benchmark datasets. The experimental results show that the SSO-FSVM model outperforms recent approaches on several evaluation metrics.

Keywords: Machine learning, Stock market, Prediction model, Fuzzy SVM, Classification, Feature extraction.

1. Introduction

The stock market is one of the major fields to which investors are committed; consequently, stock price trend prediction is consistently a hot topic for researchers from both the financial and technical domains [1]. In this research, our goal is to build a state-of-the-art prediction model for price trends, focusing on short-term price trend prediction. The high returns of the stock market have attracted a large number of investors, making stock investment one of the most popular means of investment and financial management [2-4]. To avoid the high risk that accompanies high returns, investors tirelessly pursue accurate analysis and prediction of the stock market. The trading data generated by daily stock transactions is considered to reflect the actual state of the market and is routinely used by investors to analyze and forecast it. However, because the stock market is affected by many factors, such as supply and demand, exchange rates, company operating conditions, policy changes, and market interest rates, stock prices fluctuate, which makes their analysis and prediction [5] very difficult.

As a recognized complex dynamic system, the stock market exhibits many complicating characteristics [6], such as nonstationarity, nonlinearity, high noise, and long memory, and it is hard to explain through simple mathematical models. Consequently, the analysis and prediction of the stock market have long been a very challenging task. The random walk theory [7] holds that stock price fluctuations are random and follow no rule. However, many researchers have found certain regularities in stock price changes, which suggests that the stock market has its own operating rules and lays the foundation for stock price prediction. Traditional stock market forecasting methods include fundamental analysis, technical analysis, multiple regression, and the autoregressive integrated moving average (ARIMA) method [8]. These methods are normally used to model stationary or linear time series. The amount of input data they can handle is limited, and the data needs pre-processing, for example differencing to smooth a nonstationary series. Therefore, traditional analysis and forecasting methods have certain limitations in stock market prediction [8].

Since the year 2000, the rise of machines has become a major change in the financial markets. Computers now dominate the trading activity that used to be carried out by humans. More specifically, traders use algorithms to acquire information and make trading decisions at very high speed. Traders use a variety of algorithms [9, 10]. Some use a market-making strategy to offer liquidity to other traders by posting bids and offers in the market. Some predict price changes by learning from past price changes in the market. The attempt to generate an excess return has shifted from forming a portfolio with a particular strategy to predicting the future price path. Nowadays, machine learning (ML) techniques play an essential role as the core algorithm for predicting future stock prices.

In [11], a new method based on graph concepts was presented. It exploits spatio-temporal relationship data among different stocks by modelling the stock market as a complex network. The graph-based approach is used with two methods to create two hybrid approaches. Two different kinds of graph are built: one from the correlation of past stock values, and the other a causation-based graph built from financial news. Parmar et al. [12] use regression and long short-term memory (LSTM) based ML methods to predict stock prices, considering the open, high, low, close, and volume factors. Chiong et al. [13] propose a sentiment analysis (SA) based algorithm for financial market prediction using news disclosures. Specifically, SA is performed in the pre-processing stage to extract sentiment-related features from financial news, and past stock market data, viewed as a time series, is included as input features. Using the extracted features, they build the predictive model with an SVM whose parameters are optimized by particle swarm optimization (PSO).

In [14], sentiment analysis of tweets collected through the Twitter API is combined with the closing prices of several stocks to build an architecture that predicts the stock values of different companies. Such a predictive model can help prospective investors make informed decisions that directly affect their returns. The work in [15] builds a methodology with RNN and LSTM models for predicting future stock market prices; its primary goal is to determine how accurately an ML method can forecast and how the number of epochs improves the model. Pathak and Shetty [16] combine several models into a strong predictive method that can handle the different situations in which investment could be advantageous. Existing methods based on SA or neural networks (NN) alone are narrow and lead to wrong results under some conditions. By integrating these methods, the predictive model can offer flexible and accurate suggestions, and embedding technical indicators helps investors reduce risk.

Ampomah et al. [17] try to fill a gap by evaluating the performance of the Gaussian naive Bayes (GNB) algorithm when combined with different feature extraction and feature scaling techniques for stock value prediction. The GNB configurations are ranked by Kendall's test of concordance over several evaluation metrics. The results show that the prediction model combining GNB with linear discriminant analysis (LDA) outperforms all the other models. Kumar and Acharya [18] consider two unsupervised and five supervised learning methods for stock value prediction and compare the performance of each. Among the supervised learning algorithms, LSTM performed better than the other methods, whereas among the unsupervised methods, the restricted Boltzmann machine (RBM) achieved the best performance. Madeeh and Abdullah [19] aim to construct a robust algorithm for stock market prediction. The process has three phases: pre-processing of the stock market dataset, application of two supervised ML methods, namely k-nearest neighbours (KNN) and random forest (RF), and assessment of the efficiency and accuracy of the predictions of the two algorithms.

This study designs a swallow swarm optimization (SSO) with fuzzy support vector machine (FSVM) model for stock market direction prediction. The proposed SSO-FSVM model encompasses preprocessing, feature extraction, FSVM-based classification, and SSO-based parameter tuning. Using the SSO algorithm to fine-tune the parameters of the FSVM model helps to significantly improve the overall predictive performance. To validate the SSO-FSVM model, a wide range of experiments were carried out on two benchmark datasets.

2. The Proposed SSO-FSVM Model

2.1 Preprocessing

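Exponential smoothing gives larger weights to recent observations [20]. The smoothed statistic $S_t$ of a series $Y$ can be computed recursively by Eq. (1):

$$S_0 = Y_0, \qquad S_t = \alpha Y_t + (1 - \alpha) S_{t-1} \quad \text{for } t > 0, \tag{1}$$

where $\alpha$ denotes the smoothing factor and $0 < \alpha < 1$. In addition, the target for the $i$-th day is computed using Eq. (2):

$$\text{target}_i = \operatorname{sign}\left(\text{close}_{i+d} - \text{close}_i\right), \tag{2}$$

where $\text{close}_i$ is the closing price on day $i$ and $d$ is the number of days ahead for which the direction is predicted.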
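The preprocessing step can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard exponential smoothing recursion $S_t = \alpha Y_t + (1 - \alpha) S_{t-1}$ and a direction label given by the sign of the $d$-day change in the (smoothed) closing price; the function names, the sample prices, and the choice to label a zero change as $-1$ are assumptions made for the example.

```python
def exponential_smoothing(series, alpha):
    """Return the exponentially smoothed series S, with S[0] = Y[0]."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    smoothed = [series[0]]
    for y in series[1:]:
        # S_t = alpha * Y_t + (1 - alpha) * S_{t-1}
        smoothed.append(alpha * y + (1.0 - alpha) * smoothed[-1])
    return smoothed


def direction_targets(close, d):
    """Label day i with +1 if the price d days later is higher, else -1.

    A zero change is labelled -1 here; that tie rule is an assumption.
    """
    return [1 if close[i + d] - close[i] > 0 else -1
            for i in range(len(close) - d)]


prices = [10.0, 12.0, 8.0, 12.0, 14.0]      # made-up closing prices
smoothed = exponential_smoothing(prices, alpha=0.5)
labels = direction_targets(smoothed, d=2)
print(smoothed)   # [10.0, 11.0, 9.5, 10.75, 12.375]
print(labels)     # [-1, -1, 1]
```

The last `d` days receive no label because their future price is not yet known, so the labelled set is shorter than the price series by `d` entries.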

2.2 Feature Extraction Process

In this study, the stocks' closing prices are used, and the features are derived from them. The input data can therefore be described as a series of closing prices indexed by date. From this series, a small set of indicators is computed, as shown in Fig. 1.

2.3 FSVM-Based Classification

The extracted features are fed into the FSVM classifier, which assigns the appropriate class label to each input sample [21]. Consider a dataset $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^n$ is an $n$-dimensional input vector and $y_i \in \{-1, +1\}$ is the corresponding class label, together with a mapping function $\Phi$ into a higher-dimensional feature space. The separating hyperplane can be expressed by:

$$\omega \cdot \Phi(x) + b = 0, \tag{3}$$

where $\omega$ is the weight (parameter) vector and $b$ represents the constant. The optimal separating hyperplane, with the largest class margin, is obtained by solving the optimization problem in Eq. (4) [20]:

$$\min_{\omega, b} \ \frac{1}{2}\,\omega \cdot \omega \quad \text{s.t.} \quad y_i\left(\omega \cdot \Phi(x_i) + b\right) \ge 1, \quad i = 1, \dots, N. \tag{4}$$

Even in the higher-dimensional feature space induced by the kernel function, the dataset is rarely linearly separable. To handle this, slack variables are introduced into the optimization problem, which for the FSVM becomes:

$$\min_{\omega, b, \xi} \ \frac{1}{2}\,\omega \cdot \omega + C \sum_{i=1}^{N} m_i \xi_i \quad \text{s.t.} \quad y_i\left(\omega \cdot \Phi(x_i) + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, N, \tag{5}$$

where $\xi_i$ is the slack variable that measures the classification error, $C$ controls the trade-off between the width of the margin and the misclassification error, and $m_i$ is the fuzzy membership value that reflects the importance of classifying the corresponding data point correctly. A larger fuzzy membership value means that the correct classification of that point matters more; the smaller the value, the smaller the effect of that point on the optimal separating hyperplane. Fig. 2 illustrates the architecture of the SVM hyperplane.
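The effect of the membership values on the soft-margin objective can be illustrated with a small sketch. It assumes the FSVM primal objective $\frac{1}{2}\|\omega\|^2 + C \sum_i m_i \xi_i$ with a linear kernel, and it minimises that objective by plain subgradient descent rather than through the dual problem; the function names, learning rate, epoch count, and toy data are all hypothetical.

```python
def train_linear_fsvm(X, y, m, C=1.0, lr=0.01, epochs=300):
    """Minimise 0.5*||w||^2 + C * sum_i m_i * max(0, 1 - y_i*(w.x_i + b))."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)            # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi, mi in zip(X, y, m):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:        # margin violated, so the slack is active
                for j in range(dim):
                    grad_w[j] -= C * mi * yi * xi[j]
                grad_b -= C * mi * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return the predicted class label, +1 or -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data: the last point is a mislabelled outlier with a low membership.
X = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0], [-2.5, -2.5]]
y = [1, 1, -1, -1, 1]
m = [1.0, 1.0, 1.0, 1.0, 0.1]
w, b = train_linear_fsvm(X, y, m)
print(predict(w, b, [2.5, 2.5]), predict(w, b, [-2.5, -2.5]))
```

Because the outlier carries a membership of 0.1, its contribution to the loss is scaled down by a factor of ten, and the learned hyperplane still separates the two clean clusters instead of bending toward the mislabelled point.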

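Here $C$ is the regularization parameter of the FSVM, so $m_i C$ acts as the misclassification cost of data point $x_i$. Therefore, by assigning comparatively small importance to the correct classification of points such as noise and outliers, the FSVM reaches a robust hyperplane. The FSVM optimization problem in Eq. (5) cannot be solved directly; instead, its Lagrangian dual problem is formulated as:

$$\max_{\alpha} \ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{N} y_i \alpha_i = 0, \quad 0 \le \alpha_i \le m_i C, \quad i = 1, \dots, N, \tag{6}$$

where $\alpha_i$ is the Lagrange multiplier and $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$ is the kernel function, i.e., the inner product of the feature vectors $\Phi(x_i)$ and $\Phi(x_j)$ in the feature space. By solving the KKT conditions, the separating hyperplane is given by:

$$\omega = \sum_{i=1}^{N} \alpha_i y_i \Phi(x_i). \tag{7}$$

Finally, the decision function of the FSVM is:

$$h(x) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b\right). \tag{8}$$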
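A kernel decision function of this kind can be evaluated as in the following sketch. It assumes the usual form $\operatorname{sign}\left(\sum_i \alpha_i y_i K(x_i, x) + b\right)$ with an RBF kernel; the kernel choice, the value of `gamma`, and the "trained" multipliers below are hypothetical values chosen for illustration, not quantities reported in this study.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def fsvm_decision(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Return sign(sum_i alpha_i * y_i * K(x_i, x) + bias) as +1 or -1."""
    score = bias + sum(a * yi * kernel(sv, x)
                       for sv, yi, a in zip(support_vectors, labels, alphas))
    return 1 if score >= 0 else -1


# Hypothetical trained quantities (not taken from the paper).
support_vectors = [[0.0, 0.0], [2.0, 0.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

print(fsvm_decision([2.0, 0.1], support_vectors, labels, alphas, bias))  # 1
print(fsvm_decision([0.0, 0.1], support_vectors, labels, alphas, bias))  # -1
```

Only the support vectors (points with a non-zero multiplier) enter the sum, and in an FSVM each multiplier is bounded above by $m_i C$, so a low-membership point can contribute at most a small weight to the decision.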

In the FSVM, the misclassification cost $m_i C$ is critical to classification performance. The fuzzy membership values are assigned both to reduce the effect of class imbalance and to reflect the within-class importance of the different training points. Let $m_i^{+}$ and $m_i^{-}$ denote the fuzzy membership values of a minority-class data point $x_i^{+}$ and a majority-class data point $x_i^{-}$, respectively. In the FSVM-CIL technique, the fuzzy membership functions are defined as:

$$m_i^{+} = f(x_i^{+})\, r^{+}, \qquad m_i^{-} = f(x_i^{-})\, r^{-}. \tag{9}$$

The output of the function $f(x_i)$ lies between zero and one and reflects the importance of $x_i$ within its own class. The values $r^{+}$ and $r^{-}$ reflect the class imbalance, using the setting $r^{+} > r^{-}$. Here $r^{+} = 1$ and $r^{-} = N^{+}/N^{-}$, where $N^{+}$ is the number of minority data points and $N^{-}$ is the number of majority data points, so that $r^{-} < 1$ for imbalanced datasets. Noise in the boundary region is more harmful than noise elsewhere: noise far from the boundary is easily identified and handled, whereas boundary noise can shift the classification hyperplane. The function $f(x_i)$ consists of two parts and is written as

$$f(x_i) = g\left(d(x_i)\right), \tag{10}$$

where $d(x_i)$ is a distance measure and $g$ is a fuzzy function that maps the distance to a fuzzy value between zero and one. A wide range of distance measures and fuzzy functions can therefore be substituted.
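One possible instance of such a membership assignment is sketched below. It assumes that each membership is the product of a within-class importance $f(x_i) = g(d(x_i))$ and a class scale factor, with $d$ taken as the Euclidean distance to the point's own class centre, $g$ a linear decay, the label $+1$ as the minority class, and scale factors $r^{+} = 1$ and $r^{-} = N^{+}/N^{-}$. These particular choices, like the function names, are assumptions made for illustration; other distance measures and decay functions fit the same template.

```python
import math


def class_centre(points):
    """Coordinate-wise mean of a list of points."""
    dim = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dim)]


def fsvm_cil_memberships(X, y, delta=1e-6):
    """Return m_i = f(x_i) * r for every point; label +1 is assumed minority."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == -1]
    scale = {1: 1.0, -1: len(pos) / len(neg)}        # r+ = 1, r- = N+/N-
    centre = {1: class_centre(pos), -1: class_centre(neg)}
    # d(x_i): Euclidean distance of each point to the centre of its own class
    dist = [math.dist(x, centre[label]) for x, label in zip(X, y)]
    d_max = {c: max(d for d, label in zip(dist, y) if label == c)
             for c in (1, -1)}
    memberships = []
    for d, label in zip(dist, y):
        f = 1.0 - d / (d_max[label] + delta)         # g: linear decay in (0, 1]
        memberships.append(f * scale[label])
    return memberships


X = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0], [13.0], [14.0], [12.0]]
y = [1, 1, 1, -1, -1, -1, -1, -1, -1]
print(fsvm_cil_memberships(X, y))
```

In this toy set the minority point at the class centre receives the full membership of 1, while no majority point can exceed $r^{-} = 0.5$, which is how the scheme offsets the imbalance between the classes.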

2.4 SSO-Based Parameter Optimization Process

To tune the parameters of the FSVM model optimally, the SSO algorithm is employed. SSO simulates the collective behaviour of swallows, in which the interaction among flock members leads to better outcomes. It is a metaheuristic algorithm based on particular traits of swallows, including intelligent social relations, fast flight, and hunting skill [22]. The algorithm is similar to PSO, but it has characteristics not found in comparable methods, notably the use of three classes of particles: leader particles ($l_i$), explorer particles ($e_i$), and aimless particles ($o_i$), each with a specific responsibility. The explorer particles $e_i$ are responsible for searching the problem space, and their behaviour is governed by a number of parameters [24]. An explorer particle uses the following formula to update its flight path:

$$V_{HL_{i+1}} = V_{HL_i} + \alpha_{HL}\,\mathrm{rand}()\,(e_{best} - e_i) + \beta_{HL}\,\mathrm{rand}()\,(HL_i - e_i), \tag{11}$$

where $e_{best}$ is the best position found so far by the particle and $HL_i$ is the position of the global leader. The acceleration coefficient $\alpha_{HL}$ directly affects the individual experience of each particle, and the acceleration coefficient $\beta_{HL}$ directly affects its collective experience. Both coefficients are computed from the position of the particle relative to its individual best and to the global leader [22]. An aimless particle $o_i$ uses Eq. (16) for its random movement:

$$o_{i+1} = o_i + \left[\mathrm{rand}(\{-1, 1\}) \cdot \frac{\mathrm{rand}(\min, \max)}{1 + \mathrm{rand}()}\right]. \tag{16}$$

In the SSO method, there are two kinds of leaders: local leaders and the global leader. The particles are divided into groups, and the particles in each group are usually similar. The best particle in each group is chosen and termed the local leader, and the best particle among the local leaders is then chosen and called the global leader. The movement and convergence of the particles depend on the positions of these leaders.
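The interplay of explorer particles, aimless particles, and the global leader can be sketched as follows. This is a deliberately simplified illustration and not the full algorithm of [22]: it uses a single flock without local leaders, replaces the adaptive acceleration coefficients with a fixed value of 1.5, and clips positions to the search bounds. All names and default values are hypothetical. In the setting of this paper, the objective `f` would score a candidate FSVM parameter vector, for example by validation error; a simple sphere function stands in for it here.

```python
import random


def sso_minimize(f, dim, bounds, n_explorers=20, n_aimless=5, iters=100, seed=0):
    """Simplified swallow-swarm-style search; returns (leader, best-value history)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def new_point():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def clip(v):
        return max(lo, min(hi, v))

    explorers = [new_point() for _ in range(n_explorers)]
    velocity = [[0.0] * dim for _ in range(n_explorers)]
    personal_best = [list(p) for p in explorers]
    aimless = [new_point() for _ in range(n_aimless)]

    leader = list(min(explorers + aimless, key=f))   # global leader
    history = [f(leader)]

    for _ in range(iters):
        for i, e in enumerate(explorers):
            for j in range(dim):
                # Pull toward the particle's own best and toward the leader.
                # The fixed 1.5 stands in for the adaptive coefficients.
                velocity[i][j] += (1.5 * rng.random() * (personal_best[i][j] - e[j])
                                   + 1.5 * rng.random() * (leader[j] - e[j]))
                e[j] = clip(e[j] + velocity[i][j])
            if f(e) < f(personal_best[i]):
                personal_best[i] = list(e)
        for o in aimless:
            for j in range(dim):
                # Random jump: o <- o + rand({-1, 1}) * rand(lo, hi) / (1 + rand())
                step = rng.choice((-1, 1)) * rng.uniform(lo, hi) / (1.0 + rng.random())
                o[j] = clip(o[j] + step)
        best_now = min(personal_best + aimless, key=f)
        if f(best_now) < f(leader):
            leader = list(best_now)
        history.append(f(leader))
    return leader, history


def sphere(p):
    return sum(v * v for v in p)


leader, history = sso_minimize(sphere, dim=2, bounds=(-5.0, 5.0), iters=30)
print(history[0], history[-1])
```

The aimless particles never follow the leader; their random jumps keep probing regions the explorers have left, and the leader is replaced only when a strictly better position is found, so the recorded best value never increases.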

3. Results and Discussion

To examine the performance of the SSO-FSVM model, a comprehensive set of simulations was carried out on two datasets, namely Facebook (FB) and Apple (AAPL) stock data. Table 1 summarizes the results of the SSO-FSVM model on the two test datasets under distinct trading windows (TW).

Table 1. Results of the SSO-FSVM model under distinct trading windows
Company Name	Trading Window	Accuracy (%)	Recall (%)	Precision (%)	Specificity (%)	F-Score (%)
AAPL Stock	3	69.10	75.20	71.20	61.20	71.20
AAPL Stock	5	76.96	83.20	77.20	60.20	78.20
AAPL Stock	10	81.35	86.20	86.20	80.20	85.20
AAPL Stock	15	86.10	88.20	87.20	81.20	87.20
AAPL Stock	30	89.03	91.20	90.20	84.20	90.20
AAPL Stock	60	93.68	97.20	95.20	90.20	96.20
AAPL Stock	90	97.43	99.20	97.20	94.20	97.20
FB Stock	3	70.64	77.20	71.20	67.20	76.20
FB Stock	5	77.60	89.20	73.20	65.20	81.20
FB Stock	10	84.70	95.20	85.20	73.20	88.20
FB Stock	15	89.42	94.20	94.20	85.20	93.20
FB Stock	30	91.67	99.20	96.20	86.20	97.20
FB Stock	60	91.47	100.20	93.20	64.20	98.20
FB Stock	90	98.34	100.20	100.20	76.20	100.20
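The five measures reported in the table follow from the entries of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are made up for illustration and do not correspond to any row of the table.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)                 # sensitivity, true positive rate
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)            # true negative rate
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f_score": f_score,
    }


# Made-up counts: 30 true positives, 20 false positives,
# 40 true negatives, 10 false negatives.
metrics = classification_metrics(tp=30, fp=20, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Each measure is a ratio of counts, so under these definitions every value lies between 0% and 100%.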

Fig. 3 provides a brief accuracy analysis of the SSO-FSVM model on the two datasets. The figure shows that the accuracy of the SSO-FSVM model generally increases with the trading window (TW) on both datasets. For instance, with a TW of 3, the SSO-FSVM model attained accuracies of 69.10% and 70.64% on the AAPL and FB test datasets, respectively. Likewise, with a TW of 10, it attained accuracies of 81.35% and 84.70%. Similarly, with a TW of 30, it achieved accuracies of 89.03% and 91.67%. Finally, with a TW of 90, it reached accuracies of 97.43% and 98.34% on the AAPL and FB test datasets, respectively.

Fig. 4 shows the precision and recall analysis of the SSO-FSVM model on the AAPL test dataset. The figure shows that the SSO-FSVM model achieves effective outcomes under varying TW. For instance, under a TW of 3, the SSO-FSVM model attained a precision and recall of 71.20% and 75.20%, respectively. Under a TW of 5, it attained a precision and recall of 77.20% and 83.20%. Under a TW of 10, it reached a precision and recall of 86.20% and 86.20%. Lastly, under a TW of 15, it achieved a precision and recall of 87.20% and 88.20%, respectively.

Fig. 5 shows the specificity and F-score analysis of the SSO-FSVM model on the AAPL test dataset. The figure shows that the performance of the SSO-FSVM model improves under varying TW. For instance, under a TW of 3, the SSO-FSVM model attained a specificity and F-score of 61.20% and 71.20%, respectively. Under a TW of 5, it attained a specificity and F-score of 60.20% and 78.20%. Under a TW of 10, it reached a specificity and F-score of 80.20% and 85.20%. Lastly, under a TW of 15, it achieved a specificity and F-score of 81.20% and 87.20%, respectively.

Fig. 6 shows the precision and recall analysis of the SSO-FSVM model on the FB test dataset. The figure shows that the SSO-FSVM model achieves effective outcomes under varying TW. For instance, under a TW of 3, the SSO-FSVM model attained a precision and recall of 71.20% and 77.20%, respectively. Under a TW of 5, it attained a precision and recall of 73.20% and 89.20%. Under a TW of 10, it reached a precision and recall of 85.20% and 95.20%. Lastly, under a TW of 15, it achieved a precision and recall of 94.20% and 94.20%, respectively.

Fig. 7 shows the specificity and F-score analysis of the SSO-FSVM model on the FB test dataset. The figure shows that the performance of the SSO-FSVM model improves under varying TW. For instance, under a TW of 3, the SSO-FSVM model attained a specificity and F-score of 67.20% and 76.20%, respectively. Under a TW of 5, it attained a specificity and F-score of 65.20% and 81.20%. Under a TW of 10, it reached a specificity and F-score of 73.20% and 88.20%. Lastly, under a TW of 15, it achieved a specificity and F-score of 85.20% and 93.20%, respectively.

Finally, a detailed comparative accuracy analysis of the SSO-FSVM model is given in Table 2 and Fig. 8. The results show that the LR and SVM models obtained the lowest accuracies of 0.550 and 0.580, respectively. The ANN and XGBoost models attained moderate accuracies of 0.720 and 0.830, respectively. The WWO-MKELM, BA-XGB, and RF techniques achieved reasonably high accuracies of 0.971, 0.964, and 0.920, respectively. However, the proposed SSO-FSVM model achieved the best outcome, with the maximum accuracy of 0.983. Therefore, the SSO-FSVM model is found to be an effective tool for forecasting stock market trends.

4. Conclusion

In this study, an effective SSO-FSVM model has been presented to forecast the direction of stock markets. The proposed SSO-FSVM model encompasses preprocessing, feature extraction, FSVM-based classification, and SSO-based parameter tuning. Using the SSO algorithm to fine-tune the parameters of the FSVM model helps to significantly improve the overall predictive performance. To validate the SSO-FSVM model, a wide range of experiments were carried out on two benchmark datasets. The experimental results show that the SSO-FSVM model outperforms recent approaches in terms of several evaluation metrics. In future work, the predictive efficiency of the SSO-FSVM model can be improved by the use of feature selection and reduction approaches.

References

[1] Y. Yuniningsih, S. Widodo, and M. B. N. Wajdi, “An analysis of decision making in the stock investment,” Economic Times: Journal of Economic and Islamic Law, vol. 8, no. 2, pp. 122–128, 2017.

[2] L. Kengatharan and N. Kengatharan, “The influence of behavioral factors in making investment decisions and performance: study on investors of Colombo Stock Exchange, Sri Lanka,” Asian Journal of Finance & Accounting, vol. 6, no. 1, p. 1, 2014.

[3] R. Batra and S. M. Daudpota, “Integrating stocktwits with sentiment analysis for better prediction of stock price movement,” in Proceedings of the 2018 International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), pp. 1–5, IEEE, Sukkur, Pakistan, March 2018.

[4] S. C. Nayak, B. B. Misra, and H. S. Behera, “ACFLN: artificial chemical functional link network for prediction of stock market index,” Evolving Systems, vol. 10, no. 4, pp. 567–592, 2019.

[5] S. Rehman, I. U. Chhapra, M. Kashif, and R. Rehan, “Are stock prices a random walk? an empirical evidence of Asian stock markets,” ETIKONOMI, vol. 17, no. 2, pp. 237–252, 2018.

[6] S. Carta, A. Ferreira, A. S. Podda, R. D. Reforgiato, and A. Sanna, “Multi-DQN: an ensemble of deep Q-learning agents for stock market forecasting,” Expert Systems with Applications, vol. 164, Article ID 113820, 2021.

[7] R. Efendi, N. Arbaiy, and M. M. Deris, “A new procedure in stock market forecasting based on fuzzy random auto-regression time series model,” Information Sciences, vol. 441, pp. 113–132, 2018.

[8] S. P. Chatzis, V. Siakoulis, A. Petropoulos, E. Stavroulakis, and N. Vlachogiannakis, “Forecasting stock market crisis events using deep and statistical machine learning techniques,” Expert Systems with Applications, vol. 112, pp. 353–371, 2018.

[9] R. Xiong, E. P. Nichols, and Y. Shen, “Deep learning stock volatility with Google domestic trends,” 2015.

[10] Y. Yan and D. Yang, “A stock trend forecast algorithm based on deep neural networks,” Scientific Programming, 2021.

[11] P. Patil, C. S. M. Wu, K. Potika, and M. Orang, “Stock market prediction using ensemble of graph theory, machine learning and deep learning models,” in Proceedings of the 3rd International Conference on Software Engineering and Information Management, pp. 85–92, January 2020.

[12] I. Parmar, N. Agarwal, S. Saxena, R. Arora, S. Gupta, H. Dhiman, and L. Chouhan, “Stock market prediction using machine learning,” in Proceedings of the 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), pp. 574–576, IEEE, December 2018.

[13] R. Chiong, Z. Fan, Z. Hu, M. T. Adam, B. Lutz, and D. Neumann, “A sentiment analysis-based machine learning approach for financial market prediction via news disclosures,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 278–279, July 2018.

[14] T. Mankar, T. Hotchandani, M. Madhwani, A. Chidrawar, and C. S. Lifna, “Stock market prediction based on social sentiments using machine learning,” in Proceedings of the 2018 International Conference on Smart City and Emerging Technology (ICSCET), pp. 1–3, IEEE, January 2018.

[15] A. Moghar and M. Hamiche, “Stock market prediction using LSTM recurrent neural network,” Procedia Computer Science, vol. 170, pp. 1168–1173, 2020.

[16] A. Pathak and N. P. Shetty, “Indian stock market prediction using machine learning and sentiment analysis,” in Computational Intelligence in Data Mining, pp. 595–603, Springer, Singapore, 2019.

[17] E. K. Ampomah, G. Nyame, Z. Qin, P. C. Addo, E. O. Gyamfi, and M. Gyan, “Stock market prediction with Gaussian naïve Bayes machine learning algorithm,” Informatica, vol. 45, no. 2, 2021.

[18] S. Kumar and S. Acharya, “Application of machine learning algorithms in stock market prediction: a comparative analysis,” in Handbook of Research on Smart Technology Models for Business and Industry, pp. 153–180, IGI Global, 2020.

[19] O. D. Madeeh and H. S. Abdullah, “An efficient prediction model based on machine learning techniques for prediction of the stock market,” Journal of Physics: Conference Series, vol. 1804, no. 1, p. 012008, IOP Publishing, February 2021.

[20] M. Jeyakarthic and S. Punitha, “An effective stock market direction prediction model using water wave optimization with multi-kernel extreme learning machine,” IIOAB J, vol. 11, pp. 103–109, 2020.

[21] M. Zhou, Q. Zhao, and Y. Chen, “Endpoint prediction of BOF by flame spectrum and furnace mouth image based on fuzzy support vector machine,” Optik, vol. 178, pp. 575–581, 2019.

[22] M. Neshat, G. Sepidnam, and M. Sargolzaei, “Swallow swarm optimization algorithm: a new method to optimization,” Neural Computing and Applications, vol. 23, no. 2, pp. 429–454, 2013.

[23] S. Basak, S. Kar, S. Saha, L. Khaidem, and S. R. Dey, “Predicting the direction of stock market prices using tree-based classifiers,” The North American Journal of Economics and Finance, vol. 47, pp. 552–567, 2019.

Table 2. Comparative accuracy analysis of the SSO-FSVM model with recent approaches
Methods	Accuracy
SSO-FSVM (proposed)	0.983
WWO-MKELM	0.971
BA-XGB	0.964
XGBoost	0.830
RF	0.920
LR	0.550
SVM	0.580
ANN	0.720