Stock market prediction is a notoriously challenging problem, both because of the sheer number of factors involved and because many of those factors are time dependent. We experimented with several statistical and machine learning approaches to obtain better predictions of the closing price of AMD shares. The models are listed below.
- ARIMA Model
- LSTM Approach
- Encoder Decoder LSTM Approach
- Encoder Decoder LSTM Approach with Net Cross Validation Method
Let’s briefly discuss what we learnt about the performance of some of these models in predicting the stock market close price, before moving on to a detailed discussion of the Encoder Decoder LSTM approach.
When we examined the original daily close price series for the last two years (2017 to 2018), depicted in the Figure 2 graph, no clear seasonal or trend patterns could be identified. Stock prices also depend heavily on the current market situation and may therefore fall or rise unpredictably. After fitting an ARIMA model to the AMD stock market data, we observed that its predictions were not consistently good, even though valid train and test sets were prepared from the historical data.
Next, our focus shifted to experimenting with different machine learning models. Initially, the Long Short-Term Memory (LSTM) approach was applied to the AMD data. An LSTM network is a Recurrent Neural Network (RNN) trained using backpropagation through time. The architecture of the LSTM is given below.
Figure 2 shows the actual AMD close price in green and the predicted close price in brown on the testing data, as predicted by the LSTM approach. The developed model is capable of producing only one prediction into the future, so we fed the predicted value back in to forecast the next day’s close price, and similarly obtained closing prices for the next three days. Even though the predicted and actual close prices on the testing data are very close, the predictions for the next three days deviated considerably from the actual close prices. The reason for this deviation is that every subsequent forecast builds on the first predicted value, which itself deviates significantly from the actual price, so the error compounds.
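The recursive strategy described above can be sketched as follows. This is an illustrative sketch, not the original code: `forecast_recursive` and `MeanModel` are hypothetical names, and the dummy `MeanModel` (which just predicts the mean of its input window) stands in for a fitted one-step Keras LSTM so the loop can be shown self-contained.

```python
import numpy as np

def forecast_recursive(model, last_window, n_steps):
    """Forecast n_steps ahead by feeding each prediction back as input.

    model: any object with a predict(X) method that takes input of shape
           (1, window_size, 1) and returns shape (1, 1) -- e.g. a fitted
           one-step Keras LSTM (a dummy stand-in is used below).
    last_window: 1-D array of the most recent observed values.
    """
    size = len(last_window)
    window = list(last_window)
    preds = []
    for _ in range(n_steps):
        x = np.array(window[-size:]).reshape(1, size, 1)
        yhat = float(model.predict(x)[0, 0])
        preds.append(yhat)
        window.append(yhat)  # feed the prediction back as the newest input
    return preds

class MeanModel:
    """Dummy stand-in that predicts the mean of the input window."""
    def predict(self, x):
        return x.mean(axis=1)

print(forecast_recursive(MeanModel(), np.arange(1.0, 6.0), 3))
```

Because each predicted value re-enters the input window, any error in the first prediction propagates into all later ones, which is the compounding effect observed above.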
Further, our aim was to predict the next three or five days directly from a single model. Hence, we experimented with another approach, namely the Encoder Decoder LSTM, which is especially designed to address sequence-to-sequence problems.
Let’s understand the concept behind this approach.
Figure 3 illustrates the architecture of Encoder Decoder LSTM model.
The following describes the concepts and how we implemented the Encoder Decoder LSTM model for AMD stock market data in Python with Keras.
Initially, we need the following libraries for developing the Encoder Decoder LSTM model with Keras.
Code Snippet 01 (Imports)
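As a sketch of what such an import list might look like (assuming the `tensorflow.keras` API; the standalone `keras` package exposes the same names):

```python
# Libraries typically needed for an Encoder-Decoder LSTM in Keras.
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, RepeatVector, TimeDistributed
from sklearn.metrics import mean_squared_error, mean_absolute_error
```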
Suppose we have a time series data set of size 550. Before developing a machine learning time series model, we need to divide the data set into three parts: train, validation and test. To build the model, 500 records are used as training data, 25 as validation data (used to tune the hyperparameters of the model) and 25 records as test data (to check the performance of the final model).
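Since the data is a time series, the split must be chronological. A minimal sketch, using a synthetic stand-in series of length 550:

```python
import numpy as np

series = np.arange(550, dtype=float)  # stand-in for 550 daily close prices

# Chronological split -- never shuffle a time series:
# train first, then validation, then test.
train, validation, test = series[:500], series[500:525], series[525:]
print(len(train), len(validation), len(test))  # 500 25 25
```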
In the Encoder Decoder LSTM approach, the training data set is reframed as below.
As depicted above, the input sequence size is 20 while the output sequence size is 5. The input and output sizes can be changed according to the problem specification. Here, we used 20 days of data for the input sequence, which covers about a month of trading days, while the output uses 5 days of data, which covers a trading week.
Accordingly, in the training data set, the data for the first 20 days are taken as input and the data for the next 5 days, starting from the 21st day, are taken as output. Then the data from the 2nd day to the 21st day form the next input, and the data for the five days starting from the 22nd day form its output.
Similarly, input sequences and their corresponding output sequences are arranged until the entire training data set is used. Here, the output size is smaller than the input size, and it is generally decided by the prediction horizon (how far into the future the model should predict in a single prediction). Also, the LSTM model requires three-dimensional input of shape (samples, time steps, features), where the number of time steps is the input size (i.e., 20) and the number of features is 1.
When defining the data for the model, input sequences and their corresponding output sequences are first arranged in two-dimensional arrays.
Code Snippet 02
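A sketch of the sliding-window reframing described above (the function name `reframe` is an assumption; with 500 training records, 20-step inputs and 5-step outputs there are 500 − 20 − 5 + 1 = 476 samples):

```python
import numpy as np

def reframe(series, n_in=20, n_out=5):
    """Slide a window over the series: each sample pairs n_in input
    steps with the n_out steps that immediately follow them."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(y)

train = np.arange(500, dtype=float)  # stand-in for the 500 training prices
X, y = reframe(train)
print(X.shape, y.shape)  # (476, 20) (476, 5)
```

Note how the second input window starts at day 2, exactly as described in the text.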
Since the data is defined in two-dimensional arrays, we need to reshape it into a three-dimensional input as shown in the following code.
Code Snippet 03
Similarly, the output should be reshaped, since the Encoder Decoder model requires it in three dimensions.
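Both reshaping steps can be sketched as follows; `X` and `y` here are zero-filled stand-ins with the same 2-D shapes as the reframed arrays above:

```python
import numpy as np

X = np.zeros((476, 20))  # stand-in for the reframed inputs
y = np.zeros((476, 5))   # stand-in for the reframed outputs

# The encoder LSTM expects input of shape (samples, time steps, features);
# with a single (univariate) series there is one feature.
X = X.reshape((X.shape[0], X.shape[1], 1))

# The decoder's TimeDistributed output layer likewise expects the target
# as (samples, output steps, features).
y = y.reshape((y.shape[0], y.shape[1], 1))

print(X.shape, y.shape)  # (476, 20, 1) (476, 5, 1)
```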
Now let’s see how the model is defined. According to the architecture of the Encoder Decoder LSTM model (Figure 3), the second step is to define an LSTM encoder model, which reads and encodes the input X defined above. The encoded input sequence is then repeated once for each output time step required by the model; in our scenario, the number of output time steps is five. This is carried out by the RepeatVector layer. The repeated vectors are then passed into the decoder LSTM layer.
In Figure 3, a Dense layer follows the decoder LSTM layer. It serves as the output layer, is wrapped in a TimeDistributed layer, and produces one output for each step in the output sequence. The Adam stochastic gradient descent optimizer is used to train the model.
The following code defines the encoder decoder LSTM model as explained above.
Code Snippet 04
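A sketch of such a model definition, assuming the `tensorflow.keras` API; the hidden size of 100 units is an assumption, as the text does not state it:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

n_in, n_out = 20, 5   # input/output sequence lengths from the text
n_units = 100         # hidden size -- an assumption, not stated in the text

model = Sequential()
# Encoder: reads the 20-step input and encodes it into a fixed-size vector.
model.add(LSTM(n_units, input_shape=(n_in, 1)))
# Repeat the encoded vector once per required output time step.
model.add(RepeatVector(n_out))
# Decoder: unrolls over the 5 output time steps.
model.add(LSTM(n_units, return_sequences=True))
# TimeDistributed Dense gives one output value per decoded step.
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
model.summary()
```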
After defining the model, it can be fitted with the training data, while the validation data is used for tuning the hyperparameters (e.g., batch size). Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) were calculated at each iteration, and the average RMSE and average MAE were used to find the best model. After finding a model with optimal hyperparameters, the test data was used to measure the model’s performance.
Code Snippet 05
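The two error metrics can be sketched with plain NumPy; the fitting call in the comment is illustrative only (the batch size shown is a placeholder, not the tuned value):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    d = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(d ** 2)))

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    d = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.mean(np.abs(d)))

# In practice the model is fitted first, e.g.:
#   model.fit(X, y, epochs=10, batch_size=16,   # placeholders, not tuned values
#             validation_data=(X_val, y_val))
#   y_pred = model.predict(X_test)

y_true = [1.0, 2.0, 3.0, 4.0]   # toy actuals
y_pred = [1.1, 1.9, 3.2, 3.8]   # toy predictions
print(round(rmse(y_true, y_pred), 4), round(mae(y_true, y_pred), 4))
# -> 0.1581 0.15
```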
The following graph illustrates how the Encoder Decoder LSTM model performs at each epoch.
According to the Figure 4 graph, the lowest RMSE and MAE values were obtained when the model was run for 10 epochs. Hence, the model with 10 epochs was chosen as the most appropriate. On the test data, this model achieved an RMSE of 0.3404 and an MAE of 0.2667.
The time series graph below (Figure 5) depicts how the original close price behaved over the last two years (November 2017 to January 2019), how the actual and predicted close prices vary within the period 07-01-2019 to 11-01-2019, and the predictions for the next five days (the unseen future) from 14-01-2019. In addition, Figure 6 illustrates separately how the above-mentioned actual and predicted close prices for the test data vary over the considered period, together with the predicted close prices for the next five days.
The Encoder Decoder LSTM model is designed to address sequence-to-sequence prediction, which takes a sequence as input and produces a prediction sequence as output; selecting appropriate lengths for the input and output sequences can therefore be a challenge. This type of problem is referred to as a many-to-many sequence prediction problem. Depending on the scenario, the model can instead be framed as a sequence of one input to a sequence of one output, or as a sequence of multiple inputs to a sequence of one output; these are referred to as one-to-one and many-to-one sequence prediction problems, respectively. To implement the encoder as well as the decoder, one or more LSTM layers can be used. The size of the encoded vector in this model is fixed, but the lengths of the input and output sequences can differ.
Having found the most appropriate model for the AMD stock market data under the Encoder Decoder LSTM approach, our next aim is to improve it further using the Net Cross Validation method, since we want to examine how the model performs when the input sequences are gradually increased. In the upcoming article, we will discuss the Encoder Decoder LSTM approach with the Net Cross Validation method.
Geemini I. Kulawadana
Indula holds a B.Sc. (Hons) Degree in Physical Science and an M.Sc. in Applied Statistics from the University of Colombo, Sri Lanka. The recipient of multiple gold medals during her postgraduate degree in Applied Statistics, she has one research publication in the EPH International Journal and abstract publications at international conferences.