Stock Price Prediction of Apple with PyTorch

13 minute read

LSTM and GRU

Time Series

Time series forecasting is one of Machine Learning’s most captivating domains, offering potentially significant benefits when applied to advanced subjects like stock price prediction. In essence, time series forecasting employs a model to anticipate future data points by exploiting the patterns found in previously observed values.

A time series, by definition, is a sequence of data points arranged in chronological order. This kind of problem holds significant relevance because numerous prediction challenges incorporate a temporal aspect. Uncovering the relationship between data and time is crucial to analyses such as weather forecasting or earthquake prediction. However, these problems are sometimes overlooked because modeling their temporal relationships is non-trivial.

Stock market prediction entails efforts to estimate the future value of a company’s stock. The accurate prognostication of a stock’s future price can result in substantial gains, fitting this scenario into the realm of time series problems.

Over time, numerous methods have been developed to predict stock prices accurately, given their volatile and complex fluctuations. Neural networks, particularly Recurrent Neural Networks (RNNs), have proven especially applicable in this field. In this post, we will construct two distinct RNN models — Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) — using PyTorch. We aim to forecast Apple’s stock price and compare the two models in terms of accuracy and training time.

Recurrent Neural Network (RNN)

A Recurrent Neural Network (RNN) is a particular breed of artificial neural network crafted to discern patterns in sequential data to anticipate ensuing events. The power of this architecture lies in its interconnected nodes, enabling it to demonstrate dynamic behavior over time. Another notable attribute of this structure is the utilization of feedback loops for sequence processing. This feature facilitates the persistence of information, often likened to memory, rendering RNNs ideal for Natural Language Processing (NLP) and time series problems. This foundational structure gave rise to advanced architectures such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU).
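To make the feedback loop concrete, here is a minimal sketch of a single recurrent step in plain PyTorch. The sizes, weights, and inputs are illustrative assumptions, not part of the models we build below.

import torch

torch.manual_seed(0)
input_size, hidden_size = 3, 4                # illustrative sizes
W_xh = torch.randn(hidden_size, input_size)   # input-to-hidden weights
W_hh = torch.randn(hidden_size, hidden_size)  # hidden-to-hidden weights (the feedback loop)
b_h = torch.zeros(hidden_size)

h = torch.zeros(hidden_size)                  # initial hidden state -- the "memory"
for x_t in [torch.randn(input_size) for _ in range(5)]:
    # each step mixes the new input with the previous hidden state
    h = torch.tanh(W_xh @ x_t + W_hh @ h + b_h)
print(h)  # the final hidden state summarizes the whole sequence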

An LSTM unit comprises a cell and three gates: an input gate, an output gate, and a forget gate. The cell retains values over arbitrary time intervals, while the three gates control the flow of information into and out of the cell.

(Figure: an LSTM unit, showing the memory cell and its input, output, and forget gates.)
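For reference, the standard LSTM update equations are (with $\sigma$ the sigmoid and $\odot$ element-wise multiplication):

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
$$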

Conversely, a GRU has fewer parameters than an LSTM because it lacks an output gate (a quick check below makes the difference concrete). Both architectures, however, address the “short-term memory” problem of basic RNNs and can successfully maintain long-term correlations in sequential data.

(Figure: a GRU unit and its gates.)
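The parameter difference is easy to verify: each LSTM layer carries four weight blocks (the three gates plus the candidate cell state), where a GRU layer carries three. A quick check, using the same layer sizes we will adopt later:

import torch.nn as nn

lstm = nn.LSTM(input_size=1, hidden_size=32, num_layers=2, batch_first=True)
gru = nn.GRU(input_size=1, hidden_size=32, num_layers=2, batch_first=True)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print("LSTM parameters:", n_params(lstm))  # 12928 with these sizes
print("GRU parameters: ", n_params(gru))   # 9696 -- exactly 3/4 of the LSTM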

While LSTM is the more popular choice at present, GRU is often favored for training faster while maintaining comparable accuracy, and some anticipate it will eventually overtake LSTM. We may well observe a similar outcome here, with the GRU model performing slightly better under these conditions.

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# The data files are available in the local "./archive" directory;
# running this cell lists every file in it.

import os
for dirname, _, filenames in os.walk('./archive'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

./archive/JPM_2006-01-01_to_2018-01-01.csv
./archive/MSFT_2006-01-01_to_2018-01-01.csv
./archive/JNJ_2006-01-01_to_2018-01-01.csv
./archive/UNH_2006-01-01_to_2018-01-01.csv
./archive/CAT_2006-01-01_to_2018-01-01.csv
./archive/AABA_2006-01-01_to_2018-01-01.csv
./archive/HD_2006-01-01_to_2018-01-01.csv
./archive/CVX_2006-01-01_to_2018-01-01.csv
./archive/MMM_2006-01-01_to_2018-01-01.csv
./archive/AMZN_2006-01-01_to_2018-01-01.csv
./archive/CSCO_2006-01-01_to_2018-01-01.csv
./archive/XOM_2006-01-01_to_2018-01-01.csv
./archive/all_stocks_2017-01-01_to_2018-01-01.csv
./archive/VZ_2006-01-01_to_2018-01-01.csv
./archive/WMT_2006-01-01_to_2018-01-01.csv
./archive/GS_2006-01-01_to_2018-01-01.csv
./archive/AAPL_2006-01-01_to_2018-01-01.csv
./archive/AXP_2006-01-01_to_2018-01-01.csv
./archive/all_stocks_2006-01-01_to_2018-01-01.csv
./archive/GOOGL_2006-01-01_to_2018-01-01.csv
./archive/UTX_2006-01-01_to_2018-01-01.csv
./archive/KO_2006-01-01_to_2018-01-01.csv
./archive/MRK_2006-01-01_to_2018-01-01.csv
./archive/TRV_2006-01-01_to_2018-01-01.csv
./archive/IBM_2006-01-01_to_2018-01-01.csv
./archive/INTC_2006-01-01_to_2018-01-01.csv
./archive/PFE_2006-01-01_to_2018-01-01.csv
./archive/GE_2006-01-01_to_2018-01-01.csv
./archive/DIS_2006-01-01_to_2018-01-01.csv
./archive/PG_2006-01-01_to_2018-01-01.csv
./archive/BA_2006-01-01_to_2018-01-01.csv
./archive/MCD_2006-01-01_to_2018-01-01.csv
./archive/NKE_2006-01-01_to_2018-01-01.csv

Implementation

The dataset contains historical stock prices for several companies. We are going to predict the Close price of Apple’s stock; the plot below shows its behavior over the years.

filepath = './archive/AAPL_2006-01-01_to_2018-01-01.csv'
data = pd.read_csv(filepath)
data = data.sort_values('Date')
data.head()
   Date        Open   High   Low    Close  Volume     Name
0  2006-01-03  10.34  10.68  10.32  10.68  201853036  AAPL
1  2006-01-04  10.73  10.85  10.64  10.71  155225609  AAPL
2  2006-01-05  10.69  10.70  10.54  10.63  112396081  AAPL
3  2006-01-06  10.75  10.96  10.65  10.90  176139334  AAPL
4  2006-01-09  10.96  11.03  10.82  10.86  168861224  AAPL

import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("darkgrid")
plt.figure(figsize = (15,9))
plt.plot(data[['Close']])
plt.xticks(range(0,data.shape[0],500),data['Date'].loc[::500],rotation=45)
plt.title("Apple Stock Price",fontsize=18, fontweight='bold')
plt.xlabel('Date',fontsize=18)
plt.ylabel('Close Price (USD)',fontsize=18)
plt.show()
price = data[['Close']].copy() # .copy() so the in-place normalization below doesn't trigger a pandas SettingWithCopyWarning
price.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3019 entries, 0 to 3018
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Close   3019 non-null   float64
dtypes: float64(1)
memory usage: 47.2 KB

We slice the data frame to keep only the column we want, then normalize the values to the range (-1, 1).

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(-1, 1))
price['Close'] = scaler.fit_transform(price['Close'].values.reshape(-1,1))

We’re now ready to partition the data into training and test sets. But before that, we need to choose the width of the lookback window: using a fixed number of previous time steps to forecast the next one is known as the sliding window approach.
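Before looking at the full function, here is a toy illustration of the windowing (the values are made up): each window of length lookback contributes its first lookback - 1 points as the input sequence and its final point as the target, which is why x_train below ends up with 19 time steps for lookback = 20.

import numpy as np

series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
lookback = 3
# same windowing as split_data below
windows = np.array([series[i: i + lookback] for i in range(len(series) - lookback)])
X, y = windows[:, :-1], windows[:, -1]
print(X)  # inputs:  [1 2], [2 3], [3 4]
print(y)  # targets: 3, 4, 5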

def split_data(stock, lookback):
    data_raw = stock.to_numpy()  # convert to numpy array
    data = []

    # create all possible sequences of length `lookback`
    for index in range(len(data_raw) - lookback):
        data.append(data_raw[index: index + lookback])

    data = np.array(data)
    test_set_size = int(np.round(0.2 * data.shape[0]))  # hold out 20% for testing
    train_set_size = data.shape[0] - test_set_size

    # in each window, the first lookback-1 points are the input
    # and the last point is the prediction target
    x_train = data[:train_set_size, :-1, :]
    y_train = data[:train_set_size, -1, :]

    x_test = data[train_set_size:, :-1, :]
    y_test = data[train_set_size:, -1, :]

    return [x_train, y_train, x_test, y_test]
lookback = 20 # choose sequence length
x_train, y_train, x_test, y_test = split_data(price, lookback)
print('x_train.shape = ',x_train.shape)
print('y_train.shape = ',y_train.shape)
print('x_test.shape = ',x_test.shape)
print('y_test.shape = ',y_test.shape)
x_train.shape =  (2399, 19, 1)
y_train.shape =  (2399, 1)
x_test.shape =  (600, 19, 1)
y_test.shape =  (600, 1)

Next, we convert them into tensors, the foundational data structure required for constructing a model in PyTorch.

import torch
import torch.nn as nn

x_train = torch.from_numpy(x_train).type(torch.Tensor)
x_test = torch.from_numpy(x_test).type(torch.Tensor)
y_train_lstm = torch.from_numpy(y_train).type(torch.Tensor)
y_test_lstm = torch.from_numpy(y_test).type(torch.Tensor)
y_train_gru = torch.from_numpy(y_train).type(torch.Tensor)
y_test_gru = torch.from_numpy(y_test).type(torch.Tensor)
input_dim = 1
hidden_dim = 32
num_layers = 2
output_dim = 1
num_epochs = 100

LSTM

class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        # batch_first=True expects input of shape (batch, seq_len, features)
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # fresh zero-initialized hidden and cell states for each batch
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # use only the last time step's output to predict the next value
        out = self.fc(out[:, -1, :])
        return out
model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
criterion = torch.nn.MSELoss(reduction='mean')
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)
import time

hist = np.zeros(num_epochs)
start_time = time.time()
lstm = []

for t in range(num_epochs):
    # forward pass over the entire training set (full-batch training)
    y_train_pred = model(x_train)

    loss = criterion(y_train_pred, y_train_lstm)
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()

    # backward pass and parameter update
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    
training_time = time.time()-start_time
print("Training time: {}".format(training_time))
Epoch  0 MSE:  0.25819653272628784
Epoch  1 MSE:  0.15166231989860535
Epoch  2 MSE:  0.20903076231479645
Epoch  3 MSE:  0.13991586863994598
Epoch  4 MSE:  0.1287676990032196
Epoch  5 MSE:  0.13563665747642517
Epoch  6 MSE:  0.13573406636714935
Epoch  7 MSE:  0.12434989213943481
Epoch  8 MSE:  0.1031389907002449
Epoch  9 MSE:  0.0779111385345459
Epoch  10 MSE:  0.06311900913715363
Epoch  11 MSE:  0.07091287523508072
Epoch  12 MSE:  0.052488457411527634
Epoch  13 MSE:  0.023740172386169434
Epoch  14 MSE:  0.01886197179555893
Epoch  15 MSE:  0.023339685052633286
Epoch  16 MSE:  0.017914501950144768
Epoch  17 MSE:  0.010528038255870342
Epoch  18 MSE:  0.020709635689854622
Epoch  19 MSE:  0.022142166271805763
Epoch  20 MSE:  0.00913378968834877
Epoch  21 MSE:  0.0032837125472724438
Epoch  22 MSE:  0.006005624774843454
Epoch  23 MSE:  0.009221922606229782
Epoch  24 MSE:  0.009343614801764488
Epoch  25 MSE:  0.007402882911264896
Epoch  26 MSE:  0.006014988292008638
Epoch  27 MSE:  0.006637887097895145
Epoch  28 MSE:  0.007978024892508984
Epoch  29 MSE:  0.007520338520407677
Epoch  30 MSE:  0.005026531405746937
Epoch  31 MSE:  0.002815255429595709
Epoch  32 MSE:  0.002627915469929576
Epoch  33 MSE:  0.0038487249985337257
Epoch  34 MSE:  0.004618681501597166
Epoch  35 MSE:  0.0039092665538191795
Epoch  36 MSE:  0.002484086435288191
Epoch  37 MSE:  0.0018542427569627762
Epoch  38 MSE:  0.002269477816298604
Epoch  39 MSE:  0.002432651352137327
Epoch  40 MSE:  0.0017461515963077545
Epoch  41 MSE:  0.001250344910658896
Epoch  42 MSE:  0.0016412724507972598
Epoch  43 MSE:  0.0022176974453032017
Epoch  44 MSE:  0.002139537362381816
Epoch  45 MSE:  0.0015903041930869222
Epoch  46 MSE:  0.0013555067125707865
Epoch  47 MSE:  0.0015729618025943637
Epoch  48 MSE:  0.0015560296596959233
Epoch  49 MSE:  0.0010814378038048744
Epoch  50 MSE:  0.0007583849364891648
Epoch  51 MSE:  0.0008783523226156831
Epoch  52 MSE:  0.0010345984483137727
Epoch  53 MSE:  0.0009140677284449339
Epoch  54 MSE:  0.0007315980619750917
Epoch  55 MSE:  0.0007604123093187809
Epoch  56 MSE:  0.0008461487013846636
Epoch  57 MSE:  0.0007311741355806589
Epoch  58 MSE:  0.0005497800884768367
Epoch  59 MSE:  0.0005577158881351352
Epoch  60 MSE:  0.0006879133288748562
Epoch  61 MSE:  0.0007144041010178626
Epoch  62 MSE:  0.0006207934347912669
Epoch  63 MSE:  0.0005688412929885089
Epoch  64 MSE:  0.0005930980551056564
Epoch  65 MSE:  0.0005647227517329156
Epoch  66 MSE:  0.00047004505177028477
Epoch  67 MSE:  0.0004422623314894736
Epoch  68 MSE:  0.0004969367873854935
Epoch  69 MSE:  0.0005126534379087389
Epoch  70 MSE:  0.00046223439858295023
Epoch  71 MSE:  0.00043931364780291915
Epoch  72 MSE:  0.0004609820316545665
Epoch  73 MSE:  0.00045188714284449816
Epoch  74 MSE:  0.0004201307019684464
Epoch  75 MSE:  0.00043058270239271224
Epoch  76 MSE:  0.0004576134087983519
Epoch  77 MSE:  0.0004446248640306294
Epoch  78 MSE:  0.0004175296053290367
Epoch  79 MSE:  0.00041787829832173884
Epoch  80 MSE:  0.00041931969462893903
Epoch  81 MSE:  0.0004008542455267161
Epoch  82 MSE:  0.0003951751277782023
Epoch  83 MSE:  0.0004096981429029256
Epoch  84 MSE:  0.00041126698488369584
Epoch  85 MSE:  0.0003979630710091442
Epoch  86 MSE:  0.0003955823485739529
Epoch  87 MSE:  0.0004000987682957202
Epoch  88 MSE:  0.0003947264631278813
Epoch  89 MSE:  0.00039027887396514416
Epoch  90 MSE:  0.00039583834586665034
Epoch  91 MSE:  0.00039670622209087014
Epoch  92 MSE:  0.0003883809840772301
Epoch  93 MSE:  0.00038467306876555085
Epoch  94 MSE:  0.00038626548484899104
Epoch  95 MSE:  0.00038355711149051785
Epoch  96 MSE:  0.0003808493784163147
Epoch  97 MSE:  0.00038343065534718335
Epoch  98 MSE:  0.0003837483818642795
Epoch  99 MSE:  0.00037956441519781947
Training time: 8.352406024932861
predict = pd.DataFrame(scaler.inverse_transform(y_train_pred.detach().numpy()))
original = pd.DataFrame(scaler.inverse_transform(y_train_lstm.detach().numpy()))
import seaborn as sns
sns.set_style("darkgrid")    

fig = plt.figure()
fig.subplots_adjust(hspace=0.2, wspace=0.2)

plt.subplot(1, 2, 1)
ax = sns.lineplot(x = original.index, y = original[0], label="Data", color='royalblue')
ax = sns.lineplot(x = predict.index, y = predict[0], label="Training Prediction (LSTM)", color='tomato')
ax.set_title('Stock price', size = 14, fontweight='bold')
ax.set_xlabel("Days", size = 14)
ax.set_ylabel("Cost (USD)", size = 14)
ax.set_xticklabels([])  # day indices are not meaningful dates, so hide them


plt.subplot(1, 2, 2)
ax = sns.lineplot(data=hist, color='royalblue')
ax.set_xlabel("Epoch", size = 14)
ax.set_ylabel("Loss", size = 14)
ax.set_title("Training Loss", size = 14, fontweight='bold')
fig.set_figheight(6)
fig.set_figwidth(16)

import math, time
from sklearn.metrics import mean_squared_error

# make predictions
y_test_pred = model(x_test)

# invert predictions
y_train_pred = scaler.inverse_transform(y_train_pred.detach().numpy())
y_train = scaler.inverse_transform(y_train_lstm.detach().numpy())
y_test_pred = scaler.inverse_transform(y_test_pred.detach().numpy())
y_test = scaler.inverse_transform(y_test_lstm.detach().numpy())

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(y_train[:,0], y_train_pred[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(y_test[:,0], y_test_pred[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
lstm.append(trainScore)
lstm.append(testScore)
lstm.append(training_time)
Train Score: 1.65 RMSE
Test Score: 5.56 RMSE
# shift train predictions for plotting
trainPredictPlot = np.empty_like(price)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[lookback:len(y_train_pred)+lookback, :] = y_train_pred

# shift test predictions for plotting
testPredictPlot = np.empty_like(price)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(y_train_pred)+lookback-1:len(price)-1, :] = y_test_pred

original = scaler.inverse_transform(price['Close'].values.reshape(-1,1))

predictions = np.append(trainPredictPlot, testPredictPlot, axis=1)
predictions = np.append(predictions, original, axis=1)
result = pd.DataFrame(predictions)
import plotly.express as px
import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Scatter(x=result.index, y=result[0],
                    mode='lines',
                    name='Train prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[1],
                    mode='lines',
                    name='Test prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[2],
                    mode='lines',
                    name='Actual Value'))
fig.update_layout(
    xaxis=dict(
        showline=True,
        showgrid=True,
        showticklabels=False,
        linecolor='white',
        linewidth=2
    ),
    yaxis=dict(
        title_text='Close (USD)',
        titlefont=dict(
            family='Rockwell',
            size=12,
            color='white',
        ),
        showline=True,
        showgrid=True,
        showticklabels=True,
        linecolor='white',
        linewidth=2,
        ticks='outside',
        tickfont=dict(
            family='Rockwell',
            size=12,
            color='white',
        ),
    ),
    showlegend=True,
    template = 'plotly_dark'

)



annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.05,
                              xanchor='left', yanchor='bottom',
                              text='Results (LSTM)',
                              font=dict(family='Rockwell',
                                        size=26,
                                        color='white'),
                              showarrow=False))
fig.update_layout(annotations=annotations)

fig.show()

The model fits the training set well and also performs solidly on the test set. Still, the gap between the train and test RMSE suggests some overfitting, and since the training loss is already minimal after roughly the 40th epoch, the remaining epochs add little.

GRU

class GRU(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(GRU, self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        # batch_first=True expects input of shape (batch, seq_len, features)
        self.gru = nn.GRU(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # fresh zero-initialized hidden state for each batch (a GRU has no cell state)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, hn = self.gru(x, h0)
        # use only the last time step's output to predict the next value
        out = self.fc(out[:, -1, :])
        return out
model = GRU(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
criterion = torch.nn.MSELoss(reduction='mean')
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)
hist = np.zeros(num_epochs)
start_time = time.time()
gru = []

for t in range(num_epochs):
    y_train_pred = model(x_train)

    loss = criterion(y_train_pred, y_train_gru)
    print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()

    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

training_time = time.time()-start_time    
print("Training time: {}".format(training_time))
Epoch  0 MSE:  0.3607046902179718
Epoch  1 MSE:  0.13102510571479797
Epoch  2 MSE:  0.19682788848876953
Epoch  3 MSE:  0.12917079031467438
Epoch  4 MSE:  0.075187087059021
Epoch  5 MSE:  0.07155207544565201
Epoch  6 MSE:  0.06979498267173767
Epoch  7 MSE:  0.045164529234170914
Epoch  8 MSE:  0.010227248072624207
Epoch  9 MSE:  0.007098929025232792
Epoch  10 MSE:  0.03696290776133537
Epoch  11 MSE:  0.023822534829378128
Epoch  12 MSE:  0.007158784195780754
Epoch  13 MSE:  0.011737179011106491
Epoch  14 MSE:  0.018340380862355232
Epoch  15 MSE:  0.01571972481906414
Epoch  16 MSE:  0.007737305015325546
Epoch  17 MSE:  0.0022043471690267324
Epoch  18 MSE:  0.003535511204972863
Epoch  19 MSE:  0.009062006138265133
Epoch  20 MSE:  0.011892840266227722
Epoch  21 MSE:  0.009278996847569942
Epoch  22 MSE:  0.004890399053692818
Epoch  23 MSE:  0.00290724472142756
Epoch  24 MSE:  0.0037367662880569696
Epoch  25 MSE:  0.005075107794255018
Epoch  26 MSE:  0.0048552630469202995
Epoch  27 MSE:  0.0029767893720418215
Epoch  28 MSE:  0.0011302019702270627
Epoch  29 MSE:  0.0010998218785971403
Epoch  30 MSE:  0.0027486730832606554
Epoch  31 MSE:  0.003973710350692272
Epoch  32 MSE:  0.0033208823297172785
Epoch  33 MSE:  0.001725937006995082
Epoch  34 MSE:  0.0009095442364923656
Epoch  35 MSE:  0.0012437441619113088
Epoch  36 MSE:  0.001808099914342165
Epoch  37 MSE:  0.001749989576637745
Epoch  38 MSE:  0.0011168664786964655
Epoch  39 MSE:  0.0006078396691009402
Epoch  40 MSE:  0.0007392823463305831
Epoch  41 MSE:  0.0012739860685542226
Epoch  42 MSE:  0.001527499407529831
Epoch  43 MSE:  0.0012058777501806617
Epoch  44 MSE:  0.0006985657382756472
Epoch  45 MSE:  0.0005090595805086195
Epoch  46 MSE:  0.0006709578447043896
Epoch  47 MSE:  0.0008276336011476815
Epoch  48 MSE:  0.0007188778254203498
Epoch  49 MSE:  0.00046144291991367936
Epoch  50 MSE:  0.00036167059442959726
Epoch  51 MSE:  0.0005161706358194351
Epoch  52 MSE:  0.0006932021933607757
Epoch  53 MSE:  0.0006540967733599246
Epoch  54 MSE:  0.00046293693594634533
Epoch  55 MSE:  0.0003537530137691647
Epoch  56 MSE:  0.0004015856538899243
Epoch  57 MSE:  0.0004694756935350597
Epoch  58 MSE:  0.0004315198748372495
Epoch  59 MSE:  0.00033121410524472594
Epoch  60 MSE:  0.0002968600601889193
Epoch  61 MSE:  0.00035973641206510365
Epoch  62 MSE:  0.00042053856304846704
Epoch  63 MSE:  0.00039580956217832863
Epoch  64 MSE:  0.00032569453469477594
Epoch  65 MSE:  0.000298373750410974
Epoch  66 MSE:  0.0003244410618208349
Epoch  67 MSE:  0.00034042992047034204
Epoch  68 MSE:  0.00031064936774782836
Epoch  69 MSE:  0.0002729821135289967
Epoch  70 MSE:  0.00027573812985792756
Epoch  71 MSE:  0.0003066273347940296
Epoch  72 MSE:  0.0003163278743159026
Epoch  73 MSE:  0.0002932495262939483
Epoch  74 MSE:  0.00027294218307361007
Epoch  75 MSE:  0.0002778717316687107
Epoch  76 MSE:  0.0002882281842175871
Epoch  77 MSE:  0.0002800409565679729
Epoch  78 MSE:  0.0002621126768644899
Epoch  79 MSE:  0.0002582546148914844
Epoch  80 MSE:  0.000269515992840752
Epoch  81 MSE:  0.0002754448796622455
Epoch  82 MSE:  0.0002673306444194168
Epoch  83 MSE:  0.0002585184993222356
Epoch  84 MSE:  0.0002598936844151467
Epoch  85 MSE:  0.0002639705198816955
Epoch  86 MSE:  0.00026010669535025954
Epoch  87 MSE:  0.0002519670524634421
Epoch  88 MSE:  0.0002499922120478004
Epoch  89 MSE:  0.0002542092406656593
Epoch  90 MSE:  0.0002556447288952768
Epoch  91 MSE:  0.0002515747328288853
Epoch  92 MSE:  0.0002483536081854254
Epoch  93 MSE:  0.0002494044601917267
Epoch  94 MSE:  0.0002502511197235435
Epoch  95 MSE:  0.00024729460710659623
Epoch  96 MSE:  0.00024373934138566256
Epoch  97 MSE:  0.00024349824525415897
Epoch  98 MSE:  0.000244816328631714
Epoch  99 MSE:  0.0002440418174955994
Training time: 8.153863906860352
predict = pd.DataFrame(scaler.inverse_transform(y_train_pred.detach().numpy()))
original = pd.DataFrame(scaler.inverse_transform(y_train_gru.detach().numpy()))
import seaborn as sns
sns.set_style("darkgrid")    

fig = plt.figure()
fig.subplots_adjust(hspace=0.2, wspace=0.2)

plt.subplot(1, 2, 1)
ax = sns.lineplot(x = original.index, y = original[0], label="Data", color='royalblue')
ax = sns.lineplot(x = predict.index, y = predict[0], label="Training Prediction (GRU)", color='tomato')
ax.set_title('Stock price', size = 14, fontweight='bold')
ax.set_xlabel("Days", size = 14)
ax.set_ylabel("Cost (USD)", size = 14)
ax.set_xticklabels([])  # day indices are not meaningful dates, so hide them


plt.subplot(1, 2, 2)
ax = sns.lineplot(data=hist, color='royalblue')
ax.set_xlabel("Epoch", size = 14)
ax.set_ylabel("Loss", size = 14)
ax.set_title("Training Loss", size = 14, fontweight='bold')
fig.set_figheight(6)
fig.set_figwidth(16)

import math, time
from sklearn.metrics import mean_squared_error

# make predictions
y_test_pred = model(x_test)

# invert predictions
y_train_pred = scaler.inverse_transform(y_train_pred.detach().numpy())
y_train = scaler.inverse_transform(y_train_gru.detach().numpy())
y_test_pred = scaler.inverse_transform(y_test_pred.detach().numpy())
y_test = scaler.inverse_transform(y_test_gru.detach().numpy())

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(y_train[:,0], y_train_pred[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(y_test[:,0], y_test_pred[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
gru.append(trainScore)
gru.append(testScore)
gru.append(training_time)
Train Score: 1.32 RMSE
Test Score: 4.90 RMSE
# shift train predictions for plotting
trainPredictPlot = np.empty_like(price)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[lookback:len(y_train_pred)+lookback, :] = y_train_pred

# shift test predictions for plotting
testPredictPlot = np.empty_like(price)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(y_train_pred)+lookback-1:len(price)-1, :] = y_test_pred

original = scaler.inverse_transform(price['Close'].values.reshape(-1,1))

predictions = np.append(trainPredictPlot, testPredictPlot, axis=1)
predictions = np.append(predictions, original, axis=1)
result = pd.DataFrame(predictions)
import plotly.express as px
import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Scatter(x=result.index, y=result[0],
                    mode='lines',
                    name='Train prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[1],
                    mode='lines',
                    name='Test prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[2],
                    mode='lines',
                    name='Actual Value'))
fig.update_layout(
    xaxis=dict(
        showline=True,
        showgrid=True,
        showticklabels=False,
        linecolor='white',
        linewidth=2
    ),
    yaxis=dict(
        title_text='Close (USD)',
        titlefont=dict(
            family='Rockwell',
            size=12,
            color='white',
        ),
        showline=True,
        showgrid=True,
        showticklabels=True,
        linecolor='white',
        linewidth=2,
        ticks='outside',
        tickfont=dict(
            family='Rockwell',
            size=12,
            color='white',
        ),
    ),
    showlegend=True,
    template = 'plotly_dark'

)



annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.05,
                              xanchor='left', yanchor='bottom',
                              text='Results (GRU)',
                              font=dict(family='Rockwell',
                                        size=26,
                                        color='white'),
                              showarrow=False))
fig.update_layout(annotations=annotations)

fig.show()
lstm = pd.DataFrame(lstm, columns=['LSTM'])
gru = pd.DataFrame(gru, columns=['GRU'])
result = pd.concat([lstm, gru], axis=1, join='inner')
result.index = ['Train RMSE', 'Test RMSE', 'Train Time']
result
            LSTM      GRU
Train RMSE  1.648017  1.321451
Test RMSE   5.562765  4.904847
Train Time  8.352406  8.153864

Conclusion

Both models demonstrate commendable performance during the training phase, but their progress seems to plateau around the 40th epoch, indicating that the predefined 100 epochs may not be necessary.
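If we wanted to act on that observation, a simple early-stopping rule would suffice. The following is a minimal sketch reusing the notebook’s variables; the patience and min_delta values are assumptions, and in practice the rule would monitor a held-out validation loss rather than the training loss.

# stop when the training loss has not improved by min_delta for `patience` epochs
patience, min_delta = 10, 1e-5
best_loss, stale_epochs = float('inf'), 0

for t in range(num_epochs):
    y_train_pred = model(x_train)
    loss = criterion(y_train_pred, y_train_gru)

    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

    if best_loss - loss.item() > min_delta:
        best_loss, stale_epochs = loss.item(), 0
    else:
        stale_epochs += 1
    if stale_epochs >= patience:
        print(f"Stopping early at epoch {t}")
        break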

In line with our expectations, the GRU network edged out the LSTM, achieving a lower root mean squared error (on the training set and, crucially, on the test set) along with a slightly shorter training time.