AI金融：LSTM預測股票

長短記憶神經網絡——通常稱作LSTM，是一種特殊的RNN，能夠學習長的依賴關系。他們由Hochreiter＆Schmidhuber引入，並被許多人進行了改進和普及。他們在各種各樣的問題上工作的非常好，現在被廣泛使用。

LSTM是為了避免長依賴問題而精心設計的。記住較長的歷史信息實際上是他們的默認行為，而不是他們努力學習的東西。

所有循環神經網絡都具有神經網絡的重復模塊鏈的形式。在標准的RNN中，該重復模塊將具有非常簡單的結構，例如單個tanh層。

標准RNN中的重復模塊的單層神經網絡

LSTM也擁有這種鏈狀結構，但是重復模塊則擁有不同的結構。與神經網絡的簡單的一層相比，LSTM擁有四層，這四層以特殊的方式進行交互。

LSTM中的重復模塊包含的四層交互神經網絡層

LSTM背后的核心理念

LSTM的關鍵是細胞狀態，表示細胞狀態的這條線水平的穿過圖的頂部。

細胞的狀態類似於輸送帶，細胞的狀態在整個鏈上運行，只有一些小的線性操作作用其上，信息很容易保持不變的流過整個鏈。

LSTM確實具有刪除或添加信息到細胞狀態的能力，這個能力是由被稱為門(Gate)的結構所賦予的。

門(Gate)是一種可選地讓信息通過的方式。它由一個Sigmoid神經網絡層和一個點乘法運算組成。

Sigmoid神經網絡層輸出0和1之間的數字，這個數字描述每個組件有多少信息可以通過， 0表示不通過任何信息，1表示全部通過

LSTM有三個門，用於保護和控制細胞的狀態。

一步步的拆解LSTM

LSTM的第一步是決定我們要從細胞狀態中丟棄什么信息。該決定由被稱為“忘記門”的Sigmoid層實現。它查看ht-1(前一個輸出)和xt(當前輸入)，並為單元格狀態Ct-1(上一個狀態)中的每個數字輸出0和1之間的數字。1代表完全保留，而0代表徹底刪除。

讓我們回到語言模型的例子，試圖根據以前的語料來預測下一個單詞。在這樣的問題中，細胞狀態可能包括當前主題的性別，從而決定使用正確的代詞。當我們看到一個新主題時，我們想要忘記舊主題的性別。

下一步是決定我們要在細胞狀態中存儲什么信息。這部分分為兩步。首先，稱為“輸入門層”的Sigmoid層決定了我們將更新哪些值。接下來一個tanh層創建候選向量Ct,該向量將會被加到細胞的狀態中。在下一步中，我們將結合這兩個向量來創建更新值。

在我們的語言模型的例子中，我們希望將新主題的性別添加到單元格狀態，以替換我們忘記的舊對象。

現在是時候去更新上一個狀態值Ct−1了，將其更新為Ct。簽名的步驟以及決定了應該做什么，我們只需實際執行即可。

我們將上一個狀態值乘以ft，以此表達期待忘記的部分。之后我們將得到的值加上 it∗C̃ t。這個得到的是新的候選值，按照我們決定更新每個狀態值的多少來衡量.

在語言模型的例子中，對應着實際刪除關於舊主題性別的信息，並添加新信息，正如在之前的步驟中描述的那樣。

最后，我們需要決定我們要輸出什么。此輸出將基於我們的細胞狀態，但將是一個過濾版本。首先，我們運行一個sigmoid層，它決定了我們要輸出的細胞狀態的哪些部分。然后，我們將單元格狀態通過tanh（將值規范化到-1和1之間），並將其乘以Sigmoid門的輸出，至此我們只輸出了我們決定的那些部分。

對於語言模型的例子，由於只看到一個主題，考慮到后面可能出現的詞，它可能需要輸出與動詞相關的信息。例如，它可能會輸出主題是單數還是復數，以便我們知道動詞應該如何組合在一起。

import math
import numpy as np
import pandas as pd

class DataLoader():
"""A class for loading and transforming data for the lstm model"""

def __init__(self, filename, split, cols):
dataframe = pd.read_csv(filename)
i_split = int(len(dataframe) * split)
self.data_train = dataframe.get(cols).values[:i_split]
self.data_test = dataframe.get(cols).values[i_split:]
self.len_train = len(self.data_train)
self.len_test = len(self.data_test)
self.len_train_windows = None

def get_test_data(self, seq_len, normalise):
'''
Create x, y test data windows
Warning: batch method, not generative, make sure you have enough memory to
load data, otherwise reduce size of the training split.
'''
data_windows = []
for i in range(self.len_test - seq_len):
data_windows.append(self.data_test[i:i+seq_len])

data_windows = np.array(data_windows).astype(float)
data_windows = self.normalise_windows(data_windows, single_window=False) if normalise else data_windows

x = data_windows[:, :-1]
y = data_windows[:, -1, [0]]
return x,y

def get_train_data(self, seq_len, normalise):
'''
Create x, y train data windows
Warning: batch method, not generative, make sure you have enough memory to
load data, otherwise use generate_training_window() method.
'''
data_x = []
data_y = []
for i in range(self.len_train - seq_len):
x, y = self._next_window(i, seq_len, normalise)
data_x.append(x)
data_y.append(y)
return np.array(data_x), np.array(data_y)

def generate_train_batch(self, seq_len, batch_size, normalise):
'''Yield a generator of training data from filename on given list of cols split for train/test'''
i = 0
while i < (self.len_train - seq_len):
x_batch = []
y_batch = []
for b in range(batch_size):
if i >= (self.len_train - seq_len):
# stop-condition for a smaller final batch if data doesn't divide evenly
yield np.array(x_batch), np.array(y_batch)
i = 0
x, y = self._next_window(i, seq_len, normalise)
x_batch.append(x)
y_batch.append(y)
i += 1
yield np.array(x_batch), np.array(y_batch)

def _next_window(self, i, seq_len, normalise):
'''Generates the next data window from the given index location i'''
window = self.data_train[i:i+seq_len]
window = self.normalise_windows(window, single_window=True)[0] if normalise else window
x = window[:-1]
y = window[-1, [0]]
return x, y

def normalise_windows(self, window_data, single_window=False):
'''Normalise window with a base value of zero'''
normalised_data = []
window_data = [window_data] if single_window else window_data
for window in window_data:
normalised_window = []
for col_i in range(window.shape[1]):
normalised_col = [((float(p) / float(window[0, col_i])) - 1) for p in window[:, col_i]]
normalised_window.append(normalised_col)
normalised_window = np.array(normalised_window).T # reshape and transpose array back into original multidimensional format
normalised_data.append(normalised_window)
return np.array(normalised_data)

import os
import math
import numpy as np
import datetime as dt
from numpy import newaxis
from core.utils import Timer
from keras.layers import Dense, Activation, Dropout, LSTM
from keras.models import Sequential, load_model
from keras.callbacks import EarlyStopping, ModelCheckpoint

class Model():
"""A class for an building and inferencing an lstm model"""

def __init__(self):
self.model = Sequential()

def load_model(self, filepath):
print('[Model] Loading model from file %s' % filepath)
self.model = load_model(filepath)

def build_model(self, configs):
timer = Timer()
timer.start()

for layer in configs['model']['layers']:
neurons = layer['neurons'] if 'neurons' in layer else None
dropout_rate = layer['rate'] if 'rate' in layer else None
activation = layer['activation'] if 'activation' in layer else None
return_seq = layer['return_seq'] if 'return_seq' in layer else None
input_timesteps = layer['input_timesteps'] if 'input_timesteps' in layer else None
input_dim = layer['input_dim'] if 'input_dim' in layer else None

if layer['type'] == 'dense':
self.model.add(Dense(neurons, activation=activation))
if layer['type'] == 'lstm':
self.model.add(LSTM(neurons, input_shape=(input_timesteps, input_dim), return_sequences=return_seq))
if layer['type'] == 'dropout':
self.model.add(Dropout(dropout_rate))

self.model.compile(loss=configs['model']['loss'], optimizer=configs['model']['optimizer'])

print('[Model] Model Compiled')
timer.stop()

def train(self, x, y, epochs, batch_size, save_dir):
timer = Timer()
timer.start()
print('[Model] Training Started')
print('[Model] %s epochs, %s batch size' % (epochs, batch_size))

save_fname = os.path.join(save_dir, '%s-e%s.h5' % (dt.datetime.now().strftime('%d%m%Y-%H%M%S'), str(epochs)))
callbacks = [
EarlyStopping(monitor='val_loss', patience=2),
ModelCheckpoint(filepath=save_fname, monitor='val_loss', save_best_only=True)
]
self.model.fit(
x,
y,
epochs=epochs,
batch_size=batch_size,
callbacks=callbacks
)
self.model.save(save_fname)

print('[Model] Training Completed. Model saved as %s' % save_fname)
timer.stop()

def train_generator(self, data_gen, epochs, batch_size, steps_per_epoch, save_dir):
timer = Timer()
timer.start()
print('[Model] Training Started')
print('[Model] %s epochs, %s batch size, %s batches per epoch' % (epochs, batch_size, steps_per_epoch))

save_fname = os.path.join(save_dir, '%s-e%s.h5' % (dt.datetime.now().strftime('%d%m%Y-%H%M%S'), str(epochs)))
callbacks = [
ModelCheckpoint(filepath=save_fname, monitor='loss', save_best_only=True)
]
self.model.fit_generator(
data_gen,
steps_per_epoch=steps_per_epoch,
epochs=epochs,
callbacks=callbacks,
workers=1
)

print('[Model] Training Completed. Model saved as %s' % save_fname)
timer.stop()

def predict_point_by_point(self, data):
#Predict each timestep given the last sequence of true data, in effect only predicting 1 step ahead each time
print('[Model] Predicting Point-by-Point...')
predicted = self.model.predict(data)
predicted = np.reshape(predicted, (predicted.size,))
return predicted

def predict_sequences_multiple(self, data, window_size, prediction_len):
#Predict sequence of 50 steps before shifting prediction run forward by 50 steps
print('[Model] Predicting Sequences Multiple...')
prediction_seqs = []
for i in range(int(len(data)/prediction_len)):
curr_frame = data[i*prediction_len]
predicted = []
for j in range(prediction_len):
predicted.append(self.model.predict(curr_frame[newaxis,:,:])[0,0])
curr_frame = curr_frame[1:]
curr_frame = np.insert(curr_frame, [window_size-2], predicted[-1], axis=0)
prediction_seqs.append(predicted)
return prediction_seqs

def predict_sequence_full(self, data, window_size):
#Shift the window by 1 new prediction each time, re-run predictions on new window
print('[Model] Predicting Sequences Full...')
curr_frame = data[0]
predicted = []
for i in range(len(data)):
predicted.append(self.model.predict(curr_frame[newaxis,:,:])[0,0])
curr_frame = curr_frame[1:]
curr_frame = np.insert(curr_frame, [window_size-2], predicted[-1], axis=0)
return predicted

import os
import json
import time
import math
import matplotlib.pyplot as plt
from core.data_processor import DataLoader
from core.model import Model

def plot_results(predicted_data, true_data):
fig = plt.figure(facecolor='white')
ax = fig.add_subplot(111)
ax.plot(true_data, label='True Data')
plt.plot(predicted_data, label='Prediction')
plt.legend()
plt.show()

def plot_results_multiple(predicted_data, true_data, prediction_len):
fig = plt.figure(facecolor='white')
ax = fig.add_subplot(111)
ax.plot(true_data, label='True Data')
# Pad the list of predictions to shift it in the graph to it's correct start
for i, data in enumerate(predicted_data):
padding = [None for p in range(i * prediction_len)]
plt.plot(padding + data, label='Prediction')
plt.legend()
plt.show()

def main():
configs = json.load(open('config.json', 'r'))
if not os.path.exists(configs['model']['save_dir']): os.makedirs(configs['model']['save_dir'])

data = DataLoader(
os.path.join('data', configs['data']['filename']),
configs['data']['train_test_split'],
configs['data']['columns']
)

model = Model()
model.build_model(configs)
x, y = data.get_train_data(
seq_len=configs['data']['sequence_length'],
normalise=configs['data']['normalise']
)

'''
# in-memory training
model.train(
x,
y,
epochs = configs['training']['epochs'],
batch_size = configs['training']['batch_size'],
save_dir = configs['model']['save_dir']
)
'''
# out-of memory generative training
steps_per_epoch = math.ceil((data.len_train - configs['data']['sequence_length']) / configs['training']['batch_size'])
model.train_generator(
data_gen=data.generate_train_batch(
seq_len=configs['data']['sequence_length'],
batch_size=configs['training']['batch_size'],
normalise=configs['data']['normalise']
),
epochs=configs['training']['epochs'],
batch_size=configs['training']['batch_size'],
steps_per_epoch=steps_per_epoch,
save_dir=configs['model']['save_dir']
)

x_test, y_test = data.get_test_data(
seq_len=configs['data']['sequence_length'],
normalise=configs['data']['normalise']
)

predictions = model.predict_sequences_multiple(x_test, configs['data']['sequence_length'], configs['data']['sequence_length'])
# predictions = model.predict_sequence_full(x_test, configs['data']['sequence_length'])
# predictions = model.predict_point_by_point(x_test)

plot_results_multiple(predictions, y_test, configs['data']['sequence_length'])
# plot_results(predictions, y_test)

if __name__ == '__main__':
main()