LSTM Java Implementation

This article is a repost. Dated 2016-11-11 16:56; tag: lstm

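Because of work in my lab, I needed to port a neural network written in Python to Java. I did not know which Java tools correspond to Python packages such as numpy, so I looked for a ready-made Java neural network framework instead and found JOONE. JOONE is an open-source neural network framework that trains its parameters iteratively with the backpropagation (BP) algorithm. It is convenient and practical to use, so this post walks through the basics of using it.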

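JOONE depends on some external packages. They are available from the official website, and the original post also linked a download. Once the required packages have been added to the project, you can start coding.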

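Let's look at the complete program first. It comes from the link above and appears to be an official example, since many articles use it. It trains a neural network to act as an XOR calculator: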


import org.joone.engine.*;
import org.joone.engine.learning.*;
import org.joone.io.*;
import org.joone.net.*;

/*
 * JOONE implementation
 */
public class XOR_using_NeuralNet implements NeuralNetListener
{
    private NeuralNet nnet = null;
    private MemoryInputSynapse inputSynapse, desiredOutputSynapse;
    LinearLayer input;
    SigmoidLayer hidden, output;
    boolean singleThreadMode = true;

    // XOR input
    private double[][] inputArray = new double[][]
    {
        { 0.0, 0.0 },
        { 0.0, 1.0 },
        { 1.0, 0.0 },
        { 1.0, 1.0 } };

    // XOR desired output
    private double[][] desiredOutputArray = new double[][]
    {
        { 0.0 },
        { 1.0 },
        { 1.0 },
        { 0.0 } };

    /**
     * @param args
     *            the command line arguments
     */
    public static void main(String args[])
    {
        XOR_using_NeuralNet xor = new XOR_using_NeuralNet();
        xor.initNeuralNet();
        xor.train();
        xor.interrogate();
    }
    /**
     * Trains the network on the four XOR patterns.
     */
    public void train()
    {
        // set the inputs
        inputSynapse.setInputArray(inputArray);
        inputSynapse.setAdvancedColumnSelector(" 1,2 ");
        // set the desired outputs
        desiredOutputSynapse.setInputArray(desiredOutputArray);
        desiredOutputSynapse.setAdvancedColumnSelector(" 1 ");
        // get the monitor object to train or feed forward
        Monitor monitor = nnet.getMonitor();
        // set the monitor parameters
        monitor.setLearningRate(0.8);
        monitor.setMomentum(0.3);
        monitor.setTrainingPatterns(inputArray.length);
        monitor.setTotCicles(5000);
        monitor.setLearning(true);
        long initms = System.currentTimeMillis();
        // Run the network in single-thread, synchronized mode
        nnet.getMonitor().setSingleThreadMode(singleThreadMode);
        nnet.go(true);
        System.out.println(" Total time= "
                + (System.currentTimeMillis() - initms) + " ms ");
    }
    /**
     * Feeds a test pattern through the trained network and prints the output.
     */
    private void interrogate()
    {
        double[][] inputArray = new double[][]
        {
            { 1.0, 1.0 } };
        // set the inputs
        inputSynapse.setInputArray(inputArray);
        inputSynapse.setAdvancedColumnSelector(" 1,2 ");
        Monitor monitor = nnet.getMonitor();
        monitor.setTrainingPatterns(4);
        monitor.setTotCicles(1);
        monitor.setLearning(false);
        MemoryOutputSynapse memOut = new MemoryOutputSynapse();
        // set the output synapse to write the output of the net
        if (nnet != null)
        {
            nnet.addOutputSynapse(memOut);
            System.out.println(nnet.check());
            nnet.getMonitor().setSingleThreadMode(singleThreadMode);
            nnet.go();
            for (int i = 0; i < 4; i++)
            {
                double[] pattern = memOut.getNextPattern();
                System.out.println(" Output pattern # " + (i + 1) + " = "
                        + pattern[0]);
            }
            System.out.println(" Interrogating Finished ");
        }
    }
    /**
     * Builds the 2-3-1 network: layers, synapses, input source, and teacher.
     */
    protected void initNeuralNet()
    {
        // First create the three layers
        input = new LinearLayer();
        hidden = new SigmoidLayer();
        output = new SigmoidLayer();
        // set the dimensions of the layers
        input.setRows(2);
        hidden.setRows(3);
        output.setRows(1);
        input.setLayerName(" L.input ");
        hidden.setLayerName(" L.hidden ");
        output.setLayerName(" L.output ");
        // Now create the two Synapses
        FullSynapse synapse_IH = new FullSynapse(); /* input -> hidden conn. */
        FullSynapse synapse_HO = new FullSynapse(); /* hidden -> output conn. */
        // Connect the input layer with the hidden layer
        input.addOutputSynapse(synapse_IH);
        hidden.addInputSynapse(synapse_IH);
        // Connect the hidden layer with the output layer
        hidden.addOutputSynapse(synapse_HO);
        output.addInputSynapse(synapse_HO);
        // the input to the neural net
        inputSynapse = new MemoryInputSynapse();
        input.addInputSynapse(inputSynapse);
        // The Trainer and its desired output
        desiredOutputSynapse = new MemoryInputSynapse();
        TeachingSynapse trainer = new TeachingSynapse();
        trainer.setDesired(desiredOutputSynapse);
        // Now we add this structure to a NeuralNet object
        nnet = new NeuralNet();
        nnet.addLayer(input, NeuralNet.INPUT_LAYER);
        nnet.addLayer(hidden, NeuralNet.HIDDEN_LAYER);
        nnet.addLayer(output, NeuralNet.OUTPUT_LAYER);
        nnet.setTeacher(trainer);
        output.addOutputSynapse(trainer);
        nnet.addNeuralNetListener(this);
    }
    public void cicleTerminated(NeuralNetEvent e)
    {
    }

    public void errorChanged(NeuralNetEvent e)
    {
        Monitor mon = (Monitor) e.getSource();
        if (mon.getCurrentCicle() % 100 == 0)
            System.out.println(" Epoch: "
                    + (mon.getTotCicles() - mon.getCurrentCicle()) + " RMSE: "
                    + mon.getGlobalError());
    }

    public void netStarted(NeuralNetEvent e)
    {
        Monitor mon = (Monitor) e.getSource();
        System.out.print(" Network started for ");
        if (mon.isLearning())
            System.out.println(" training. ");
        else
            System.out.println(" interrogation. ");
    }

    public void netStopped(NeuralNetEvent e)
    {
        Monitor mon = (Monitor) e.getSource();
        System.out.println(" Network stopped. Last RMSE= "
                + mon.getGlobalError());
    }

    public void netStoppedError(NeuralNetEvent e, String error)
    {
        System.out.println(" Network stopped due the following error: "
                + error);
    }
}

I will now explain the program above step by step.

【1】Starting from the main method, the first step is to create a new object:


XOR_using_NeuralNet xor = new XOR_using_NeuralNet();

【2】Then initialize the neural network:


xor.initNeuralNet();

Inside the method that initializes the neural network:


// First create the three layers
input = new LinearLayer();
hidden = new SigmoidLayer();
output = new SigmoidLayer();
// set the dimensions of the layers
input.setRows(2);
hidden.setRows(3);
output.setRows(1);
input.setLayerName(" L.input ");
hidden.setLayerName(" L.hidden ");
output.setLayerName(" L.output ");

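Explanation of the code above:

input = new LinearLayer() creates the input layer. The input layer of a neural network has no trainable parameters, so a linear layer is used.

hidden = new SigmoidLayer() creates the hidden layer, using the sigmoid function as its activation function. You can of course choose another activation function, such as softmax.

output = new SigmoidLayer() creates the output layer in the same way.

The next three lines set the number of neurons in the input, hidden, and output layers: here the input layer has 2 neurons, the hidden layer has 3, and the output layer has 1.

The last three lines give each layer a name.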


// Now create the two Synapses
FullSynapse synapse_IH = new FullSynapse(); /* input -> hidden conn. */
FullSynapse synapse_HO = new FullSynapse(); /* hidden -> output conn. */
// Connect the input layer with the hidden layer
input.addOutputSynapse(synapse_IH);
hidden.addInputSynapse(synapse_IH);
// Connect the hidden layer with the output layer
hidden.addOutputSynapse(synapse_HO);
output.addInputSynapse(synapse_HO);

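Explanation of the code above:

This code connects the three layers: synapse_IH connects the input layer to the hidden layer, and synapse_HO connects the hidden layer to the output layer. A FullSynapse links every neuron of one layer to every neuron of the next and holds the weights that training adjusts.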
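To make the role of the two full connections concrete, here is a minimal plain-Java sketch of a forward pass through a 2-3-1 network. It does not use JOONE: the class name `ForwardPassSketch`, the weight values, and the bias arrays are all made up for illustration, and where exactly JOONE stores its biases is not something this sketch tries to mirror.

```java
import java.util.Arrays;

/** Plain-Java sketch of a 2-3-1 forward pass; illustrative only, not JOONE code. */
final class ForwardPassSketch {

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /**
     * One fully connected step followed by a sigmoid layer:
     * out[j] = sigmoid(bias[j] + sum over i of in[i] * weights[i][j]).
     * weights[i][j] links neuron i of the previous layer to neuron j of the next.
     */
    static double[] fullyConnected(double[] in, double[][] weights, double[] bias) {
        double[] out = new double[bias.length];
        for (int j = 0; j < out.length; j++) {
            double sum = bias[j];
            for (int i = 0; i < in.length; i++) {
                sum += in[i] * weights[i][j];
            }
            out[j] = sigmoid(sum);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical hand-picked weights; a trained network learns these values.
        double[][] wInputHidden = { { 0.5, -0.4, 0.3 }, { -0.2, 0.6, 0.1 } }; // 2 x 3
        double[] bHidden = { 0.0, 0.1, -0.1 };
        double[][] wHiddenOutput = { { 0.7 }, { -0.5 }, { 0.2 } };            // 3 x 1
        double[] bOutput = { 0.05 };

        double[] x = { 1.0, 0.0 }; // the linear input layer passes values through unchanged
        double[] h = fullyConnected(x, wInputHidden, bHidden);
        double[] y = fullyConnected(h, wHiddenOutput, bOutput);
        System.out.println("hidden = " + Arrays.toString(h));
        System.out.println("output = " + Arrays.toString(y));
    }
}
```

The 2 x 3 and 3 x 1 weight matrices play the part of synapse_IH and synapse_HO: their shapes follow directly from the setRows calls on the layers they connect.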


// the input to the neural net
inputSynapse = new MemoryInputSynapse();
input.addInputSynapse(inputSynapse);
// The Trainer and its desired output
desiredOutputSynapse = new MemoryInputSynapse();
TeachingSynapse trainer = new TeachingSynapse();
trainer.setDesired(desiredOutputSynapse);

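Explanation of the code above:

This code specifies where the input-layer data and the desired output data come from during training.

inputSynapse = new MemoryInputSynapse() means the input layer reads its data from memory. Data can also be read from a file, which is covered later in this article. Likewise, desiredOutputSynapse = new MemoryInputSynapse() also reads from memory; it supplies the values the output layer is expected to produce, which the TeachingSynapse compares with the network's actual output.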


// Now we add this structure to a NeuralNet object
nnet = new NeuralNet();
nnet.addLayer(input, NeuralNet.INPUT_LAYER);
nnet.addLayer(hidden, NeuralNet.HIDDEN_LAYER);
nnet.addLayer(output, NeuralNet.OUTPUT_LAYER);
nnet.setTeacher(trainer);
output.addOutputSynapse(trainer);
nnet.addNeuralNetListener(this);

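Explanation of the code above:

This code assembles the components initialized earlier into a neural network. NeuralNet is a class provided by JOONE whose main job is to tie the layers together. The final call, nnet.addNeuralNetListener(this), registers a listener for the training process. This class implements the NeuralNetListener interface, whose methods let you observe training as it runs, which helps when tuning parameters.

【3】Next, let's look at the train method: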


inputSynapse.setInputArray(inputArray);
inputSynapse.setAdvancedColumnSelector(" 1,2 ");
// set the desired outputs
desiredOutputSynapse.setInputArray(desiredOutputArray);
desiredOutputSynapse.setAdvancedColumnSelector(" 1 ");

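Explanation of the code above:

inputSynapse.setInputArray(inputArray) sets the data for the input layer. inputArray is the two-dimensional array defined in the program, which is why MemoryInputSynapse was used when initializing the network: the data is read from memory.

inputSynapse.setAdvancedColumnSelector(" 1,2 ") means the input layer uses the first two columns of inputArray.

desiredOutputSynapse works the same way, using column 1 of desiredOutputArray.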


Monitor monitor = nnet.getMonitor();
// set the monitor parameters
monitor.setLearningRate(0.8);
monitor.setMomentum(0.3);
monitor.setTrainingPatterns(inputArray.length);
monitor.setTotCicles(5000);
monitor.setLearning(true);

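Explanation of the code above:

The Monitor class is also provided by JOONE and is mainly used to set the network's training parameters. monitor.setLearningRate(0.8) sets the step size (learning rate) of training. A larger step makes gradient descent move faster, although a step that is too large can overshoot and fail to converge. monitor.setMomentum(0.3) sets the momentum, the fraction of the previous weight update that is carried into the current one. monitor.setTrainingPatterns(inputArray.length) sets the number of training patterns fed to the input layer, here the length of the array. monitor.setTotCicles(5000) sets the number of training iterations. monitor.setLearning(true) indicates that the network is in training mode.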
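The effect of the learning rate and momentum can be seen on a toy problem. The sketch below is generic gradient descent with momentum on f(w) = (w - 3)^2, written in plain Java; it is not how JOONE is implemented internally, and the class name `MomentumSketch` and the second learning rate (1.5) are chosen only for illustration.

```java
/** Generic gradient descent with momentum on f(w) = (w - 3)^2; not JOONE internals. */
final class MomentumSketch {

    /** Runs the given number of updates starting from w = 0 and returns the final w. */
    static double descend(double learningRate, double momentum, int steps) {
        double w = 0.0;
        double previousDelta = 0.0;
        for (int t = 0; t < steps; t++) {
            double gradient = 2.0 * (w - 3.0); // derivative of (w - 3)^2
            // Step against the gradient, plus a fraction of the previous step.
            double delta = -learningRate * gradient + momentum * previousDelta;
            w += delta;
            previousDelta = delta;
        }
        return w;
    }

    public static void main(String[] args) {
        // Same learning rate and momentum as the XOR example above.
        System.out.println("lr=0.8, momentum=0.3: w = " + descend(0.8, 0.3, 100));
        // A deliberately oversized learning rate for comparison.
        System.out.println("lr=1.5, momentum=0.3: w = " + descend(1.5, 0.3, 100));
    }
}
```

On this function the first run settles at the minimum w = 3, while the second overshoots further on every step and moves away from it. Real networks have a far more complicated error surface, so treat the numbers as a qualitative illustration only.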


nnet.getMonitor().setSingleThreadMode(singleThreadMode);
nnet.go(true);

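Explanation of the code above:

nnet.getMonitor().setSingleThreadMode(singleThreadMode) controls whether the network runs single-threaded or multi-threaded. I am not entirely sure what multi-threading means here; as far as I can tell, in JOONE's multi-threaded mode each layer runs in its own thread, whereas single-thread mode runs the whole network in one thread.

nnet.go(true) starts training. Judging by the comment in the sample ("synchronized mode"), the true argument makes the call block until training has finished.

【4】Finally, let's look at the interrogate method: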


double[][] inputArray = new double[][]
{
    { 1.0, 1.0 } };
// set the inputs
inputSynapse.setInputArray(inputArray);
inputSynapse.setAdvancedColumnSelector(" 1,2 ");
Monitor monitor = nnet.getMonitor();
monitor.setTrainingPatterns(4);
monitor.setTotCicles(1);
monitor.setLearning(false);
MemoryOutputSynapse memOut = new MemoryOutputSynapse();
// set the output synapse to write the output of the net
if (nnet != null)
{
    nnet.addOutputSynapse(memOut);
    System.out.println(nnet.check());
    nnet.getMonitor().setSingleThreadMode(singleThreadMode);
    nnet.go();
    for (int i = 0; i < 4; i++)
    {
        double[] pattern = memOut.getNextPattern();
        System.out.println(" Output pattern # " + (i + 1) + " = "
                + pattern[0]);
    }
    System.out.println(" Interrogating Finished ");
}

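This method acts as the test method, and the inputArray here is the test data. Note that monitor.setLearning(false) must be set, because this is not a training run and no learning is needed. monitor.setTrainingPatterns(4) sets the number of test patterns; 4 means four test patterns, although only one is supplied here. An output synapse is also added to nnet: the memOut object collects the test results. When we initialized the network earlier we did not attach such an object to the output layer, because at that point we were training and had set trainer as the destination of the output.

Next comes printing the results. The length of pattern equals the number of neurons in the output layer; the output layer has 1 neuron here, so pattern has length 1.

【5】Let's look at the test result: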


Output pattern # 1 = 0.018303527517809233

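The output is about 0.018. A sigmoid output lies between 0 and 1, so rounding it to the nearer of the two classes gives 0, which matches the expected XOR result for the input (1, 1). If the output layer has more than one neuron, there will be several output values. Because the target outputs are discrete 0|1 values, we set the neuron with the largest output to 1 and the others to 0.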
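The two decoding rules just described can be sketched in a few lines of plain Java. The class and method names (`OutputDecodingSketch`, `threshold`, `oneHot`) are invented for this example, and the 0.5 cut-off is the usual convention for a sigmoid output rather than something the original program sets.

```java
import java.util.Arrays;

/** Turns raw sigmoid outputs into discrete 0/1 decisions; illustrative only. */
final class OutputDecodingSketch {

    /** Single output neuron: round to the nearer class, using 0.5 as the cut-off. */
    static int threshold(double output) {
        return output >= 0.5 ? 1 : 0;
    }

    /**
     * Several output neurons: 1 for the largest value and 0 elsewhere.
     * Assumes at least one output; the first maximum wins on ties.
     */
    static int[] oneHot(double[] outputs) {
        int best = 0;
        for (int i = 1; i < outputs.length; i++) {
            if (outputs[i] > outputs[best]) {
                best = i;
            }
        }
        int[] result = new int[outputs.length];
        result[best] = 1;
        return result;
    }

    public static void main(String[] args) {
        // The value printed by the interrogate run above.
        System.out.println(threshold(0.018303527517809233)); // prints 0
        // Hypothetical outputs of a three-neuron output layer.
        System.out.println(Arrays.toString(oneHot(new double[] { 0.1, 0.7, 0.2 }))); // prints [0, 1, 0]
    }
}
```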

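【6】Finally, let's look at the listener methods that are called while the network is training:

cicleTerminated: called at the end of each cycle; print per-cycle information here

errorChanged: called when the network's error changes

netStarted: called when the network starts running

netStopped: called when the network stops

netStoppedError: called when the network stops because of an error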
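The errorChanged and netStopped handlers in the sample print an RMSE value taken from mon.getGlobalError(). As a reminder of what a root mean squared error measures, here is a small plain-Java sketch. It is a textbook formula, and JOONE's exact normalization of its global error may differ; the class name `RmseSketch` and the "actual" values are hypothetical.

```java
/** Textbook root mean squared error; JOONE's own error computation may differ in detail. */
final class RmseSketch {

    /** RMSE between desired and actual outputs; both arrays must have the same non-zero length. */
    static double rmse(double[] desired, double[] actual) {
        if (desired.length == 0 || desired.length != actual.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < desired.length; i++) {
            double diff = desired[i] - actual[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / desired.length);
    }

    public static void main(String[] args) {
        double[] desired = { 0.0, 1.0, 1.0, 0.0 };  // XOR targets from the sample
        double[] actual = { 0.1, 0.9, 0.8, 0.2 };   // hypothetical network outputs
        System.out.println("RMSE = " + rmse(desired, actual));
    }
}
```

A value that shrinks toward 0 over the epochs means the network's outputs are moving closer to the desired ones.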

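【7】That covers the basics of JOONE. Two additional topics are worth explaining:

1. Reading data from a file to build the neural network.

2. Saving a trained neural network to a file, so that at test time it can simply be loaded into memory instead of being retrained every time.

【8】First topic:

Reading data from a file:

File format: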

0;0;0

1;0;1

1;1;0

0;1;1

Fields are separated by semicolons. To use the file, replace the MemoryInputSynapse from the earlier code with FileInputSynapse:


fileInputSynapse = new FileInputSynapse();
input.addInputSynapse(fileInputSynapse);
fileDisireOutputSynapse = new FileInputSynapse();
TeachingSynapse trainer = new TeachingSynapse();
trainer.setDesired(fileDisireOutputSynapse);

Here is how the file is configured as the data source:


private File inputFile = new File(Constants.TRAIN_WORD_VEC_PATH);
fileInputSynapse.setInputFile(inputFile);
fileInputSynapse.setFirstCol(2); // use columns 2 to 3 of the file as the input to the input layer
fileInputSynapse.setLastCol(3);


fileDisireOutputSynapse.setInputFile(inputFile);
fileDisireOutputSynapse.setFirstCol(1); // use column 1 of the file as the desired output
fileDisireOutputSynapse.setLastCol(1);

The rest of the code is the same as above.
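To see what the first/last column settings select from each line of the file, here is a plain-Java sketch that parses the same four lines by hand. It does not use JOONE's FileInputSynapse; the class `ColumnSelectSketch` and its `selectColumns` helper are hypothetical and only mimic the 1-based, inclusive column range.

```java
import java.util.Arrays;
import java.util.List;

/** Hand-rolled column selection over semicolon-separated lines; illustrative only. */
final class ColumnSelectSketch {

    /** Returns columns firstCol..lastCol (1-based, inclusive) of every line as doubles. */
    static double[][] selectColumns(List<String> lines, int firstCol, int lastCol) {
        double[][] rows = new double[lines.size()][];
        for (int r = 0; r < lines.size(); r++) {
            String[] fields = lines.get(r).trim().split(";");
            double[] row = new double[lastCol - firstCol + 1];
            for (int c = firstCol; c <= lastCol; c++) {
                row[c - firstCol] = Double.parseDouble(fields[c - 1].trim());
            }
            rows[r] = row;
        }
        return rows;
    }

    public static void main(String[] args) {
        // The file contents shown above, one string per line.
        List<String> lines = Arrays.asList("0;0;0", "1;0;1", "1;1;0", "0;1;1");
        double[][] inputs = selectColumns(lines, 2, 3);  // columns 2-3: network inputs
        double[][] desired = selectColumns(lines, 1, 1); // column 1: desired output
        for (int i = 0; i < lines.size(); i++) {
            System.out.println(Arrays.toString(inputs[i]) + " -> " + Arrays.toString(desired[i]));
        }
    }
}
```

Read this way, column 1 of every line is the XOR of columns 2 and 3, so the file encodes the same truth table as the in-memory arrays.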

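【9】Second topic:

How to save the neural network

This is simple: serialize the nnet object directly, and later read it back with Java deserialization. I will not go into detail, since it is fairly straightforward. Note, however, that the network must be saved only after training has finished. The following code can be used: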


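public void netStopped(NeuralNetEvent e) {
    Monitor mon = (Monitor) e.getSource();
    try {
        // Only save when the run that just stopped was a training run
        if (mon.isLearning()) {
            saveModel(nnet); // serialize the object (saveModel is a helper not shown here)
        }
    } catch (IOException ee) {
        // TODO Auto-generated catch block
        ee.printStackTrace();
    }
}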
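The saveModel helper called above is not shown in the original post. A minimal sketch of what such a helper and its loading counterpart might look like with standard Java serialization follows. The class name `ModelStoreSketch`, the method names, and the temp-file round trip are assumptions made for illustration, and a plain double[][] stands in for the nnet object so that the example runs without JOONE.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical save/load helpers based on standard Java serialization. */
final class ModelStoreSketch {

    /** Writes any Serializable object (for example a trained network) to a file. */
    static void save(Serializable model, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(model);
        }
    }

    /** Reads the object back; the caller casts it to the expected type. */
    static Object load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        }
    }

    /** Saves to a temporary file and loads it again, wrapping checked exceptions. */
    static Object roundTrip(Serializable model) {
        try {
            Path tmp = Files.createTempFile("model", ".ser");
            try {
                save(model, tmp);
                return load(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a trained network: any Serializable object works the same way.
        double[][] weights = { { 0.5, -0.4 }, { 0.3, 0.1 } };
        double[][] restored = (double[][]) roundTrip(weights);
        System.out.println("restored[0][1] = " + restored[0][1]);
    }
}
```

With JOONE on the classpath, the same save call would be given the nnet object, and the result of load would be cast back to NeuralNet before interrogating it.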
