Implementing the BP Neural Network Algorithm in Java


At work I needed to predict how long a process would take, so I decided to use a BP neural network for the prediction.

Introduction

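A BP (Back Propagation) neural network is an artificial neural network trained with the BP algorithm, which adjusts the network's weights and thresholds (biases). In the 1980s several researchers independently developed back propagation for training multilayer perceptrons; the most influential formulation is the one by David Rumelhart, Geoffrey Hinton, and Ronald Williams, published in 1986 and included in the Parallel Distributed Processing volumes edited by Rumelhart and James McClelland. The algorithm has two main phases: forward propagation of the working signal, which computes the network's output, and backward propagation of the error signal, which updates the weights and thresholds. Forward propagation yields the network's actual output; backward propagation corrects the weights and thresholds layer by layer, from back to front, so that the actual output moves closer to the expected output.

(1) Forward propagation of the working signal. The input signal enters at the input layer, passes through the synapses (weighted connections) into the hidden-layer neurons, is transformed by the transfer function, and is passed on to the output layer, where the output signal is computed. During forward propagation the weights and thresholds stay fixed, and the state of each layer depends only on the net output of the previous layer and on the weights and thresholds. If the output layer produces the expected output, learning ends and the current weights and thresholds are kept; otherwise they are corrected during backward propagation of the error signal.

(2) Backward propagation of the error signal. If forward propagation does not produce the expected output, the difference between the network's actual output and the expected output is taken as the error signal and propagated from the output layer back toward the input layer, layer by layer. Each time the error moves back one layer, that layer's weights and thresholds are adjusted. This continues until the input layer is reached, so that the network's result moves closer to the expected result.

If the error still does not meet the requirement after one forward and one backward pass, the process repeats until the error reaches the required precision or another stopping condition, such as a maximum number of iterations, is met.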

For the derivation, see https://zh.wikipedia.org/wiki/%E5%8F%8D%E5%90%91%E4%BC%A0%E6%92%AD%E7%AE%97%E6%B3%95
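
The two phases described above can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not the implementation presented in this article: the class name `TinyBp`, its fields, the per-sample update, the squared-error measure, and the use of a sigmoid on both layers are all choices made for the example, and it leaves out normalization and the momentum term.

```java
import java.util.Random;

/** Hypothetical one-hidden-layer network; illustrates forward and backward propagation only. */
final class TinyBp {
    final double[][] w1; // hidden x input weights
    final double[] b1;   // hidden thresholds
    final double[][] w2; // output x hidden weights
    final double[] b2;   // output thresholds
    double[] hidden;     // hidden activations from the last forward pass
    double[] output;     // outputs from the last forward pass

    TinyBp(int in, int hid, int out, long seed) {
        Random rnd = new Random(seed);
        w1 = new double[hid][in];
        b1 = new double[hid];
        w2 = new double[out][hid];
        b2 = new double[out];
        // small random initial weights in [-0.5, 0.5)
        for (double[] row : w1) for (int i = 0; i < in; i++) row[i] = rnd.nextDouble() - 0.5;
        for (double[] row : w2) for (int j = 0; j < hid; j++) row[j] = rnd.nextDouble() - 0.5;
    }

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Computes one layer: sigmoid(w * x + b). */
    static double[] layer(double[][] w, double[] b, double[] x) {
        double[] y = new double[w.length];
        for (int r = 0; r < w.length; r++) {
            double sum = b[r];
            for (int c = 0; c < x.length; c++) sum += w[r][c] * x[c];
            y[r] = sigmoid(sum);
        }
        return y;
    }

    /** Forward propagation: weights and thresholds stay fixed. */
    double[] forward(double[] x) {
        hidden = layer(w1, b1, x);
        output = layer(w2, b2, hidden);
        return output;
    }

    /** One forward and one backward pass for a single sample; returns the error before the update. */
    double trainStep(double[] x, double[] target, double step) {
        forward(x);
        double error = 0;
        double[] deltaOut = new double[output.length];
        for (int p = 0; p < output.length; p++) {
            double diff = output[p] - target[p];
            error += 0.5 * diff * diff;
            deltaOut[p] = diff * output[p] * (1 - output[p]);
        }
        // propagate the error signal back to the hidden layer (uses the not-yet-updated w2)
        double[] deltaHid = new double[hidden.length];
        for (int j = 0; j < hidden.length; j++) {
            double sum = 0;
            for (int p = 0; p < output.length; p++) sum += deltaOut[p] * w2[p][j];
            deltaHid[j] = sum * hidden[j] * (1 - hidden[j]);
        }
        // gradient-descent update of weights and thresholds, output layer first
        for (int p = 0; p < output.length; p++) {
            for (int j = 0; j < hidden.length; j++) w2[p][j] -= step * deltaOut[p] * hidden[j];
            b2[p] -= step * deltaOut[p];
        }
        for (int j = 0; j < hidden.length; j++) {
            for (int i = 0; i < x.length; i++) w1[j][i] -= step * deltaHid[j] * x[i];
            b1[j] -= step * deltaHid[j];
        }
        return error;
    }

    public static void main(String[] args) {
        TinyBp net = new TinyBp(3, 3, 1, 42L);
        double[] x = {0.2, 0.5, 0.8};
        double[] target = {0.9};
        for (int k = 0; k <= 5000; k++) {
            double e = net.trainStep(x, target, 0.5);
            if (k % 1000 == 0) System.out.println("iteration " + k + " error " + e);
        }
    }
}
```

Repeating `trainStep` until the returned error falls below a chosen precision, or until an iteration limit is hit, corresponds to the stopping rule described above.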

BPNN structure

This BPNN has one input layer, one hidden layer, and one output layer.

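Project structure

The classes used are:

  • ActivationFunction: interface for activation functions
  • BPModel: entity class for the BP model
  • BPNeuralNetworkFactory: factory for BP neural networks; covers training, computing outputs, serialization, and so on
  • BPParameter: entity class for the BP neural network parameters
  • Matrix: matrix entity class
  • Sigmoid: the sigmoid transfer function; implements the ActivationFunction interface
  • MatrixUtil: matrix utility class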

Implementation

Matrix entity class

Implements the basic matrix operations.

package com.top.matrix;

import com.top.constants.OrderEnum;

import java.io.Serializable;

public class Matrix implements Serializable {
    private double[][] matrix;
    // number of columns
    private int matrixColCount;
    // number of rows
    private int matrixRowCount;

    /**
     * Constructs an empty matrix.
     */
    public Matrix() {
        this.matrix = null;
        this.matrixColCount = 0;
        this.matrixRowCount = 0;
    }

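    /**
     * Constructs a matrix backed by the given array (the array is not copied).
     * @param matrix the matrix data; must contain at least one row
     */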
    public Matrix(double[][] matrix) {
        this.matrix = matrix;
        this.matrixRowCount = matrix.length;
        this.matrixColCount = matrix[0].length;
    }

    /**
     * Constructs a rowCount x colCount matrix filled with zeros.
     * @param rowCount number of rows
     * @param colCount number of columns
     */
    public Matrix(int rowCount,int colCount) {
        // Java zero-initializes new double arrays, so no explicit fill is needed
        this.matrix = new double[rowCount][colCount];
        this.matrixRowCount = rowCount;
        this.matrixColCount = colCount;
    }

    /**
     * Constructs a rowCount x colCount matrix with every element set to val.
     * @param val fill value
     * @param rowCount number of rows
     * @param colCount number of columns
     */
    public Matrix(double val,int rowCount,int colCount) {
        double[][] matrix = new double[rowCount][colCount];
        for (int i = 0; i < rowCount; i++) {
            for (int j = 0; j < colCount; j++) {
                matrix[i][j] = val;
            }
        }
        this.matrix = matrix;
        this.matrixRowCount = rowCount;
        this.matrixColCount = colCount;
    }

    public double[][] getMatrix() {
        return matrix;
    }

    public void setMatrix(double[][] matrix) {
        this.matrix = matrix;
        this.matrixRowCount = matrix.length;
        this.matrixColCount = matrix[0].length;
    }

    public int getMatrixColCount() {
        return matrixColCount;
    }

    public int getMatrixRowCount() {
        return matrixRowCount;
    }

    /**
     * Returns the value at the given position.
     *
     * @param x row index
     * @param y column index
     * @return the value at row x, column y
     */
    public double getValOfIdx(int x, int y) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (x < 0 || x > matrixRowCount - 1) {
            throw new IllegalArgumentException("row index x out of bounds");
        }
        if (y < 0 || y > matrixColCount - 1) {
            throw new IllegalArgumentException("column index y out of bounds");
        }
        return matrix[x][y];
    }

    /**
     * Returns the given row as a 1 x n matrix.
     *
     * @param x row index
     * @return a copy of row x
     */
    public Matrix getRowOfIdx(int x) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (x < 0 || x > matrixRowCount - 1) {
            throw new IllegalArgumentException("row index x out of bounds");
        }
        // copy the row so that the returned matrix does not share storage with this one
        double[][] result = new double[1][];
        result[0] = matrix[x].clone();
        return new Matrix(result);
    }

    /**
     * Returns the given column as an n x 1 matrix.
     *
     * @param y column index
     * @return a copy of column y
     */
    public Matrix getColOfIdx(int y) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (y < 0 || y > matrixColCount - 1) {
            throw new IllegalArgumentException("column index y out of bounds");
        }
        double[][] result = new double[matrixRowCount][1];
        for (int i = 0; i < matrixRowCount; i++) {
            result[i][0] = matrix[i][y];
        }
        return new Matrix(result);
    }

    /**
     * Sets the element at position (x, y).
     * @param x row index
     * @param y column index
     * @param val new value
     */
    public void setValue(int x, int y, double val) {
        if (x < 0 || x > this.matrixRowCount - 1) {
            throw new IllegalArgumentException("row index out of bounds");
        }
        if (y < 0 || y > this.matrixColCount - 1) {
            throw new IllegalArgumentException("column index out of bounds");
        }
        this.matrix[x][y] = val;
    }

    /**
     * Matrix product: this (n x m) times a (m x p).
     *
     * @param a right-hand operand
     * @return the n x p product
     * @throws IllegalArgumentException if the inner dimensions differ
     */
    public Matrix multiple(Matrix a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (a.getMatrix() == null) {
            throw new IllegalArgumentException("argument matrix is empty");
        }
        if (matrixColCount != a.getMatrixRowCount()) {
            throw new IllegalArgumentException("matrix dimensions do not match");
        }
        double[][] result = new double[matrixRowCount][a.getMatrixColCount()];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < a.getMatrixColCount(); j++) {
                for (int k = 0; k < matrixColCount; k++) {
                    result[i][j] = result[i][j] + matrix[i][k] * a.getMatrix()[k][j];
                }
            }
        }
        return new Matrix(result);
    }

    /**
     * Multiplies every element by a scalar.
     *
     * @param a the scalar
     * @return the scaled matrix
     */
    public Matrix multiple(double a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] * a;
            }
        }
        return new Matrix(result);
    }

    /**
     * Element-wise (Hadamard) product.
     *
     * @param a a matrix of the same shape
     * @return the element-wise product
     */
    public Matrix pointMultiple(Matrix a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (a.getMatrix() == null) {
            throw new IllegalArgumentException("argument matrix is empty");
        }
        // reject a mismatch in either dimension (|| rather than &&)
        if (matrixRowCount != a.getMatrixRowCount() || matrixColCount != a.getMatrixColCount()) {
            throw new IllegalArgumentException("matrix dimensions do not match");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] * a.getMatrix()[i][j];
            }
        }
        return new Matrix(result);
    }

    /**
     * Divides every element by a scalar.
     * @param a the divisor
     * @return the scaled matrix
     * @throws IllegalArgumentException if this matrix is empty
     */
    public Matrix divide(double a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] / a;
            }
        }
        return new Matrix(result);
    }

    /**
     * Matrix addition.
     *
     * @param a a matrix of the same shape
     * @return the element-wise sum
     */
    public Matrix plus(Matrix a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (a.getMatrix() == null) {
            throw new IllegalArgumentException("argument matrix is empty");
        }
        // reject a mismatch in either dimension (|| rather than &&)
        if (matrixRowCount != a.getMatrixRowCount() || matrixColCount != a.getMatrixColCount()) {
            throw new IllegalArgumentException("matrix dimensions do not match");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] + a.getMatrix()[i][j];
            }
        }
        return new Matrix(result);
    }

    /**
     * Adds a scalar to every element.
     * @param a the scalar
     * @return the shifted matrix
     * @throws IllegalArgumentException if this matrix is empty
     */
    public Matrix plus(double a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] + a;
            }
        }
        return new Matrix(result);
    }

    /**
     * Matrix subtraction.
     *
     * @param a a matrix of the same shape
     * @return the element-wise difference this - a
     */
    public Matrix subtract(Matrix a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (a.getMatrix() == null) {
            throw new IllegalArgumentException("argument matrix is empty");
        }
        // reject a mismatch in either dimension (|| rather than &&)
        if (matrixRowCount != a.getMatrixRowCount() || matrixColCount != a.getMatrixColCount()) {
            throw new IllegalArgumentException("matrix dimensions do not match");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] - a.getMatrix()[i][j];
            }
        }
        return new Matrix(result);
    }

    /**
     * Subtracts a scalar from every element.
     * @param a the scalar
     * @return the shifted matrix
     * @throws IllegalArgumentException if this matrix is empty
     */
    public Matrix subtract(double a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] - a;
            }
        }
        return new Matrix(result);
    }

    /**
     * Sums each row.
     *
     * @return an n x 1 matrix of row sums
     */
    public Matrix sumRow() throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixRowCount][1];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][0] += matrix[i][j];
            }
        }
        return new Matrix(result);
    }

    /**
     * Sums each column.
     *
     * @return a 1 x m matrix of column sums
     */
    public Matrix sumCol() throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[1][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[0][j] += matrix[i][j];
            }
        }
        return new Matrix(result);
    }

    /**
     * Sums all elements.
     *
     * @return the sum of every element
     */
    public double sumAll() throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double result = 0;
        for (double[] doubles : matrix) {
            for (int j = 0; j < matrixColCount; j++) {
                result += doubles[j];
            }
        }
        return result;
    }

    /**
     * Squares every element.
     *
     * @return the element-wise square
     */
    public Matrix square() throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = matrix[i][j] * matrix[i][j];
            }
        }
        return new Matrix(result);
    }

    /**
     * Raises every element to the n-th power.
     *
     * @param n the exponent
     * @return the element-wise power
     */
    public Matrix pow(double n) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixRowCount][matrixColCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[i][j] = Math.pow(matrix[i][j],n);
            }
        }
        return new Matrix(result);
    }

    /**
     * Matrix transpose.
     *
     * @return the transposed matrix
     */
    public Matrix transpose() throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        double[][] result = new double[matrixColCount][matrixRowCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixColCount; j++) {
                result[j][i] = matrix[i][j];
            }
        }
        return new Matrix(result);
    }

    /**
     * Extracts a sub-matrix.
     * @param startRowIndex index of the first row
     * @param rowCount   number of rows to take
     * @param startColIndex index of the first column
     * @param colCount   number of columns to take
     * @return the extracted sub-matrix (a copy)
     * @throws IllegalArgumentException if the requested range exceeds the matrix
     */
    public Matrix subMatrix(int startRowIndex,int rowCount,int startColIndex,int colCount) throws IllegalArgumentException {
        if (startRowIndex + rowCount > matrixRowCount) {
            throw new IllegalArgumentException("row index out of bounds");
        }
        if (startColIndex + colCount> matrixColCount) {
            throw new IllegalArgumentException("column index out of bounds");
        }
        double[][] result = new double[rowCount][colCount];
        for (int i = startRowIndex; i < startRowIndex + rowCount; i++) {
            // copy colCount values of row i, starting at startColIndex
            System.arraycopy(matrix[i], startColIndex, result[i - startRowIndex], 0, colCount);
        }
        return new Matrix(result);
    }

    /**
     * Concatenates this matrix with another one.
     * @param direction concatenation direction: 1 = horizontal, 2 = vertical
     * @param a the matrix to append
     * @return the concatenated matrix
     * @throws IllegalArgumentException if the shapes are incompatible or the direction is invalid
     */
    public Matrix splice(int direction, Matrix a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if (a.getMatrix() == null) {
            throw new IllegalArgumentException("argument matrix is empty");
        }
        if(direction == 1){
            // horizontal concatenation
            if (matrixRowCount != a.getMatrixRowCount()) {
                throw new IllegalArgumentException("row counts differ, cannot concatenate");
            }
            double[][] result = new double[matrixRowCount][matrixColCount + a.getMatrixColCount()];
            for (int i = 0; i < matrixRowCount; i++) {
                System.arraycopy(matrix[i],0,result[i],0,matrixColCount);
                System.arraycopy(a.getMatrix()[i],0,result[i],matrixColCount,a.getMatrixColCount());
            }
            return new Matrix(result);
        }else if(direction == 2){
            // vertical concatenation
            if (matrixColCount != a.getMatrixColCount()) {
                throw new IllegalArgumentException("column counts differ, cannot concatenate");
            }
            // copy each row so the result does not share storage with either operand
            double[][] result = new double[matrixRowCount + a.getMatrixRowCount()][];
            for (int i = 0; i < matrixRowCount; i++) {
                result[i] = matrix[i].clone();
            }
            for (int i = 0; i < a.getMatrixRowCount(); i++) {
                result[matrixRowCount + i] = a.getMatrix()[i].clone();
            }
            return new Matrix(result);
        }else{
            throw new IllegalArgumentException("invalid direction");
        }
    }

    /**
     * Repeats this matrix a times in the given direction.
     * @param direction repetition direction: 1 = horizontal, 2 = vertical
     * @param a number of copies
     * @return the extended matrix
     * @throws IllegalArgumentException if this matrix is empty or the direction is invalid
     */
    public Matrix extend(int direction , int a) throws IllegalArgumentException {
        if (matrix == null) {
            throw new IllegalArgumentException("matrix is empty");
        }
        if(direction == 1){
            // horizontal repetition
            double[][] result = new double[matrixRowCount][matrixColCount*a];
            for (int i = 0; i < matrixRowCount; i++) {
                for (int j = 0; j < a; j++) {
                    System.arraycopy(matrix[i],0,result[i],j*matrixColCount,matrixColCount);
                }
            }
            return new Matrix(result);
        }else if(direction == 2){
            // vertical repetition; rows are copied so the result does not share storage with this matrix
            double[][] result = new double[matrixRowCount*a][];
            for (int i = 0; i < matrixRowCount*a; i++) {
                result[i] = matrix[i%matrixRowCount].clone();
            }
            return new Matrix(result);
        }else{
            throw new IllegalArgumentException("invalid direction");
        }
    }

    /**
     * Computes the mean of each column.
     * @return a 1 x m matrix of column means
     * @throws IllegalArgumentException if this matrix is empty
     */
    public Matrix getColAvg() throws IllegalArgumentException {
        Matrix tmp = this.sumCol();
        return tmp.divide(matrixRowCount);
    }

    /**
     * Sorts the rows of this matrix in place.
     * @param index index of the column whose values determine the row order
     * @param order sort order, ascending or descending
     * @throws IllegalArgumentException
     */
    public void sort(int index,OrderEnum order) throws IllegalArgumentException{
        switch (order){
            case ASC:
                for (int i = 0; i < this.matrixRowCount; i++) {
                    for (int j = 0; j < this.matrixRowCount - 1 - i; j++) {
                        if (this.matrix[j][index] > this.matrix[j + 1][index]) {
                            double[] tmp = this.matrix[j];
                            this.matrix[j] = this.matrix[j + 1];
                            this.matrix[j + 1] = tmp;
                        }
                    }
                }
                break;
            case DESC:
                for (int i = 0; i < this.matrixRowCount; i++) {
                    for (int j = 0; j < this.matrixRowCount - 1 - i; j++) {
                        if (this.matrix[j][index] < this.matrix[j + 1][index]) {
                            double[] tmp = this.matrix[j];
                            this.matrix[j] = this.matrix[j + 1];
                            this.matrix[j + 1] = tmp;
                        }
                    }
                }
                break;
            default:

        }

    }

    /**
     * Checks whether this is a square matrix:
     * row and column counts are equal and non-zero.
     * @return true if the matrix is square
     */
    public boolean isSquareMatrix(){
        return matrixColCount == matrixRowCount && matrixColCount != 0;
    }

    @Override
    public String toString() {
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append("\r\n");
        for (int i = 0; i < matrixRowCount; i++) {
            stringBuilder.append("# ");
            for (int j = 0; j < matrixColCount; j++) {
                stringBuilder.append(matrix[i][j]).append("\t ");
            }
            stringBuilder.append("#\r\n");
        }
        stringBuilder.append("\r\n");
        return stringBuilder.toString();
    }
}
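
Matrix offers two different products: `multiple(Matrix)` is the usual matrix product, while `pointMultiple` multiplies element by element. The difference can be shown with plain arrays. The following is a minimal sketch: `ProductDemo`, `matMul`, and `hadamard` are names invented for this illustration and are not part of the project.

```java
import java.util.Arrays;

/** Plain-array illustration of the matrix product versus the element-wise product. */
final class ProductDemo {
    /** Matrix product: (n x m) times (m x p) gives (n x p). */
    static double[][] matMul(double[][] a, double[][] b) {
        if (a[0].length != b.length) {
            throw new IllegalArgumentException("inner dimensions differ");
        }
        double[][] r = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b[0].length; j++) {
                for (int k = 0; k < b.length; k++) {
                    r[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return r;
    }

    /** Element-wise product: both operands must have exactly the same shape. */
    static double[][] hadamard(double[][] a, double[][] b) {
        // a mismatch in either dimension is an error, hence || and not &&
        if (a.length != b.length || a[0].length != b[0].length) {
            throw new IllegalArgumentException("shapes differ");
        }
        double[][] r = new double[a.length][a[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[0].length; j++) {
                r[i][j] = a[i][j] * b[i][j];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(matMul(a, b)));
        System.out.println(Arrays.deepToString(hadamard(a, b)));
    }
}
```

For the 2 x 2 operands in `main`, the matrix product is [[19, 22], [43, 50]] and the element-wise product is [[5, 12], [21, 32]].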

MatrixUtil utility class

package com.top.utils;

import com.top.matrix.Matrix;

import java.util.*;

public class MatrixUtil {
    /**
     * Creates an identity matrix.
     * @param matrixRowCount dimension of the identity matrix
     * @return the matrixRowCount x matrixRowCount identity matrix
     */
    public static Matrix eye(int matrixRowCount){
        double[][] result = new double[matrixRowCount][matrixRowCount];
        for (int i = 0; i < matrixRowCount; i++) {
            for (int j = 0; j < matrixRowCount; j++) {
                if(i == j){
                    result[i][j] = 1;
                }else{
                    result[i][j] = 0;
                }
            }
        }
        return new Matrix(result);
    }

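    /**
     * Computes the inverse of a matrix.
     * Principle: row-reduce the augmented matrix [A|E] to [E|A^-1] (Gauss-Jordan elimination).
     * @param a a square matrix
     * @return the inverse of a
     * @throws Exception if the matrix is not invertible
     */
    public static Matrix inv(Matrix a) throws Exception {
        if (!invable(a)) {
            throw new Exception("matrix is not invertible");
        }
        // [a|E]
        Matrix b = a.splice(1, eye(a.getMatrixRowCount()));
        double[][] data = b.getMatrix();
        int rowCount = b.getMatrixRowCount();
        int colCount = b.getMatrixColCount();
        // strictly this should be a's column count; for a square matrix b's row count is the same
        for (int j = 0; j < rowCount; j++) {
            // if the pivot is 0, swap with a lower row whose entry in this column is non-zero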
            int notZeroRow = -2;
            if(data[j][j] == 0){
                notZeroRow = -1;
                for (int l = j; l < rowCount; l++) {
                    if (data[l][j] != 0) {
                        notZeroRow = l;
                        break;
                    }
                }
            }
            if (notZeroRow == -1) {
                throw new Exception("matrix is not invertible");
            }else if(notZeroRow != -2){
                // swap rows j and notZeroRow
                double[] tmp = data[j];
                data[j] = data[notZeroRow];
                data[notZeroRow] = tmp;
            }
            // scale row j so that data[j][j] becomes 1
            if (data[j][j] != 1) {
                double multiple = data[j][j];
                for (int colIdx = j; colIdx < colCount; colIdx++) {
                    data[j][colIdx] /= multiple;
                }
            }
            // eliminate column j from every other row
            for (int i = 0; i < rowCount; i++) {
                if (i != j) {
                    double multiple = data[i][j] / data[j][j];
                    // iterate over the columns of the row
                    for (int k = j; k < colCount; k++) {
                        data[i][k] = data[i][k] - multiple * data[j][k];
                    }
                }
            }
        }
        Matrix result = new Matrix(data);
        return result.subMatrix(0, rowCount, rowCount, rowCount);
    }

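    /**
     * Computes the adjugate matrix.
     * Principle: adj(A) = det(A) * A^-1
     * @param a an invertible square matrix
     * @return the adjugate of a
     * @throws Exception if the matrix is not invertible
     */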
    public static Matrix adj(Matrix a) throws Exception {
        return inv(a).multiple(det(a));
    }

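    /**
     * Reduces a matrix to upper triangular form.
     * @param a a square matrix (left unchanged)
     * @return the upper triangular matrix
     * @throws Exception if the matrix is not square or a zero pivot cannot be replaced
     */
    public static Matrix getTopTriangle(Matrix a) throws Exception {
        if (!a.isSquareMatrix()) {
            throw new Exception("not a square matrix, cannot compute");
        }
        int matrixHeight = a.getMatrixRowCount();
        // work on a deep copy; eliminating in place would corrupt the caller's matrix
        // (inv calls invable, which calls this method, before it uses a again)
        double[][] source = a.getMatrix();
        double[][] result = new double[matrixHeight][];
        for (int r = 0; r < matrixHeight; r++) {
            result[r] = source[r].clone();
        }
        // iterate over columns
        for (int j = 0; j < matrixHeight; j++) {
            // iterate over the rows below the pivot
            for (int i = j+1; i < matrixHeight; i++) {
                // if the pivot is 0, swap with a lower row whose entry in this column is non-zero
                int notZeroRow = -2;
                if(result[j][j] == 0){
                    notZeroRow = -1;
                    for (int l = i; l < matrixHeight; l++) {
                        if (result[l][j] != 0) {
                            notZeroRow = l;
                            break;
                        }
                    }
                }
                if (notZeroRow == -1) {
                    throw new Exception("matrix is not invertible");
                }else if(notZeroRow != -2){
                    // swap rows j and notZeroRow
                    double[] tmp = result[j];
                    result[j] = result[notZeroRow];
                    result[notZeroRow] = tmp;
                }

                double multiple = result[i][j]/result[j][j];
                // iterate over the columns of the row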
                for (int k = j; k < matrixHeight; k++) {
                    result[i][k] = result[i][k] - multiple * result[j][k];
                }
            }
        }
        return new Matrix(result);
    }

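    /**
     * Computes the determinant of a matrix.
     * Note: each row swap performed in getTopTriangle flips the sign of the determinant,
     * and that sign change is not tracked here.
     * @param a a square matrix
     * @return the product of the diagonal of the upper triangular form
     * @throws Exception if the matrix cannot be reduced to upper triangular form
     */
    public static double det(Matrix a) throws Exception {
        // reduce the matrix to upper triangular form
        Matrix b = MatrixUtil.getTopTriangle(a);
        double result = 1;
        // the determinant is the product of the diagonal elements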
        for (int i = 0; i < b.getMatrixRowCount(); i++) {
            result *= b.getValOfIdx(i, i);
        }
        return result;
    }
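
    /**
     * Computes the covariance matrix (rows are samples, columns are variables).
     * @param a the data matrix
     * @return the covariance matrix, normalized by (row count - 1)
     * @throws Exception if the matrix is empty
     */
    public static Matrix cov(Matrix a) throws Exception {
        if (a.getMatrix() == null) {
            throw new Exception("matrix is empty");
        }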
        Matrix avg = a.getColAvg().extend(2, a.getMatrixRowCount());
        Matrix tmp = a.subtract(avg);
        return tmp.transpose().multiple(tmp).multiple(1/((double) a.getMatrixRowCount() -1));
    }

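    /**
     * Checks whether a matrix is invertible.
     * It is treated as invertible if it can be reduced to upper triangular form.
     * @param a the matrix to check
     * @return true if getTopTriangle succeeds
     */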
    public static boolean invable(Matrix a) {
        try {
            getTopTriangle(a);
            return true;
        } catch (Exception e) {
            return false;
        }
    }


    /**
     * Min-max normalization, applied column by column.
     * @param a data to normalize
     * @param normalizationMin  lower bound of the target interval
     * @param normalizationMax  upper bound of the target interval
     * @return a map with the keys "max" and "min" (1 x n matrices of column maxima and minima)
     *         and "res" (the normalized data)
     */
    public static Map<String, Object> normalize(Matrix a, double normalizationMin, double normalizationMax) throws Exception {
        HashMap<String, Object> result = new HashMap<>();
        double[][] maxArr = new double[1][a.getMatrixColCount()];
        double[][] minArr = new double[1][a.getMatrixColCount()];
        double[][] res = new double[a.getMatrixRowCount()][a.getMatrixColCount()];
        for (int i = 0; i < a.getMatrixColCount(); i++) {
            List<Double> tmp = new ArrayList<>();
            for (int j = 0; j < a.getMatrixRowCount(); j++) {
                tmp.add(a.getValOfIdx(j,i));
            }
            double max = Collections.max(tmp);
            double min = Collections.min(tmp);
            // normalize; a constant column (max == min) is left at 0 to avoid dividing by zero
            if (max != min) {
                for (int j = 0; j < a.getMatrixRowCount(); j++) {
                    res[j][i] = normalizationMin + (a.getValOfIdx(j,i) - min) / (max - min) * (normalizationMax - normalizationMin);
                }
            }
            maxArr[0][i] = max;
            minArr[0][i] = min;
        }
        result.put("max", new Matrix(maxArr));
        result.put("min", new Matrix(minArr));
        result.put("res", new Matrix(res));
        return result;
    }

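    /**
     * Inverse of normalize.
     * Note that the upper bound is passed before the lower bound, unlike in normalize.
     * @param a data to denormalize
     * @param normalizationMax upper bound of the normalized interval
     * @param normalizationMin lower bound of the normalized interval
     * @param dataMax   column maxima of the original data
     * @param dataMin   column minima of the original data
     * @return the data mapped back to the original range
     */
    public static Matrix inverseNormalize(Matrix a, double normalizationMax, double normalizationMin , Matrix dataMax,Matrix dataMin){
        double[][] res = new double[a.getMatrixRowCount()][a.getMatrixColCount()];
        for (int i = 0; i < a.getMatrixColCount(); i++) {
            // denormalize the column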
            if (dataMin.getValOfIdx(0,i) != 0 || dataMax.getValOfIdx(0,i) != 0) {
                for (int j = 0; j < a.getMatrixRowCount(); j++) {
                    res[j][i] = dataMin.getValOfIdx(0,i) + (dataMax.getValOfIdx(0,i) - dataMin.getValOfIdx(0,i)) * (a.getValOfIdx(j,i) - normalizationMin) / (normalizationMax - normalizationMin);
                }
            }
        }
        return new Matrix(res);
    }
}
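
The normalization round trip can be checked on a single column with plain arrays. This is a minimal sketch rather than the project's code: `MinMaxDemo` and its method names are invented for the illustration, the column minimum and maximum are passed in explicitly, and mapping a constant column to the lower bound is an arbitrary choice made here.

```java
import java.util.Arrays;

/** Min-max scaling of one column to [lo, hi] and back again. */
final class MinMaxDemo {
    /** Maps each value from [min, max] to [lo, hi]. */
    static double[] normalize(double[] col, double min, double max, double lo, double hi) {
        double[] r = new double[col.length];
        for (int i = 0; i < col.length; i++) {
            // a constant column has no spread; this sketch maps it to lo (assumption)
            r[i] = (max == min) ? lo : lo + (col[i] - min) / (max - min) * (hi - lo);
        }
        return r;
    }

    /** Inverse mapping from [lo, hi] back to [min, max]. */
    static double[] denormalize(double[] scaled, double min, double max, double lo, double hi) {
        double[] r = new double[scaled.length];
        for (int i = 0; i < scaled.length; i++) {
            r[i] = min + (max - min) * (scaled[i] - lo) / (hi - lo);
        }
        return r;
    }

    public static void main(String[] args) {
        double[] col = {10, 20, 40};
        double[] scaled = normalize(col, 10, 40, 0.2, 0.8);
        double[] back = denormalize(scaled, 10, 40, 0.2, 0.8);
        System.out.println(Arrays.toString(scaled));
        System.out.println(Arrays.toString(back));
    }
}
```

With the interval [0.2, 0.8] used as the default in BPParameter, the values 10, 20, 40 map to approximately 0.2, 0.4, 0.8, and denormalizing recovers the original values up to floating-point rounding.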

ActivationFunction interface

public interface ActivationFunction {
    // computes the function value
    double computeValue(double val);
    // computes the derivative
    double computeDerivative(double val);
}

Sigmoid

import java.io.Serializable;

public class Sigmoid implements ActivationFunction, Serializable {
    @Override
    public double computeValue(double val) {
        return 1 / (1 + Math.exp(-val));
    }

    @Override
    public double computeDerivative(double val) {
        // sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)); evaluate the sigmoid only once
        double s = computeValue(val);
        return s * (1 - s);
    }
}
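
The derivative formula s(x) * (1 - s(x)) can be sanity-checked against a central finite difference. The sketch below is standalone and only for illustration: `SigmoidCheck` and the step size `h` are assumptions, not part of the project.

```java
/** Compares the analytic sigmoid derivative with a numerical estimate. */
final class SigmoidCheck {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Analytic derivative: s(x) * (1 - s(x)). */
    static double derivative(double x) {
        double s = sigmoid(x);
        return s * (1 - s);
    }

    /** Central difference estimate: (f(x + h) - f(x - h)) / (2h). */
    static double numericDerivative(double x, double h) {
        return (sigmoid(x + h) - sigmoid(x - h)) / (2 * h);
    }

    public static void main(String[] args) {
        for (double x : new double[]{-2.0, 0.0, 1.5}) {
            System.out.println(x + ": analytic " + derivative(x)
                    + ", numeric " + numericDerivative(x, 1e-5));
        }
    }
}
```

The same kind of check is useful when adding another ActivationFunction implementation, since a wrong derivative makes training diverge or stall without raising any error.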

BPParameter

Holds the parameters needed to train the BP neural network.

package com.top.bpnn;

import java.io.Serializable;

public class BPParameter implements Serializable {

    // number of input-layer neurons
    private int inputLayerNeuronCount = 3;
    // number of hidden-layer neurons
    private int hiddenLayerNeuronCount = 3;
    // number of output-layer neurons
    private int outputLayerNeuronCount = 1;
    // normalization interval
    private double normalizationMin = 0.2;
    private double normalizationMax = 0.8;
    // learning step (learning rate)
    private double step = 0.05;
    // momentum factor
    private double momentumFactor = 0.2;
    // activation function
    private ActivationFunction activationFunction = new Sigmoid();
    // precision (target error)
    private double precision = 0.000001;
    // maximum number of iterations
    private int maxTimes = 1000000;

    public double getMomentumFactor() {
        return momentumFactor;
    }

    public void setMomentumFactor(double momentumFactor) {
        this.momentumFactor = momentumFactor;
    }

    public double getStep() {
        return step;
    }

    public void setStep(double step) {
        this.step = step;
    }

    public double getNormalizationMin() {
        return normalizationMin;
    }

    public void setNormalizationMin(double normalizationMin) {
        this.normalizationMin = normalizationMin;
    }

    public double getNormalizationMax() {
        return normalizationMax;
    }

    public void setNormalizationMax(double normalizationMax) {
        this.normalizationMax = normalizationMax;
    }

    public int getInputLayerNeuronCount() {
        return inputLayerNeuronCount;
    }

    public void setInputLayerNeuronCount(int inputLayerNeuronCount) {
        this.inputLayerNeuronCount = inputLayerNeuronCount;
    }

    public int getHiddenLayerNeuronCount() {
        return hiddenLayerNeuronCount;
    }

    public void setHiddenLayerNeuronCount(int hiddenLayerNeuronCount) {
        this.hiddenLayerNeuronCount = hiddenLayerNeuronCount;
    }

    public int getOutputLayerNeuronCount() {
        return outputLayerNeuronCount;
    }

    public void setOutputLayerNeuronCount(int outputLayerNeuronCount) {
        this.outputLayerNeuronCount = outputLayerNeuronCount;
    }

    public ActivationFunction getActivationFunction() {
        return activationFunction;
    }

    public void setActivationFunction(ActivationFunction activationFunction) {
        this.activationFunction = activationFunction;
    }

    public double getPrecision() {
        return precision;
    }

    public void setPrecision(double precision) {
        this.precision = precision;
    }

    public int getMaxTimes() {
        return maxTimes;
    }

    public void setMaxTimes(int maxTimes) {
        this.maxTimes = maxTimes;
    }
}
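
BPParameter exposes both a learning step and a momentum factor. The code that consumes them is not shown at this point, so the following is only a sketch of one common form of the momentum update, delta(t) = -step * gradient + momentumFactor * delta(t-1), applied to a toy one-dimensional problem. `MomentumDemo` and the function being minimized are assumptions made for the illustration.

```java
/** Gradient descent with a momentum term on f(w) = (w - 3)^2. */
final class MomentumDemo {
    static double minimize(double step, double momentumFactor, int iterations) {
        double w = 0;          // starting point
        double prevDelta = 0;  // previous update, reused by the momentum term
        for (int t = 0; t < iterations; t++) {
            double gradient = 2 * (w - 3);
            double delta = -step * gradient + momentumFactor * prevDelta;
            w += delta;
            prevDelta = delta;
        }
        return w;
    }

    public static void main(String[] args) {
        // uses the default step (0.05) and momentum factor (0.2) from BPParameter
        System.out.println(MomentumDemo.minimize(0.05, 0.2, 1000));
    }
}
```

The momentum term carries part of the previous update into the current one, which tends to smooth the trajectory and speed up progress along directions where the gradient keeps the same sign.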

BPModel

The BP neural network model; its properties include the weights, the thresholds, and the training parameters.

package com.top.bpnn;

import com.top.matrix.Matrix;

import java.io.Serializable;

public class BPModel implements Serializable {
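    // weights and thresholds of the BP neural network
    // (weightIJ: input to hidden layer, weightJP: hidden to output layer)
    private Matrix weightIJ;
    private Matrix b1;
    private Matrix weightJP;
    private Matrix b2;
    /* used for denormalization */
    private Matrix inputMax;
    private Matrix inputMin;
    private Matrix outputMax;
    private Matrix outputMin;
    /* training parameters of the BP neural network */
    private BPParameter bpParameter;
    /* training outcome of the BP neural network */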
    private double error;
    private int times;

    public Matrix getWeightIJ() {
        return weightIJ;
    }

    public void setWeightIJ(Matrix weightIJ) {
        this.weightIJ = weightIJ;
    }

    public Matrix getB1() {
        return b1;
    }

    public void setB1(Matrix b1) {
        this.b1 = b1;
    }

    public Matrix getWeightJP() {
        return weightJP;
    }

    public void setWeightJP(Matrix weightJP) {
        this.weightJP = weightJP;
    }

    public Matrix getB2() {
        return b2;
    }

    public void setB2(Matrix b2) {
        this.b2 = b2;
    }

    public Matrix getInputMax() {
        return inputMax;
    }

    public void setInputMax(Matrix inputMax) {
        this.inputMax = inputMax;
    }

    public Matrix getInputMin() {
        return inputMin;
    }

    public void setInputMin(Matrix inputMin) {
        this.inputMin = inputMin;
    }

    public Matrix getOutputMax() {
        return outputMax;
    }

    public void setOutputMax(Matrix outputMax) {
        this.outputMax = outputMax;
    }

    public Matrix getOutputMin() {
        return outputMin;
    }

    public void setOutputMin(Matrix outputMin) {
        this.outputMin = outputMin;
    }

    public BPParameter getBpParameter() {
        return bpParameter;
    }

    public void setBpParameter(BPParameter bpParameter) {
        this.bpParameter = bpParameter;
    }

    public double getError() {
        return error;
    }

    public void setError(double error) {
        this.error = error;
    }

    public int getTimes() {
        return times;
    }

    public void setTimes(int times) {
        this.times = times;
    }
}
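The inputMax/inputMin and outputMax/outputMin fields exist so that new inputs can be scaled the same way as the training data and predictions can be mapped back to their original units. A minimal sketch of min-max scaling and its inverse for a single value follows. The class and method names are hypothetical and not part of the project, and the handling of a constant column is an assumption of this sketch:

```java
// Hypothetical helper, not part of the project: min-max scaling of one value
// from the data range [dataMin, dataMax] to the target range [lo, hi], and back.
class MinMaxScaleSketch {
    // Maps x from [dataMin, dataMax] to [lo, hi].
    static double normalize(double x, double dataMin, double dataMax, double lo, double hi) {
        if (dataMax == dataMin) {
            // Assumption: a constant column maps to the lower bound to avoid dividing by zero.
            return lo;
        }
        return lo + (x - dataMin) / (dataMax - dataMin) * (hi - lo);
    }

    // Inverse mapping: from [lo, hi] back to [dataMin, dataMax].
    static double inverseNormalize(double y, double dataMin, double dataMax, double lo, double hi) {
        return dataMin + (y - lo) / (hi - lo) * (dataMax - dataMin);
    }

    public static void main(String[] args) {
        double scaled = normalize(15.0, 10.0, 20.0, -1.0, 1.0);
        double restored = inverseNormalize(scaled, 10.0, 20.0, -1.0, 1.0);
        System.out.println(scaled + " -> " + restored);
    }
}
```

The forward mapping is the same formula that computeBP applies per column using the stored minimum and maximum of each input.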

BPNeuralNetworkFactory

The BP neural network factory. It provides training, prediction, and related functions.
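For orientation, the standard batch gradient-descent updates for a network with one hidden layer can be written as follows. The notation is an assumption of this note rather than the article's: X is the N×I input matrix, Y the N×P target matrix, W_IJ and W_JP the weight matrices, b_1 and b_2 the threshold row vectors (added in the forward pass), **1** a column of N ones, f the activation function, and η the learning step.

```latex
\begin{aligned}
Z_J &= X W_{IJ} + \mathbf{1} b_1, & H &= f(Z_J),\\
Z_P &= H W_{JP} + \mathbf{1} b_2, & \hat{Y} &= f(Z_P),\\
E &= \tfrac{1}{2} \sum_{n,p} \bigl(Y_{np} - \hat{Y}_{np}\bigr)^2, &&\\
\delta_P &= (Y - \hat{Y}) \odot f'(Z_P), & \delta_J &= (\delta_P W_{JP}^{\top}) \odot f'(Z_J),\\
\Delta W_{JP} &= \eta\, H^{\top} \delta_P, & \Delta b_2 &= \eta\, \mathbf{1}^{\top} \delta_P,\\
\Delta W_{IJ} &= \eta\, X^{\top} \delta_J, & \Delta b_1 &= \eta\, \mathbf{1}^{\top} \delta_J.
\end{aligned}
```

Each correction is the negative gradient of E scaled by η, so weights and thresholds carry the same sign. The threshold corrections are row vectors obtained by summing the deltas over all samples.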

package com.top.bpnn;

import com.top.matrix.Matrix;
import com.top.utils.MatrixUtil;

import java.util.*;

public class BPNeuralNetworkFactory {
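    /**
     * Trains a BP neural network model.
     * @param bpParameter training parameters
     * @param inputAndOutput training samples, one per row: input columns first, then output columns
     * @return the trained model
     */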
    public BPModel trainBP(BPParameter bpParameter, Matrix inputAndOutput) throws Exception {

        ActivationFunction activationFunction = bpParameter.getActivationFunction();
        int inputCount = bpParameter.getInputLayerNeuronCount();
        int hiddenCount = bpParameter.getHiddenLayerNeuronCount();
        int outputCount = bpParameter.getOutputLayerNeuronCount();
        double normalizationMin = bpParameter.getNormalizationMin();
        double normalizationMax = bpParameter.getNormalizationMax();
        double step = bpParameter.getStep();
        double momentumFactor = bpParameter.getMomentumFactor();
        double precision = bpParameter.getPrecision();
        int maxTimes = bpParameter.getMaxTimes();

        if(inputAndOutput.getMatrixColCount() != inputCount + outputCount){
            throw new Exception("Column count does not match the number of input and output neurons; please correct it");
        }
        // Initialize the weights
        Matrix weightIJ = initWeight(inputCount, hiddenCount);
        Matrix weightJP = initWeight(hiddenCount, outputCount);

        // Initialize the thresholds
        Matrix b1 = initThreshold(hiddenCount);
        Matrix b2 = initThreshold(outputCount);

        // Momentum terms (corrections from the previous iteration)
        Matrix deltaWeightIJ0 = new Matrix(inputCount, hiddenCount);
        Matrix deltaWeightJP0 = new Matrix(hiddenCount, outputCount);
        Matrix deltaB10 = new Matrix(1, hiddenCount);
        Matrix deltaB20 = new Matrix(1, outputCount);

        // Split the data into the input matrix and the output matrix
        Matrix input = inputAndOutput.subMatrix(0,inputAndOutput.getMatrixRowCount(),0,inputCount);
        Matrix output = inputAndOutput.subMatrix(0,inputAndOutput.getMatrixRowCount(),inputCount,outputCount);

        // Normalize
        Map<String,Object> inputAfterNormalize = MatrixUtil.normalize(input, normalizationMin, normalizationMax);
        input = (Matrix) inputAfterNormalize.get("res");

        Map<String,Object> outputAfterNormalize = MatrixUtil.normalize(output, normalizationMin, normalizationMax);
        output = (Matrix) outputAfterNormalize.get("res");

        int times = 1;
        double E = 0;// error
        while (times < maxTimes) {
            /*----------------- Forward propagation ---------------------*/
            // Hidden-layer input
            Matrix jIn = input.multiple(weightIJ);
            // Expand the threshold matrix to jIn's row count
            Matrix b1Copy = b1.extend(2,jIn.getMatrixRowCount());
            // Add the threshold
            jIn = jIn.plus(b1Copy);
            // Hidden-layer output
            Matrix jOut = computeValue(jIn,activationFunction);
            // Output-layer input
            Matrix pIn = jOut.multiple(weightJP);
            // Expand the threshold matrix to pIn's row count
            Matrix b2Copy = b2.extend(2, pIn.getMatrixRowCount());
            // Add the threshold
            pIn = pIn.plus(b2Copy);
            // Output-layer output
            Matrix pOut = computeValue(pIn,activationFunction);
            // Compute the error
            Matrix e = output.subtract(pOut);
            E = computeE(e);// total error
            // Stop once the error meets the required precision
            if (Math.abs(E) <= precision) {
                System.out.println("Precision reached");
                break;
            }

            /*----------------- Backward propagation ---------------------*/
            // Output-layer delta (N x outputCount) and hidden-layer delta (N x hiddenCount)
            Matrix deltaO = e.pointMultiple(computeDerivative(pIn,activationFunction));
            Matrix tmp = weightJP.multiple(deltaO.transpose()).transpose();
            Matrix deltaJ = tmp.pointMultiple(computeDerivative(jIn, activationFunction));
            // Row vector of ones (1 x N), used to sum the deltas over all samples
            double[][] onesRow = new double[1][input.getMatrixRowCount()];
            Arrays.fill(onesRow[0], 1.0);
            Matrix ones = new Matrix(onesRow);

            // Weight correction between layers J and P
            Matrix deltaWeightJP = jOut.transpose().multiple(deltaO).multiple(step);
            // Threshold correction for layer P (1 x outputCount)
            Matrix deltaThresholdP = ones.multiple(deltaO).multiple(step);

            // Weight correction between layers I and J
            Matrix deltaWeightIJ = input.transpose().multiple(deltaJ).multiple(step);
            // Threshold correction for layer J (1 x hiddenCount). The threshold is added in the
            // forward pass, so it takes the same sign as the weight corrections.
            Matrix deltaThresholdJ = ones.multiple(deltaJ).multiple(step);

            if (times == 1) {
                // Update the weights and thresholds
                weightIJ = weightIJ.plus(deltaWeightIJ);
                weightJP = weightJP.plus(deltaWeightJP);
                b1 = b1.plus(deltaThresholdJ);
                b2 = b2.plus(deltaThresholdP);
            }else{
                // Update with the momentum term added
                weightIJ = weightIJ.plus(deltaWeightIJ).plus(deltaWeightIJ0.multiple(momentumFactor));
                weightJP = weightJP.plus(deltaWeightJP).plus(deltaWeightJP0.multiple(momentumFactor));
                b1 = b1.plus(deltaThresholdJ).plus(deltaB10.multiple(momentumFactor));
                b2 = b2.plus(deltaThresholdP).plus(deltaB20.multiple(momentumFactor));
            }

            deltaWeightIJ0 = deltaWeightIJ;
            deltaWeightJP0 = deltaWeightJP;
            deltaB10 = deltaThresholdJ;
            deltaB20 = deltaThresholdP;

            times++;
        }

        // Assemble the trained BP neural network model
        BPModel result = new BPModel();
        result.setInputMax((Matrix) inputAfterNormalize.get("max"));
        result.setInputMin((Matrix) inputAfterNormalize.get("min"));
        result.setOutputMax((Matrix) outputAfterNormalize.get("max"));
        result.setOutputMin((Matrix) outputAfterNormalize.get("min"));
        result.setWeightIJ(weightIJ);
        result.setWeightJP(weightJP);
        result.setB1(b1);
        result.setB2(b2);
        result.setError(E);
        result.setTimes(times);
        result.setBpParameter(bpParameter);
        System.out.println("Iterations: " + times + ", error: " + E);

        return result;
    }

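    /**
     * Computes the output of a trained BP neural network.
     * @param bpModel the trained model
     * @param input input samples, one per row, in original (not normalized) units
     * @return the predicted outputs after inverse normalization
     */
    public Matrix computeBP(BPModel bpModel,Matrix input) throws Exception {
        if (input.getMatrixColCount() != bpModel.getBpParameter().getInputLayerNeuronCount()) {
            throw new Exception("Input matrix column count does not match the input layer size");
        }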
        ActivationFunction activationFunction = bpModel.getBpParameter().getActivationFunction();
        Matrix weightIJ = bpModel.getWeightIJ();
        Matrix weightJP = bpModel.getWeightJP();
        Matrix b1 = bpModel.getB1();
        Matrix b2 = bpModel.getB2();
        double[][] normalizedInput = new double[input.getMatrixRowCount()][input.getMatrixColCount()];
        for (int i = 0; i < input.getMatrixRowCount(); i++) {
            for (int j = 0; j < input.getMatrixColCount(); j++) {
                normalizedInput[i][j] = bpModel.getBpParameter().getNormalizationMin()
                        + (input.getValOfIdx(i,j) - bpModel.getInputMin().getValOfIdx(0,j))
                        / (bpModel.getInputMax().getValOfIdx(0,j) - bpModel.getInputMin().getValOfIdx(0,j))
                        * (bpModel.getBpParameter().getNormalizationMax() - bpModel.getBpParameter().getNormalizationMin());
            }
        }
        Matrix normalizedInputMatrix = new Matrix(normalizedInput);
        Matrix jIn = normalizedInputMatrix.multiple(weightIJ);
        // Expand the threshold matrix to jIn's row count
        Matrix b1Copy = b1.extend(2,jIn.getMatrixRowCount());
        // Add the threshold
        jIn = jIn.plus(b1Copy);
        // Hidden-layer output
        Matrix jOut = computeValue(jIn,activationFunction);
        // Output-layer input
        Matrix pIn = jOut.multiple(weightJP);
        // Expand the threshold matrix to pIn's row count
        Matrix b2Copy = b2.extend(2,pIn.getMatrixRowCount());
        // Add the threshold
        pIn = pIn.plus(b2Copy);
        // Output-layer output
        Matrix pOut = computeValue(pIn,activationFunction);
        // Inverse normalization
        return MatrixUtil.inverseNormalize(pOut, bpModel.getBpParameter().getNormalizationMax(), bpModel.getBpParameter().getNormalizationMin(), bpModel.getOutputMax(), bpModel.getOutputMin());
    }

    // Initialize an x-by-y weight matrix with random values in [-1, 1)
    private Matrix initWeight(int x,int y){
        Random random=new Random();
        double[][] weight = new double[x][y];
        for (int i = 0; i < x; i++) {
            for (int j = 0; j < y; j++) {
                weight[i][j] = 2*random.nextDouble()-1;
            }
        }
        return new Matrix(weight);
    }
    // Initialize a 1-by-x threshold matrix with random values in [-1, 1)
    private Matrix initThreshold(int x){
        Random random = new Random();
        double[][] result = new double[1][x];
        for (int i = 0; i < x; i++) {
            result[0][i] = 2*random.nextDouble()-1;
        }
        return new Matrix(result);
    }

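    /**
     * Applies the activation function to every element of the matrix.
     * @param a input matrix
     * @param activationFunction activation function to apply
     * @return matrix of activation values
     */
    private Matrix computeValue(Matrix a, ActivationFunction activationFunction) throws Exception {
        if (a.getMatrix() == null) {
            throw new Exception("Matrix data is null");
        }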
        double[][] result = new double[a.getMatrixRowCount()][a.getMatrixColCount()];
        for (int i = 0; i < a.getMatrixRowCount(); i++) {
            for (int j = 0; j < a.getMatrixColCount(); j++) {
                result[i][j] = activationFunction.computeValue(a.getValOfIdx(i,j));
            }
        }
        return new Matrix(result);
    }

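    /**
     * Applies the derivative of the activation function to every element of the matrix.
     * @param a input matrix
     * @param activationFunction activation function whose derivative is applied
     * @return matrix of derivative values
     */
    private Matrix computeDerivative(Matrix a , ActivationFunction activationFunction) throws Exception {
        if (a.getMatrix() == null) {
            throw new Exception("Matrix data is null");
        }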
        double[][] result = new double[a.getMatrixRowCount()][a.getMatrixColCount()];
        for (int i = 0; i < a.getMatrixRowCount(); i++) {
            for (int j = 0; j < a.getMatrixColCount(); j++) {
                result[i][j] = activationFunction.computeDerivative(a.getValOfIdx(i,j));
            }
        }
        return new Matrix(result);
    }


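    /**
     * Computes the total error: half the sum of the squared residuals.
     * @param e residual matrix (expected output minus actual output)
     * @return the total error
     */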
    private double computeE(Matrix e){
        e = e.square();
        return 0.5*e.sumAll();
    }
}
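To see the training loop in isolation from the Matrix class, here is a self-contained sketch on plain arrays. It is an illustration under stated assumptions and not the project's implementation: the class name TinyBpSketch is hypothetical, the activation is fixed to sigmoid, there is a single output neuron, and no normalization is applied. As in trainBP, each update adds the current correction plus the momentum factor times the previous correction.

```java
import java.util.Random;

// Illustrative sketch only (hypothetical names): a one-hidden-layer sigmoid network
// with a single output, trained by batch backpropagation with a momentum term.
class TinyBpSketch {
    static double sigmoid(double v) {
        return 1.0 / (1.0 + Math.exp(-v));
    }

    // Returns {error in the first epoch, error in the last epoch}, where
    // error = 0.5 * sum over samples of (target - output)^2.
    static double[] train(double[][] x, double[] y, int hidden,
                          double step, double momentum, int epochs, long seed) {
        Random rnd = new Random(seed);
        int in = x[0].length;
        // Weights and thresholds start uniformly in [-1, 1).
        double[][] w1 = new double[in][hidden];
        double[] b1 = new double[hidden];
        double[] w2 = new double[hidden];
        double b2 = 2 * rnd.nextDouble() - 1;
        for (int j = 0; j < hidden; j++) {
            b1[j] = 2 * rnd.nextDouble() - 1;
            w2[j] = 2 * rnd.nextDouble() - 1;
            for (int i = 0; i < in; i++) {
                w1[i][j] = 2 * rnd.nextDouble() - 1;
            }
        }
        // Corrections from the previous epoch, used by the momentum term.
        double[][] prevW1 = new double[in][hidden];
        double[] prevB1 = new double[hidden];
        double[] prevW2 = new double[hidden];
        double prevB2 = 0.0;
        double firstError = 0.0;
        double lastError = 0.0;

        for (int t = 0; t < epochs; t++) {
            double[][] dW1 = new double[in][hidden];
            double[] dB1 = new double[hidden];
            double[] dW2 = new double[hidden];
            double dB2 = 0.0;
            double error = 0.0;
            for (int n = 0; n < x.length; n++) {
                // Forward pass: hidden activations, then the single output.
                double[] h = new double[hidden];
                double outIn = b2;
                for (int j = 0; j < hidden; j++) {
                    double s = b1[j];
                    for (int i = 0; i < in; i++) {
                        s += x[n][i] * w1[i][j];
                    }
                    h[j] = sigmoid(s);
                    outIn += h[j] * w2[j];
                }
                double out = sigmoid(outIn);
                double e = y[n] - out;
                error += 0.5 * e * e;
                // Backward pass: deltas use the sigmoid derivative f'(z) = f(z) * (1 - f(z)).
                double deltaOut = e * out * (1 - out);
                dB2 += step * deltaOut;
                for (int j = 0; j < hidden; j++) {
                    double deltaHidden = deltaOut * w2[j] * h[j] * (1 - h[j]);
                    dW2[j] += step * deltaOut * h[j];
                    dB1[j] += step * deltaHidden;
                    for (int i = 0; i < in; i++) {
                        dW1[i][j] += step * deltaHidden * x[n][i];
                    }
                }
            }
            if (t == 0) {
                firstError = error;
            }
            lastError = error;
            // Update: current correction plus momentum times the previous correction.
            b2 += dB2 + momentum * prevB2;
            for (int j = 0; j < hidden; j++) {
                w2[j] += dW2[j] + momentum * prevW2[j];
                b1[j] += dB1[j] + momentum * prevB1[j];
                for (int i = 0; i < in; i++) {
                    w1[i][j] += dW1[i][j] + momentum * prevW1[i][j];
                }
            }
            prevW1 = dW1;
            prevB1 = dB1;
            prevW2 = dW2;
            prevB2 = dB2;
        }
        return new double[] {firstError, lastError};
    }
}
```

Calling TinyBpSketch.train with a handful of samples, a small step such as 0.3, and a fixed seed returns the error from the first and last epoch, which gives a quick way to check that the error is falling.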

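Usage

Create a BPNeuralNetworkFactory object and call its trainBP(BPParameter bpParameter, Matrix inputAndOutput) method with a configured BPParameter object and the training data; it returns a BPModel object. The model can be serialized to local storage with the factory's serialization method, or kept in a cache. To make a prediction, deserialize the BPModel object and pass it to the factory's computeBP(BPModel bpModel, Matrix input) method, which returns the computed values.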
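The factory's own serialization method is not shown in this section, so the following is only a sketch of the underlying idea using standard Java serialization. DemoModel is a hypothetical stand-in for BPModel, and the class and method names are assumptions made for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a trained model; like BPModel, it implements Serializable.
class DemoModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final double[][] weights;
    final double error;

    DemoModel(double[][] weights, double error) {
        this.weights = weights;
        this.error = error;
    }
}

class ModelSerializationSketch {
    // Serializes the model into a byte array.
    static byte[] toBytes(Serializable model) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(model);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Restores an object previously written by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

To persist to disk instead, the byte-array streams can be swapped for FileOutputStream and FileInputStream. Every object reachable from the model must itself be Serializable (or be marked transient), which for BPModel includes the activation function held by its BPParameter.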

For detailed usage, see: https://github.com/ineedahouse/top-algorithm-set-doc/blob/master/doc/bpnn/BPNeuralNetwork.md

Source code on GitHub

https://github.com/ineedahouse/top-algorithm-set

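If this helped you, please give the repository a Star. Thanks!

Reference: Zhao Yixiang. 基於BP神經網絡的無約束優化方法研究及應用 (Research and Application of Unconstrained Optimization Methods Based on BP Neural Networks) [D]. Northeast Agricultural University, 2019.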

