R_Studio (Neural Networks): Predicting High or Low Sales with a BP Neural Network


 

 

  BP neural network (see Baidu Baike)

  BP (back-propagation) neural network: a multilayer feed-forward neural network trained with the error back-propagation algorithm; it is currently the most widely used type of neural network.
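To make "error back-propagation" concrete, here is a minimal sketch (not from the original post; all names are illustrative) of one gradient-descent step for a single sigmoid neuron in R:

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

x  <- c(1, 0, 1)          # one training example with 3 inputs
t  <- 1                   # target output
w  <- c(0.1, -0.2, 0.05)  # initial weights
b  <- 0                   # bias
lr <- 0.5                 # learning rate

# Forward pass
y <- sigmoid(sum(w * x) + b)

# Backward pass: gradient of the squared error 0.5 * (y - t)^2
delta <- (y - t) * y * (1 - y)   # error signal propagated back through the sigmoid
w <- w - lr * delta * x          # weight update
b <- b - lr * delta
```

A real BP network repeats this forward/backward cycle over many examples and layers; `nnet` does the optimization internally.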

 

  

 

# Set the working directory
setwd('D:\\dat')
# Read in the data (drop the first, irrelevant column)
Gary = read.csv("sales_data.csv")[, 2:5]
# Load nnet and rename the columns
library(nnet)
colnames(Gary) <- c("x1", "x2", "x3", "y")

### Final model
model1 = nnet(y ~ ., data = Gary, size = 6, decay = 5e-4, maxit = 1000)

# Predict on the training inputs and evaluate
pred = predict(model1, Gary[, 1:3], type = "class")
(P = sum(as.numeric(pred == Gary$y)) / nrow(Gary))
table(Gary$y, pred)
prop.table(table(Gary$y, pred), 1)

 

 

Implementation

 

  Goal: use a BP neural network to predict whether sales are high or low.

 

  Data preprocessing: read the data, drop the irrelevant column, and rename the remaining columns.

> # Set the working directory
> setwd('D:\\dat')
> # Read in the data
> Gary = read.csv("sales_data.csv")[, 2:5]
> # Load nnet and rename the columns
> library(nnet)
> colnames(Gary) <- c("x1", "x2", "x3", "y")
> Gary
     x1  x2  x3    y
1   bad yes yes high
2   bad yes yes high
3   bad yes yes high
4   bad  no yes high
5   bad yes yes high
6   bad  no yes high
7   bad yes  no high
8  good yes yes high
9  good yes  no high
10 good yes yes high
11 good yes yes high
12 good yes yes high
13 good yes yes high
14  bad yes yes  low
15 good  no yes high
16 good  no yes high
17 good  no yes high
18 good  no yes high
19 good  no  no high
20  bad  no  no  low
21  bad  no yes  low
22  bad  no yes  low
23  bad  no yes  low
24  bad  no  no  low
25  bad yes  no  low
26 good  no yes  low
27 good  no yes  low
28  bad  no  no  low
29  bad  no  no  low
30 good  no  no  low
31  bad yes  no  low
32 good  no yes  low
33 good  no  no  low
34 good  no  no  low

 

  The nnet package implements single-hidden-layer feed-forward neural networks and multinomial log-linear models. The feed-forward network is one of the most common neural network architectures.

  

  In a feed-forward network, neurons are divided into groups by the order in which they receive information; each group forms a layer. The neurons in each layer take the outputs of the previous layer as input and pass their own outputs on to the next layer. Information flows in one direction only, with no feedback connections, so a feed-forward network can be represented as a directed acyclic graph. The whole network can be viewed as a single function: by composing simple nonlinear functions many times, it realizes a complex mapping from the input space to the output space. This structure is simple and easy to implement. Feed-forward networks include fully connected feed-forward networks and convolutional neural networks, among others.
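The "composition of simple nonlinear functions" view can be sketched directly; in this illustrative example (not from the post), each layer is an affine map followed by a sigmoid nonlinearity:

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# Forward pass of a 3-input, 6-hidden-unit, 1-output network
forward <- function(x, W1, b1, W2, b2) {
  h <- sigmoid(W1 %*% x + b1)   # hidden layer: affine map + nonlinearity
  sigmoid(W2 %*% h + b2)        # output layer
}

set.seed(1)
W1 <- matrix(rnorm(6 * 3), nrow = 6)  # 3 inputs -> 6 hidden units
b1 <- rep(0, 6)
W2 <- matrix(rnorm(6), nrow = 1)      # 6 hidden units -> 1 output
b2 <- 0

forward(c(1, 0, 1), W1, b1, W2, b2)   # a value in (0, 1)
```

This is exactly the shape of network that `nnet(..., size = 6)` fits, except that `nnet` also learns the weights.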

 

Create the model with nnet()

 

  Key arguments of nnet():

x: matrix or data frame of training inputs
y: matrix or data frame of target values (class labels) matching x
weights: (case) weights for each example; defaults to 1
size: number of units in the hidden layer (can be zero if there are skip-layer units)
data: the training data set
subset: an index vector specifying the cases to be used in the training sample (if given, this argument must be named)
na.action: a function specifying the action to take if NAs are found; the default is to fail, while na.omit rejects cases with missing values on any required variable (if given, this argument must be named)
contrasts: a list of contrasts to be used for some or all of the factors appearing as variables in the model formula
Wts: initial weights; if missing, chosen at random
mask: logical vector indicating which parameters should be optimized (default: all)
linout: switch for linear output units; if FALSE (the default), logistic output units are used
entropy: switch for entropy (maximum conditional likelihood) fitting; the default is least squares
softmax: switch for softmax (log-linear model) and maximum conditional likelihood fitting; linout, entropy, softmax, and censored are mutually exclusive
censored: a variant of softmax in which non-zero targets mean possible classes; for softmax a row of (0, 1, 1) means one example each of classes 2 and 3, but for censored it means one example whose class is only known to be 2 or 3
skip: switch to add skip-layer connections from input to output
rang: initial random weights on [-rang, rang]; about 0.5 unless the inputs are large, in which case it should be chosen so that rang * max(|x|) is about 1
decay: parameter for weight decay (default 0)
maxit: maximum number of iterations
Hess: if TRUE, the Hessian of the measure of fit at the best set of weights found is returned as component Hessian
trace: switch for tracing the optimization (default TRUE)
MaxNWts: maximum allowable number of weights; there is no intrinsic limit in the code, but increasing MaxNWts will probably allow fits that are very slow and time-consuming
abstol: stop if the fit criterion falls below abstol, indicating an essentially perfect fit
reltol: stop if the optimizer is unable to reduce the fit criterion by a factor of at least 1 - reltol
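The settings used in the post can be tried on any data set; a hedged sketch on R's built-in iris data (not the post's sales data, so the numbers will differ):

```r
library(nnet)
set.seed(42)   # nnet initializes weights randomly, so fix the seed for reproducibility

# Same size/decay/maxit settings as the post, on the built-in iris data
m <- nnet(Species ~ ., data = iris, size = 6, decay = 5e-4,
          maxit = 1000, trace = FALSE)

# decay = 0 (the default) applies no weight penalty and can overfit small data
m0 <- nnet(Species ~ ., data = iris, size = 6, decay = 0,
           maxit = 1000, trace = FALSE)
```

A small positive decay, as used in the post (5e-4), shrinks the weights toward zero and usually makes the fit more stable on small samples.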

 

model1=nnet(y~.,data=Gary,size=6,decay=5e-4,maxit=1000)
# weights:  31
initial  value 27.073547 
iter  10 value 16.080731
iter  20 value 15.038060
iter  30 value 14.937127
iter  40 value 14.917485
iter  50 value 14.911531
iter  60 value 14.908678
iter  70 value 14.907836
iter  80 value 14.905234
iter  90 value 14.904499
iter 100 value 14.904028
iter 110 value 14.903688
iter 120 value 14.903480
iter 130 value 14.903450
iter 130 value 14.903450
iter 130 value 14.903450
final  value 14.903450 
converged
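The "# weights:  31" line follows from the network's shape: the formula y ~ . dummy-codes the three two-level inputs into 3 numeric inputs, each of the 6 hidden units gets those 3 inputs plus a bias, and the single output unit gets the 6 hidden outputs plus a bias:

```r
# Weight count for a 3-6-1 nnet: (inputs + bias) per hidden unit,
# plus (hidden units + bias) for the output unit
n_in <- 3; n_hidden <- 6; n_out <- 1
(n_in + 1) * n_hidden + (n_hidden + 1) * n_out   # 4*6 + 7*1 = 31
```

This matches the 31 weights reported above.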

 

Evaluate the model

> pred=predict(model1,Gary[,1:3],type="class")
> (P=sum(as.numeric(pred==Gary$y))/nrow(Gary))
[1] 0.7647059
> table(Gary$y,pred)
      pred
       high low
  high   14   4
  low     4  12
> prop.table(table(Gary$y,pred),1)
      pred
            high       low
  high 0.7777778 0.2222222
  low  0.2500000 0.7500000

 

  From the confusion matrix we can see:

  Of the 34 samples, 26 were predicted correctly, giving an accuracy of 76.5%, which is fairly low.

  Reason: a neural network needs a relatively large number of training samples, and the training set here is small.
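Note also that the accuracy above is measured on the same rows the model was trained on, which is optimistic. With only 34 rows, leave-one-out cross-validation gives a fairer estimate; a sketch (assumes Gary is loaded as above; slow, since it refits the network 34 times):

```r
library(nnet)
set.seed(1)
Gary$y <- factor(Gary$y)   # ensure the class label is a factor (needed on R >= 4.0)

# Leave-one-out cross-validation: hold out one row, fit on the rest, predict it
loo_pred <- sapply(seq_len(nrow(Gary)), function(i) {
  fit <- nnet(y ~ ., data = Gary[-i, ], size = 6, decay = 5e-4,
              maxit = 1000, trace = FALSE)
  predict(fit, Gary[i, 1:3], type = "class")
})
mean(loo_pred == Gary$y)   # cross-validated accuracy
```

Expect the cross-validated accuracy to be at or below the resubstitution accuracy, confirming that more data would be needed for a reliable model.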

 

