Logistic Regression
Pros and cons of the algorithm:
1. Computationally cheap, easy to understand and implement
2. Prone to underfitting; classification accuracy may not be high
3. Applicable data types: numeric and nominal values
The idea:
- As I understand it, logistic regression is really just linear regression with a sigmoid bolted on. The sigmoid's virtue is that it squashes the output into the interval (0, 1), and since sigmoid(0) = 0.5, whether the linear part is above or below zero already decides the class directly. The weights here are solved by gradient ascent; I won't ramble about the details — the best reference is still Ng's videos, where the gradient descent he covers is exactly this, just stepping in the descending direction instead of the ascending one; the mechanics are identical.
- One drawback of gradient ascent (strictly speaking, "batch gradient ascent") is the computational cost: every iteration has to run through all the data, so once the training set grows large the computation becomes very heavy. That is why stochastic gradient ascent is introduced later: the idea is to correct the weights using just one data point at a time. The final result may deviate a little, but it is much faster, and after some tuning the deviation stays small. Another benefit of stochastic gradient ascent is that it is an online algorithm — it can keep updating as new data arrives.
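The batch update described above can be sketched in a few lines. This is a minimal illustration on a made-up toy dataset (the data and names here are mine, not from the code below):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_ascent(X, y, alpha=0.1, n_iter=500):
    """Batch gradient ascent: every iteration uses ALL samples.
    Update: w <- w + alpha * X^T (y - sigmoid(X w))."""
    m, n = X.shape
    w = np.ones(n)
    for _ in range(n_iter):
        h = sigmoid(X @ w)            # predictions for every sample at once
        w += alpha * X.T @ (y - h)    # one step along the full gradient
    return w

# toy data: first column is the constant feature X0 = 1,
# label is 1 exactly when the second feature is positive
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w = grad_ascent(X, y)
```

The stochastic variant simply replaces the full-matrix step with the same update applied to one randomly chosen row per step.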
Functions:
- loadDataSet() — creates the dataset from a file whose lines each hold three values: two features and a label. When reading we also prepend the constant attribute X0.
- sigmoid(inX) — computes the sigmoid; zoomed out far enough it looks much like a step function.
- gradAscend(dataMatIn, classLabels) — batch gradient ascent, implemented with numpy arrays and fixed at 500 iterations; everything is done as matrix computations for speed. The update formula is roughly w = w + alpha * (y - h) * x (too lazy to typeset it properly, sorry...).
- gradAscendWithDraw(dataMatIn, classLabels) — the function above, plus a plot of how each weight changes with the iteration count.
- stocGradAscent0(dataMatrix, classLabels) — "stochastic" gradient ascent for speed, i.e. adjusting from one sample at a time (well, okay, this one is not actually random yet — that comes in the next function).
- stocGradAscentWithDraw0(dataMatrix, classLabels) — the function above, plus the weight-versus-iteration plot.
- stocGradAscent1(dataMatrix, classLabels, numIter=150) — now it really is random, the main benefit being reduced periodic oscillation. It also makes alpha change with the iteration count, decaying steadily but never all the way to 0.
- stocGradAscentWithDraw1(dataMatrix, classLabels, numIter=150) — the function above, plus the weight-versus-iteration plot.
- plotBestFit(wei) — draws the fitted line from the computed weights, for a direct visual check.
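The step-size schedule in stocGradAscent1 is worth seeing in isolation — a tiny sketch (the function name alpha_at is mine, just for illustration) showing that alpha decays with both the pass index j and the sample index i, while the +0.01 constant keeps it from ever reaching 0:

```python
def alpha_at(j, i):
    """Step size at pass j, sample i, as used in stocGradAscent1:
    decays like 4/(1+j+i) but is floored near 0.01 by the constant."""
    return 4 / (1.0 + j + i) + 0.01

print(alpha_at(0, 0))      # first update of the first pass: large step
print(alpha_at(149, 99))   # late in training: close to the 0.01 floor
```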
Weight trajectories over the iterations (figure)
Classification result (figure)

Weight trajectories over the iterations (figure)
Classification result (figure)
This version is much faster, but the result is not so great. Then again, the computation is so cheap that running it for 200 passes is bound to look different; the result is below (figure). Much better indeed.

Weight trajectories over the iterations (figure)
Classification result (figure)
Well, that's it — the results are pretty decent. The plotting part of the code is a bit rough, apologies.
```python
# coding=utf-8
from numpy import *


def loadDataSet():
    """Read the dataset: two features plus a label per line,
    prepending the constant feature X0 = 1.0."""
    dataMat = []
    labelMat = []
    fr = open('testSet.txt')
    for line in fr.readlines():
        lineArr = line.strip().split()
        dataMat.append([1.0, float(lineArr[0]), float(lineArr[1])])
        labelMat.append(int(lineArr[2]))
    return dataMat, labelMat


def sigmoid(inX):
    return 1.0 / (1 + exp(-inX))


def gradAscend(dataMatIn, classLabels):
    """Batch gradient ascent: every iteration uses the whole dataset."""
    dataMatrix = mat(dataMatIn)
    labelMat = mat(classLabels).transpose()
    m, n = shape(dataMatrix)
    alpha = 0.001
    maxCycle = 500
    weight = ones((n, 1))
    for k in range(maxCycle):
        h = sigmoid(dataMatrix * weight)
        error = labelMat - h
        weight += alpha * dataMatrix.transpose() * error
    return weight


def gradAscendWithDraw(dataMatIn, classLabels):
    """Same as gradAscend, but also plots each weight against iteration."""
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(311, ylabel='x0')
    bx = fig.add_subplot(312, ylabel='x1')
    cx = fig.add_subplot(313, ylabel='x2')
    dataMatrix = mat(dataMatIn)
    labelMat = mat(classLabels).transpose()
    m, n = shape(dataMatrix)
    alpha = 0.001
    maxCycle = 500
    weight = ones((n, 1))
    wei1, wei2, wei3 = [], [], []
    for k in range(maxCycle):
        h = sigmoid(dataMatrix * weight)
        error = labelMat - h
        weight += alpha * dataMatrix.transpose() * error
        wei1.append(float(weight[0]))
        wei2.append(float(weight[1]))
        wei3.append(float(weight[2]))
    ax.plot(range(maxCycle), wei1)
    bx.plot(range(maxCycle), wei2)
    cx.plot(range(maxCycle), wei3)
    plt.xlabel('iter_num')
    plt.show()
    return weight


def stocGradAscent0(dataMatrix, classLabels):
    """One sequential pass, updating from a single sample at a time."""
    m, n = shape(dataMatrix)
    alpha = 0.001
    weight = ones(n)
    for i in range(m):
        h = sigmoid(sum(dataMatrix[i] * weight))
        error = classLabels[i] - h
        weight = weight + alpha * error * dataMatrix[i]
    return weight


def stocGradAscentWithDraw0(dataMatrix, classLabels):
    """Same update as stocGradAscent0, repeated numIter times, with plots."""
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(311, ylabel='x0')
    bx = fig.add_subplot(312, ylabel='x1')
    cx = fig.add_subplot(313, ylabel='x2')
    m, n = shape(dataMatrix)
    alpha = 0.001
    weight = ones(n)
    wei1, wei2, wei3 = [], [], []
    numIter = 200
    for j in range(numIter):
        for i in range(m):
            h = sigmoid(sum(dataMatrix[i] * weight))
            error = classLabels[i] - h
            weight = weight + alpha * error * dataMatrix[i]
            wei1.append(weight[0])
            wei2.append(weight[1])
            wei3.append(weight[2])
    ax.plot(range(m * numIter), wei1)
    bx.plot(range(m * numIter), wei2)
    cx.plot(range(m * numIter), wei3)
    plt.xlabel('iter_num')
    plt.show()
    return weight


def stocGradAscent1(dataMatrix, classLabels, numIter=150):
    """Truly stochastic: each pass samples without replacement, and alpha
    decays with the iteration count but never reaches 0."""
    m, n = shape(dataMatrix)
    weight = ones(n)
    for j in range(numIter):
        dataIndex = list(range(m))  # list() so del works under Python 3
        for i in range(m):
            alpha = 4 / (1.0 + j + i) + 0.01
            randIndex = int(random.uniform(0, len(dataIndex)))
            chosen = dataIndex[randIndex]  # map into the remaining samples
            h = sigmoid(sum(dataMatrix[chosen] * weight))
            error = classLabels[chosen] - h
            weight = weight + alpha * error * dataMatrix[chosen]
            del dataIndex[randIndex]
    return weight


def stocGradAscentWithDraw1(dataMatrix, classLabels, numIter=150):
    """Same as stocGradAscent1, with weight plots."""
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(311, ylabel='x0')
    bx = fig.add_subplot(312, ylabel='x1')
    cx = fig.add_subplot(313, ylabel='x2')
    m, n = shape(dataMatrix)
    weight = ones(n)
    wei1, wei2, wei3 = [], [], []
    for j in range(numIter):
        dataIndex = list(range(m))
        for i in range(m):
            alpha = 4 / (1.0 + j + i) + 0.01
            randIndex = int(random.uniform(0, len(dataIndex)))
            chosen = dataIndex[randIndex]
            h = sigmoid(sum(dataMatrix[chosen] * weight))
            error = classLabels[chosen] - h
            weight = weight + alpha * error * dataMatrix[chosen]
            del dataIndex[randIndex]
            wei1.append(weight[0])
            wei2.append(weight[1])
            wei3.append(weight[2])
    ax.plot(range(len(wei1)), wei1)
    bx.plot(range(len(wei2)), wei2)
    cx.plot(range(len(wei3)), wei3)
    plt.xlabel('iter_num')
    plt.show()
    return weight


def plotBestFit(wei):
    """Draw the data points and the fitted decision boundary."""
    import matplotlib.pyplot as plt
    weight = array(wei).flatten()  # accept either a matrix or an array
    dataMat, labelMat = loadDataSet()
    dataArr = array(dataMat)
    n = shape(dataArr)[0]
    xcord1, ycord1 = [], []
    xcord2, ycord2 = [], []
    for i in range(n):
        if int(labelMat[i]) == 1:
            xcord1.append(dataArr[i, 1])
            ycord1.append(dataArr[i, 2])
        else:
            xcord2.append(dataArr[i, 1])
            ycord2.append(dataArr[i, 2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
    ax.scatter(xcord2, ycord2, s=30, c='green')
    x = arange(-3.0, 3.0, 0.1)
    # on the boundary w0 + w1*x1 + w2*x2 = 0, so x2 = (-w0 - w1*x1) / w2
    y = (-weight[0] - weight[1] * x) / weight[2]
    ax.plot(x, y)
    plt.xlabel('X1')
    plt.ylabel('X2')
    plt.show()


def main():
    dataArr, labelMat = loadDataSet()
    # w = gradAscendWithDraw(dataArr, labelMat)
    w = stocGradAscentWithDraw0(array(dataArr), labelMat)
    plotBestFit(w)


if __name__ == '__main__':
    main()
```
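Once the weights are learned, classifying a new sample is just the sigmoid(0) = 0.5 threshold mentioned earlier. A small sketch — classifyVector is a hypothetical helper of mine, not part of the code above, and the weight values are illustrative only:

```python
import numpy as np

def classifyVector(inX, weight):
    """Hypothetical helper: apply the learned weights and threshold
    the sigmoid output at 0.5 to get a 0/1 label."""
    prob = 1.0 / (1.0 + np.exp(-np.sum(inX * weight)))
    return 1 if prob > 0.5 else 0

# illustrative weights only -- real values would come from gradAscend()
w = np.array([4.0, 0.5, -0.6])
print(classifyVector(np.array([1.0, 2.0, 10.0]), w))  # linear part -1.0 -> class 0
print(classifyVector(np.array([1.0, 2.0, 1.0]), w))   # linear part 4.4 -> class 1
```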
Machine Learning Notes Index