There are already plenty of explanations of linear regression online, so this post will not repeat the concept. While following Andrew Ng's ML course, the author had a sudden idea and ran two small experiments: the first fits a data set with the least-squares method (via the pseudo-inverse), and the second fits the same data set with gradient descent. The code follows.

Least squares:
```python
#!/usr/bin/env python
# encoding: UTF-8
import numpy as np

N = 10
X = np.linspace(-3, 3, N)
Y = (X + 10.0) / 2.0          # Y is an exact linear function of X
Z = -5.0 + X + 3.0 * Y        # targets generated with weights (-5, 1, 3)

P = np.ones((N, 1))
P = np.c_[P, X, Y]            # design matrix with columns [1, X, Y]

t = np.linalg.pinv(P)         # Moore-Penrose pseudo-inverse
w = np.dot(t, Z)
print "fitted weights:"
print w

A = np.dot(P, w) - Z          # residuals
print "loss function value after fitting:"
print np.dot(A, A) / 2
```
The loss-function value shows that the fitted weights give the model a very small loss.
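As a quick cross-check (not part of the original script, and assuming a recent NumPy), `np.linalg.lstsq` solves the same least-squares problem via an SVD and, for a rank-deficient design matrix, also returns the minimum-norm solution, so it should agree with the pseudo-inverse result above; a minimal sketch:

```python
import numpy as np

N = 10
X = np.linspace(-3, 3, N)
Y = (X + 10.0) / 2.0
Z = -5.0 + X + 3.0 * Y
P = np.c_[np.ones(N), X, Y]

# lstsq uses an SVD and returns the minimum-norm solution
# for a rank-deficient design matrix, just like pinv above
w = np.linalg.lstsq(P, Z, rcond=None)[0]
residual = np.dot(P, w) - Z

print(w)                               # should match the pseudo-inverse weights
print(np.dot(residual, residual) / 2)  # loss is essentially zero
```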
Comparing the weights used to generate the data in the code, (-5, 1, 3), with the weights produced by the fit, it is clear that they do not match. From this it is easy to conclude that this data set exhibits multicollinearity with respect to the linear model, which is expected here, since Y = (X+10)/2 is itself an exact linear function of X.
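The multicollinearity can be verified directly: the Y column of the design matrix is an exact linear combination of the bias column and the X column, so P has rank 2 rather than 3. A minimal check (illustrative, not from the original post):

```python
import numpy as np

N = 10
X = np.linspace(-3, 3, N)
Y = (X + 10.0) / 2.0
P = np.c_[np.ones(N), X, Y]

print(np.linalg.matrix_rank(P))   # 2, even though P has 3 columns
print(np.corrcoef(X, Y)[0, 1])    # 1.0: X and Y are perfectly correlated
```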
Gradient descent:
```python
#!/usr/bin/env python
# encoding: UTF-8
import numpy as np

np.random.seed(0)

N = 10
X = np.linspace(-3, 3, N)
Y = (X + 10) / 2              # Y is an exact linear function of X
Z = -5 + X + 3 * Y            # targets generated with weights (-5, 1, 3)

P = np.ones((N, 1))
P = np.c_[P, X, Y]            # design matrix with columns [1, X, Y]

alpha = 0.001                 # learning rate

def fun():
    # batch gradient descent from a random initial weight vector
    W = np.random.random(3)
    for _ in xrange(1000000000):
        A = np.dot(P, W) - Z          # residuals
        W0 = alpha * np.sum(A)        # update for the bias weight
        W1 = alpha * np.dot(A, X)     # update for the X weight
        W2 = alpha * np.dot(A, Y)     # update for the Y weight
        if abs(W0) + abs(W1) + abs(W2) < 0.000001:
            break                     # stop once the update is negligible
        W[0] = W[0] - W0
        W[1] = W[1] - W1
        W[2] = W[2] - W2
    return W

# run gradient descent from 100 different random initialisations
list_global = []
for _ in xrange(100):
    list_global.append(fun())
list_global.sort(key=lambda x: x[0])
for k in list_global:
    print k
```
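As an aside (this sketch is not the script that produced the output below), the three component-wise updates inside fun() are simply an unrolled form of a single vectorized batch-gradient step; an equivalent sketch using the same learning rate and stopping rule:

```python
import numpy as np

def fun_vectorized(P, Z, alpha=0.001, tol=1e-6, max_iter=10**7):
    # batch gradient descent written with one matrix-vector product
    W = np.random.random(P.shape[1])
    for _ in xrange(max_iter):        # use range() on Python 3
        step = alpha * np.dot(P.T, np.dot(P, W) - Z)   # alpha * P^T (P W - Z)
        if np.sum(np.abs(step)) < tol:                 # same stopping rule as fun()
            break
        W -= step
    return W
```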
Output:
```
devil@devilmaycry:/tmp$ python x2.py
[ 0.13859798 1.51383611 1.97228226]
[ 0.15812746 1.51578866 1.9683764 ]
[ 0.15935113 1.51591164 1.96813162]
[ 0.16440946 1.51641705 1.96711998]
[ 0.20540823 1.52051731 1.9589202 ]
[ 0.25569366 1.52554514 1.94886317]
[ 0.25880876 1.52585729 1.9482401 ]
[ 0.27499425 1.52747539 1.94500304]
[ 0.27534702 1.52751115 1.94493244]
[ 0.27945902 1.52792177 1.94411009]
[ 0.28400393 1.52837623 1.94320111]
[ 0.28696386 1.52867268 1.94260909]
[ 0.30536768 1.53051274 1.93892835]
[ 0.31818944 1.53179514 1.93636398]
[ 0.32798936 1.53277487 1.93440402]
[ 0.33334942 1.53331098 1.933332 ]
[ 0.34418953 1.53439535 1.93116395]
[ 0.3463948 1.53461533 1.93072294]
[ 0.3567094 1.53564731 1.92865998]
[ 0.36449799 1.53642554 1.92710231]
[ 0.38157383 1.53813337 1.92368712]
[ 0.38437965 1.53841409 1.92312594]
[ 0.40486422 1.54046214 1.91902906]
[ 0.41061448 1.54103779 1.91787896]
[ 0.43215718 1.54319227 1.91357041]
[ 0.43668932 1.54364513 1.912664 ]
[ 0.45612785 1.54558868 1.90877632]
[ 0.45977762 1.54595351 1.90804638]
[ 0.464536 1.54643 1.90709465]
[ 0.4673361 1.54670955 1.90653467]
[ 0.47087532 1.54706372 1.9058268 ]
[ 0.47690637 1.54766664 1.90462061]
[ 0.47718128 1.54769405 1.90456563]
[ 0.48571646 1.54854761 1.90285859]
[ 0.49155064 1.5491312 1.90169175]
[ 0.50684386 1.55066075 1.89863308]
[ 0.53314394 1.55329031 1.8933731 ]
[ 0.53925928 1.55390247 1.89214999]
[ 0.55049834 1.55502603 1.8899022 ]
[ 0.55622356 1.55559837 1.88875717]
[ 0.55765135 1.55574137 1.8884716 ]
[ 0.57794078 1.55777008 1.88441373]
[ 0.57995539 1.55797184 1.88401078]
[ 0.59427067 1.55940304 1.88114775]
[ 0.64070329 1.56404638 1.87186122]
[ 0.66360817 1.56633707 1.86728023]
[ 0.66533542 1.56650939 1.86693481]
[ 0.67435485 1.56741169 1.8651309 ]
[ 0.67461209 1.56743719 1.86507947]
[ 0.67841117 1.56781728 1.86431964]
[ 0.70004027 1.56998004 1.85999383]
[ 0.70249948 1.57022613 1.85950197]
[ 0.70332716 1.57030915 1.85933642]
[ 0.71017421 1.57099365 1.85796702]
[ 0.71407859 1.57138402 1.85718615]
[ 0.7277948 1.57275571 1.85444291]
[ 0.72946533 1.57292232 1.85410883]
[ 0.73050737 1.57302649 1.85390043]
[ 0.73111226 1.57308748 1.85377941]
[ 0.73420373 1.5733964 1.85316114]
[ 0.74375506 1.57435197 1.85125084]
[ 0.76167277 1.57614361 1.8476673 ]
[ 0.76645953 1.5766221 1.84670997]
[ 0.77628556 1.57760457 1.84474477]
[ 0.78124114 1.5780999 1.84375367]
[ 0.79445023 1.57942128 1.84111182]
[ 0.82246882 1.58222329 1.83550809]
[ 0.83844516 1.58382098 1.83231282]
[ 0.8504276 1.58501882 1.82991636]
[ 0.85371404 1.58534781 1.82925904]
[ 0.86614329 1.58659024 1.82677323]
[ 0.86963538 1.58693955 1.82607481]
[ 0.87227961 1.58720441 1.82554593]
[ 0.87565432 1.58754174 1.824871 ]
[ 0.88307665 1.58828406 1.82338652]
[ 0.89447944 1.58942419 1.82110598]
[ 0.89722723 1.58969873 1.82055644]
[ 0.90304827 1.5902811 1.81939221]
[ 0.91298315 1.59127442 1.81740525]
[ 0.92325648 1.59230144 1.8153506 ]
[ 0.93695625 1.59367157 1.81261064]
[ 0.94954532 1.59493075 1.8100928 ]
[ 0.96887547 1.59686409 1.80622675]
[ 0.97346265 1.59732198 1.80530938]
[ 0.98253813 1.59822971 1.80349427]
[ 0.98361444 1.59833731 1.80327901]
[ 0.98569708 1.59854557 1.80286248]
[ 0.99156202 1.59913215 1.80168948]
[ 0.99512314 1.59948855 1.80097724]
[ 1.05384788 1.60536059 1.78923232]
[ 1.05416379 1.60539244 1.78916912]
[ 1.07563405 1.60753962 1.78487506]
[ 1.08513878 1.60848968 1.78297414]
[ 1.09067723 1.60904425 1.7818664 ]
[ 1.10849918 1.61082595 1.77830205]
[ 1.13617915 1.61359433 1.77276602]
[ 1.16159108 1.61613499 1.76768368]
[ 1.16994446 1.61697081 1.76601296]
[ 1.18300731 1.61827712 1.76340039]
[ 1.18811107 1.61878732 1.76237965]
```
The gradient-descent experiment shows that, when a linear model is fitted to a data set with multicollinearity, the resulting weights are not unique: each run ends up with different weights depending on the random initial weight vector. Put in more formal terms, the solution is sensitive to the initial state.
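Why these different weight vectors nevertheless fit the data equally well: since Y = (X+10)/2, the model's prediction can be rewritten as w0 + w1*X + w2*Y = (w0 + 5*w2) + (w1 + 0.5*w2)*X, while the targets are Z = -5 + X + 3Y = 10 + 2.5*X. So any weight vector satisfying w0 + 5*w2 = 10 and w1 + 0.5*w2 = 2.5 is a perfect fit, and gradient descent simply converges to whichever point on that line of solutions is reachable from its random start. A small check against the first vector printed above (illustrative sketch, not from the original post):

```python
import numpy as np

# first weight vector from the output above
w = np.array([0.13859798, 1.51383611, 1.97228226])

print(w[0] + 5.0 * w[2])   # approximately 10.0
print(w[1] + 0.5 * w[2])   # approximately 2.5

# the loss on the training data is therefore essentially zero
N = 10
X = np.linspace(-3, 3, N)
Y = (X + 10.0) / 2.0
Z = -5.0 + X + 3.0 * Y
P = np.c_[np.ones(N), X, Y]
residual = np.dot(P, w) - Z
print(np.dot(residual, residual) / 2)
```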