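The Week 3 lectures do not spell out how the derivative of the logistic regression cost function is obtained, so I work through the derivation here for the record.
The logistic regression cost function can be written as a single expression: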
$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\left(y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right)\right]$
where $h_\theta(x^{(i)}) = \frac{1}{1+e^{-\theta^\mathrm{T}x^{(i)}}}$ and $\log$ is the natural logarithm.
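To make the definition concrete, here is a minimal sketch of the cost function in plain Python. The names `sigmoid`, `hypothesis`, and `cost` are my own choices for illustration, and the representation of samples as lists of feature values is an assumption, not something taken from the course material.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x) for one sample x (a list of features)."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def cost(theta, X, y):
    """J(theta): negative mean of y*log(h) + (1-y)*log(1-h) over the m samples."""
    m = len(X)
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = hypothesis(theta, x_i)
        total += y_i * math.log(h) + (1 - y_i) * math.log(1 - h)
    return -total / m
```

With $\theta = 0$ every prediction is $0.5$, so the cost equals $\log 2$ regardless of the data, which is a handy sanity check.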
To keep the derivation from getting long and cluttered, we first simplify the notation:
$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}K(\theta)\right]$
where $K(\theta) = y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))$
and $h_\theta(x^{(i)}) = \frac{1}{1+e^{-\theta^\mathrm{T}x^{(i)}}}$
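Now for the derivation. To take the partial derivative of $J(\theta)$ with respect to a single parameter (one component of $\theta$), which the prime denotes from here on:
(1) Differentiation is linear, so the constant factor $-\frac{1}{m}$ can be pulled out and the derivative can be taken term by term inside the sum. Only the expression inside the summation needs to be differentiated:
$J(\theta){}' = -\frac{1}{m}\left[\sum_{i=1}^{m}K(\theta){}'\right]$
$K(\theta){}' = \left(y\log(h_\theta(x))+(1-y)\log(1-h_\theta(x))\right){}'$ (the superscript $(i)$ marking the $i$-th sample is dropped for now to keep the notation readable)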
(2) By the chain rule for the logarithm, $(\log u){}' = \frac{1}{u}u{}'$, differentiating $K(\theta)$ further gives:
$K(\theta){}' = y\frac{1}{h_\theta(x)}h_\theta(x){}'+(1-y)\frac{1}{1-h_\theta(x)}(1-h_\theta(x)){}'$
(3) By the chain rule for powers, $(u^{n}){}' = nu^{n-1}u{}'$ (here with $n=-1$), and the derivative of the exponential with base $e$, $(e^{u}){}' = e^{u}u{}'$, differentiating $h_\theta(x)$ gives:
$h_\theta(x){}' = \left(\frac{1}{1+e^{-\theta^\mathrm{T}x}}\right){}' = -\frac{(1+e^{-\theta^\mathrm{T}x}){}'}{(1+e^{-\theta^\mathrm{T}x})^{2}} = \frac{e^{-\theta^\mathrm{T}x}(\theta^\mathrm{T}x){}'}{(1+e^{-\theta^\mathrm{T}x})^{2}} = \left(\frac{1}{1+e^{-\theta^\mathrm{T}x}}\left(1-\frac{1}{1+e^{-\theta^\mathrm{T}x}}\right)\right)(\theta^\mathrm{T}x){}' = h_\theta(x)(1-h_\theta(x))(\theta^\mathrm{T}x){}'$
Similarly, $(1-h_\theta(x)){}' = -\frac{e^{-\theta^\mathrm{T}x}(\theta^\mathrm{T}x){}'}{(1+e^{-\theta^\mathrm{T}x})^{2}} = -h_\theta(x)(1-h_\theta(x))(\theta^\mathrm{T}x){}'$
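The identity from step 3 is easy to sanity-check numerically. The sketch below treats $z = \theta^\mathrm{T}x$ as a plain scalar and compares $h(1-h)$ against a central finite difference. The helper names and the step size `eps` are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Closed form from step 3: h(z) * (1 - h(z))."""
    h = sigmoid(z)
    return h * (1.0 - h)

def central_difference(f, z, eps=1e-6):
    """Numerical derivative (f(z + eps) - f(z - eps)) / (2 * eps)."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

# Largest gap between the closed form and the numerical estimate
# over a few arbitrary test points.
max_gap = max(
    abs(sigmoid_grad(z) - central_difference(sigmoid, z))
    for z in (-2.0, 0.0, 1.5)
)
```

If the algebra in step 3 is right, `max_gap` should be tiny, limited only by floating-point error in the finite difference.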
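(4) Substituting the results of step 3 into step 2, the factors $h_\theta(x)$ and $1-h_\theta(x)$ in the denominators cancel:
$K(\theta){}' = y(1-h_\theta(x))(\theta^\mathrm{T}x){}'-(1-y)h_\theta(x)(\theta^\mathrm{T}x){}' = (y-h_\theta(x))(\theta^\mathrm{T}x){}'$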
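The result of step 4 can be checked the same way. In this sketch $K$ is written as a function of the scalar $z = \theta^\mathrm{T}x$, so the factor $(\theta^\mathrm{T}x){}'$ is simply 1 and the claim reduces to $\frac{dK}{dz} = y - h$. The function names and test points are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def K(z, y):
    """Per-sample term y*log(h) + (1-y)*log(1-h), with h = sigmoid(z)."""
    h = sigmoid(z)
    return y * math.log(h) + (1 - y) * math.log(1 - h)

def dK_dz(z, y):
    """Closed form from step 4 with the factor (theta^T x)' set to 1."""
    return y - sigmoid(z)

def central_difference(f, z, eps=1e-6):
    """Numerical derivative (f(z + eps) - f(z - eps)) / (2 * eps)."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

# Compare closed form and numerical derivative for both labels.
worst = 0.0
for y in (0, 1):
    for z in (-1.0, 0.3, 2.0):
        numeric = central_difference(lambda t: K(t, y), z)
        worst = max(worst, abs(numeric - dK_dz(z, y)))
```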
Substituting this back into step 1, the leading minus sign turns $y-h_\theta(x)$ into $h_\theta(x)-y$:
$J(\theta){}' = \frac{1}{m}\left[\sum_{i=1}^{m}(h_\theta(x)-y)(\theta^\mathrm{T}x){}'\right]$
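Finally, since $\theta^\mathrm{T}x = \sum_{k}\theta_{k}x_{k}$, the partial derivative of $\theta^\mathrm{T}x$ with respect to the $j$-th parameter $\theta_{j}$ is $x_{j}$ (here $j$ indexes the feature within a sample). Restoring the sample superscripts gives the final result:
$\frac{\partial J(\theta)}{\partial \theta_{j}} = \frac{1}{m}\left[\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_{j}^{(i)}\right]$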
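As a final check, the sketch below implements the gradient formula directly and compares it with a finite-difference gradient of the cost. It uses plain Python lists; the toy data, the parameter values, and all function names are made up for illustration and are not from the course.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x) for one sample x."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def cost(theta, X, y):
    """J(theta) as defined at the top of this note."""
    m = len(X)
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = hypothesis(theta, x_i)
        total += y_i * math.log(h) + (1 - y_i) * math.log(1 - h)
    return -total / m

def gradient(theta, X, y):
    """dJ/dtheta_j = (1/m) * sum_i (h(x_i) - y_i) * x_ij, for every j."""
    m = len(X)
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        err = hypothesis(theta, x_i) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += err * x_ij
    return [g / m for g in grad]

def numerical_gradient(theta, X, y, eps=1e-6):
    """Central finite difference of cost() in each coordinate of theta."""
    grad = []
    for j in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[j] += eps
        minus[j] -= eps
        grad.append((cost(plus, X, y) - cost(minus, X, y)) / (2.0 * eps))
    return grad

# Made-up toy data; the first feature is the constant 1 (intercept term).
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
y = [1, 0, 1, 0]
theta = [0.3, -0.7]

analytic = gradient(theta, X, y)
numeric = numerical_gradient(theta, X, y)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
```

If the derivation holds, `max_diff` stays close to zero. This kind of gradient check is also a cheap way to catch sign or indexing mistakes before handing the gradient to an optimizer.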
