Logistic Regression: Principles and Python Implementation

This article is a repost. 2021-09-28 16:51, Machine Learning

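1 Overview of Logistic Regression
2 Parameter Optimization and Regularization for Logistic Regression
- 2.1 Optimizing Parameters with Gradient Descent
  - 2.1.1 Determining the Loss Function by Maximum Likelihood (Log Loss)
  - 2.1.2 Optimizing the Loss Function
- 2.2 Regularization
3 Implementing Logistic Regression
- 3.1 Implementation with sklearn
- 3.2 Implementation from Scratch in Python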

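Overview of Logistic Regression

Logistic regression (LR), also called logit regression or log-odds regression, is a linear model used for binary classification.

The Sigmoid function

\[g(z) = \frac{1}{1+e^{-z}} \]

Its graph is shown below. At $z=0$ the function value is 0.5, and as $z$ tends to $-\infty$ and $+\infty$ the function value tends to 0 and 1 respectively.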

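import numpy as np
import matplotlib.pyplot as plt
import warnings
# Silence library warnings for the rest of the article's examples
warnings.filterwarnings('ignore')

# Evaluate the sigmoid on an evenly spaced grid (vectorized, no Python loop)
z = np.linspace(-10, 10, 200)
g = 1 / (1 + np.exp(-z))

plt.plot(z, g)
# Dashed reference line at g(z) = 0.5
plt.plot(z, np.ones(len(z)) / 2, '--', c='black', linewidth=0.8)
plt.xlabel('z')
plt.ylabel('g(z)')
plt.show()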
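The direct formula can overflow inside `np.exp` when $z$ is a large negative number. The following is a minimal sketch of a numerically safer variant. The helper name `stable_sigmoid` is an assumption made for illustration and does not appear in the original article.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that never calls exp on a large positive argument (hypothetical helper)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the direct formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)).
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

values = stable_sigmoid([-1000.0, 0.0, 1000.0])
print(values)
```

Both branches compute the same function $g(z)$; only the order of operations differs.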

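Binomial logistic regression

Linear regression solves regression problems, and its output $z=w^Tx+b$ is a real number. How can the linear regression model be adapted for classification? A naive idea is to apply a transformation to the linear regression output

\[y=f(z) \]

such that

\[y\in\{0,1\} \]

where

\[{z = X\theta} \]

(the bias $b$ is absorbed into $\theta$ through a constant feature equal to 1), so that the value produced by linear regression is mapped onto $\{0,1\}$.

Logistic regression first uses the Sigmoid function to map the linear regression output into the interval $(0,1)$, then assigns values greater than 0.5 (the threshold is adjustable) to class 1 and values less than 0.5 to class 0. $g(z)$ can be interpreted as the probability that a sample is a positive example: the closer it is to 1, the more likely the sample is positive, and samples near the threshold 0.5 are the ones most easily misclassified.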
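The decision rule described above can be sketched in a few lines of NumPy. The toy matrix `X`, the parameter vector `theta`, and the helper name `sigmoid` are illustrative assumptions, not values from the article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: an intercept column of ones plus two features.
X = np.array([[1.0, 0.5, 1.0],
              [1.0, -1.5, 0.3],
              [1.0, 2.0, -0.7],
              [1.0, -0.5, 0.0]])
theta = np.array([0.1, 1.0, -0.5])   # assumed parameters, not fitted

z = X @ theta                        # linear regression output
prob = sigmoid(z)                    # probability of the positive class
threshold = 0.5                      # adjustable decision threshold
labels = (prob >= threshold).astype(int)
print(prob)
print(labels)
```

Raising `threshold` makes the model more conservative about predicting class 1, and lowering it does the opposite.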

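Understanding log-odds

The previous section interpreted $g(z)$ as the probability that a sample is classified as a positive example. The odds of an event are the ratio of the probability that the event occurs to the probability that it does not occur. The quantity

\[\ln\frac{g(z)}{1-g(z)} \]

is therefore called the log-odds, or the logit function. Substituting

\[{z = X\theta} \]

into the sigmoid function and rearranging gives

\[\ln\frac{g(z)}{1-g(z)}=X\theta \]

Logistic regression can thus be viewed as linear regression on the log-odds (logit), which is where the name log-odds regression comes from.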
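A quick numerical check of this identity: applying the logit to the sigmoid output should return the original $z$. The helper names `sigmoid` and `logit` below are chosen for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

for z in [-3.0, -0.5, 0.0, 2.0]:
    p = sigmoid(z)
    odds = p / (1.0 - p)
    print(f"z={z:+.1f}  p={p:.4f}  odds={odds:.4f}  logit(p)={logit(p):+.4f}")
```

In each row the last column reproduces `z` up to rounding, and the odds equal $e^{z}$.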

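Parameter Optimization and Regularization for Logistic Regression

The previous section described the idea behind logistic regression: map the output of linear regression to a categorical variable. Before reading this section, it helps to be familiar with the idea of maximum likelihood estimation.

Optimizing parameters with gradient descent

Determining the loss function by maximum likelihood (log loss)

For each sample point $(x_i,y_i)$, assume that the probabilities of $y_i=1$ and $y_i=0$ are, respectively,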

\[P(y_i=1|x_i,\theta)=g_\theta(x_i) \]

\[P(y_i=0|x_i,\theta)=1-g_\theta(x_i) \]

These two cases can be combined into

\[P(y_i|x_i,\theta)=g_\theta(x_i)^{y_i}(1-g_\theta(x_i))^{1-y_i} \]

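Assume that the sample points are independent and identically distributed and that there are $m$ of them. Maximum likelihood estimation (MLE) then gives the likelihood function

\[L(\theta)=\prod _{i=1}^mP(y_i|x_i,\theta) \]

The likelihood is the probability of obtaining the observed sample, so it should be maximized. Equivalently, the negative log-likelihood is taken as the loss function to be minimized

\[J(\theta) = -\ln L(\theta) = -\sum\limits_{i=1}^{m}(y_i\ln(g_{\theta}(x_i))+ (1-y_i)\ln(1-g_{\theta}(x_i))) \]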
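The loss $J(\theta)$ translates almost directly into NumPy. This is a sketch under stated assumptions: the function name `log_loss`, the toy data, and the clipping constant `eps` (added only to avoid taking the log of 0) are not part of the original article.

```python
import numpy as np

def log_loss(theta, X, y):
    """Negative log-likelihood J(theta) for labels y in {0, 1}."""
    g = 1.0 / (1.0 + np.exp(-(X @ theta)))
    eps = 1e-12                      # assumed guard against log(0)
    g = np.clip(g, eps, 1.0 - eps)
    return -np.sum(y * np.log(g) + (1.0 - y) * np.log(1.0 - g))

# Toy data: an intercept column plus one feature.
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y = np.array([1.0, 0.0, 1.0])

# With theta = 0 every predicted probability is 0.5, so J = 3 * ln 2.
print(log_loss(np.zeros(2), X, y))
# A theta that agrees with the labels gives a smaller loss.
print(log_loss(np.array([0.0, 1.0]), X, y))
```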

Optimizing the loss function

Differentiating the loss function from the previous section gives

\[\frac{\partial J(\theta)}{\partial \theta} = -\sum^m_{i=1}(y_i\frac{\frac{\partial g_{\theta}(x_i)}{\partial \theta}}{g_{\theta}(x_i)}-(1-y_i)\frac{\frac{\partial g_{\theta}(x_i)}{\partial \theta}}{1-g_\theta(x_i)}) \]

For the function

\[g(z)=\frac 1{1+e^{-z}} \]

we have

\[g'(z) = \frac{dg(z)}{dz}=\frac{e^{-z}}{(1+e^{-z})^2}=\frac1{1+e^{-z}}\frac{1+e^{-z}-1}{1+e^{-z}}=g(z)(1-g(z)) \]

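When $z=x_i^T\theta$,

\[\frac{\partial g_{\theta}(x_i)}{\partial \theta}=g_{\theta}(x_i)(1-g_{\theta}(x_i))\frac{\partial z}{\partial \theta}=g_{\theta}(x_i)(1-g_{\theta}(x_i))x_i \]

Substituting this into the derivative of $J(\theta)$, the factors $g_\theta(x_i)$ and $1-g_\theta(x_i)$ cancel, so

\[\frac{\partial J(\theta)}{\partial \theta}=\sum_{i=1}^m x_i(g_\theta(x_i)-y_i)=X^T(g_\theta(X)-y) \]

Gradient descent with learning rate $\alpha$ then uses the update

\[\theta = \theta - \alpha X^T(g_{\theta}(X) - y ) \]

where

\[X=\begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\x_m^T\end{bmatrix},\quad y=\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\y_m\end{bmatrix} \]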
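One way to gain confidence in the gradient formula derived above is to compare it with a finite-difference approximation and then run a few update steps. This is only a sketch: the synthetic data, the seed, the step size `alpha`, and the helper names are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(theta, X, y):
    g = sigmoid(X @ theta)
    return -np.sum(y * np.log(g) + (1 - y) * np.log(1 - g))

def grad(theta, X, y):
    # Analytic gradient from the derivation: X^T (g_theta(X) - y)
    return X.T @ (sigmoid(X @ theta) - y)

rng = np.random.default_rng(0)                      # fixed seed, toy data
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])
y = (rng.random(20) < 0.5).astype(float)
theta = rng.normal(scale=0.1, size=3)

# Central finite differences as an independent check of the formula
h = 1e-6
numeric = np.array([
    (loss(theta + h * e, X, y) - loss(theta - h * e, X, y)) / (2 * h)
    for e in np.eye(3)
])
max_diff = np.max(np.abs(numeric - grad(theta, X, y)))
print(max_diff)

# Batch gradient descent with an assumed learning rate
alpha = 0.01
start = loss(theta, X, y)
for _ in range(200):
    theta = theta - alpha * grad(theta, X, y)
end = loss(theta, X, y)
print(start, end)
```

The two gradients should agree to several decimal places, and the loss after the updates should be lower than at the start.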

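Regularization

Logistic regression can also be regularized. As with linear regression, this is done by adding regularization terms to the loss function.

\[J(\theta) = -\sum\limits_{i=1}^{m}[y_i\ln(g_{\theta}(x_i))+ (1-y_i)\ln(1-g_{\theta}(x_i))] + \frac{1}{2}C||\theta||_2^2+\alpha||\theta||_1 \]

Here $C$ and $\alpha$ are the coefficients of the L2 and L1 penalties. This $\alpha$ is distinct from the learning rate used in the gradient descent update.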
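A minimal sketch of the penalized loss follows, with each term computed separately. The function name `penalized_loss` and the toy data are assumptions. The sketch also penalizes every component of `theta`, whereas in practice the intercept is often left unpenalized.

```python
import numpy as np

def penalized_loss(theta, X, y, C=1.0, alpha=0.0):
    """Log loss plus (C/2) * ||theta||_2^2 + alpha * ||theta||_1."""
    g = 1.0 / (1.0 + np.exp(-(X @ theta)))
    data_term = -np.sum(y * np.log(g) + (1 - y) * np.log(1 - g))
    l2_term = 0.5 * C * np.sum(theta ** 2)
    l1_term = alpha * np.sum(np.abs(theta))
    return data_term + l2_term + l1_term

X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])   # toy data
y = np.array([1.0, 0.0, 1.0])
theta = np.array([1.0, -2.0])

plain = penalized_loss(theta, X, y, C=0.0, alpha=0.0)
both = penalized_loss(theta, X, y, C=2.0, alpha=3.0)
# The penalties add 0.5 * 2 * (1 + 4) + 3 * (1 + 2) = 14 to the data term.
print(both - plain)
```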

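Implementing Logistic Regression

Implementation with sklearn

LogisticRegression in sklearn uses L2 regularization by default, and the penalty parameter changes the type of regularization. The code below trains logistic regression on the breast cancer dataset that ships with sklearn, and the plot it produces shows the model coefficients obtained with different regularization settings. The smaller sklearn's C is, the stronger the regularization and the narrower the range of the coefficients. sklearn's C is the inverse of the regularization strength, so it corresponds to the reciprocal, not the negative, of the C in the formula above.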

# Logistic Regression on the breast cancer dataset
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

# Fit one model per value of C and plot its coefficients
for C, marker in zip([0.001, 1, 100], ['o', '^', 'v']):
    logistic = LogisticRegression(C=C, penalty='l2', max_iter=100).fit(X_train, y_train)
    print('Training accuracy (C={}): {}'.format(C, logistic.score(X_train, y_train)))
    print('Test accuracy (C={}): {}'.format(C, logistic.score(X_test, y_test)))
    plt.plot(logistic.coef_.T, marker, label='C={}'.format(C))
plt.xticks(range(cancer.data.shape[1]), cancer.feature_names, rotation=90)
plt.xlabel('Coefficient Index')
plt.ylabel('Coefficient')
plt.legend()
plt.show()

Training accuracy (C=0.001): 0.9530516431924883
Test accuracy (C=0.001): 0.9440559440559441
Training accuracy (C=1): 0.9460093896713615
Test accuracy (C=1): 0.958041958041958
Training accuracy (C=100): 0.9413145539906104
Test accuracy (C=100): 0.965034965034965
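The effect of C on the coefficients can be reproduced without sklearn. The sketch below assumes an L2-penalized objective of the form $C\sum_i \text{loss}_i + \frac{1}{2}||w||_2^2$, where C multiplies the data term, and minimizes it with plain gradient descent. The function `fit_l2`, the synthetic data, and the omission of an intercept are simplifications for illustration. This is not sklearn's solver.

```python
import numpy as np

def fit_l2(X, y, C, steps=2000):
    """Minimize C * sum(log loss) + 0.5 * ||w||^2 by gradient descent."""
    w = np.zeros(X.shape[1])
    # Step size 1/L, where L is an upper bound on the objective's curvature
    L = C * 0.25 * np.linalg.norm(X, 2) ** 2 + 1.0
    for _ in range(steps):
        g = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= (C * (X.T @ (g - y)) + w) / L
    return w

rng = np.random.default_rng(0)                    # synthetic data, fixed seed
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -1.0])                    # assumed generating weights
y = (rng.random(100) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

norms = [np.linalg.norm(fit_l2(X, y, C)) for C in (0.001, 1.0, 100.0)]
print(norms)
```

The coefficient norm grows with C, matching the observation that a small C means strong regularization.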

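Implementation from Scratch in Python

The code below implements logistic regression from scratch in Python. As before, it first defines a LogisticReg class and initializes the parameters. The fit method does not use PyTorch's automatic differentiation. Instead, it performs gradient descent directly with the formula derived above. The class also defines probability, prediction, and scoring methods for reporting results.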

import numpy as np
import random

import pandas as pd
from sklearn.linear_model import LogisticRegression


class LogisticReg:
    def __init__(self, X, y, batch=10, learning_rate=0.01, epoch=3, threshhold_value=0.5, random_seeds=50):
        self.random_seeds = random_seeds
        # Seed both generators before any random draw so runs are repeatable
        random.seed(self.random_seeds)
        np.random.seed(self.random_seeds)
        # Prepend a column of ones so that theta[0] acts as the intercept
        self.features = np.insert(X, 0, values=np.ones(X.shape[0]), axis=1)
        self.labels = y
        self.batch = batch
        self.learning_rate = learning_rate
        self.epoch = epoch
        self.theta = np.random.normal(0, 0.01, size=(self.features.shape[1], 1))
        self.threshhold_value = threshhold_value

    def sigmoid(self, features):
        # g(X theta): predicted probability of the positive class
        return 1 / (1 + np.exp(-np.dot(features, self.theta)))

    def data_iter(self):
        # Yield shuffled mini-batches of (features, labels)
        range_list = np.arange(self.features.shape[0])
        random.shuffle(range_list)
        for i in range(0, len(range_list), self.batch):
            batch_indices = range_list[i:min(i + self.batch, len(range_list))]
            yield self.features[batch_indices], self.labels[batch_indices].reshape(-1, 1)

    def fit(self):
        for i in range(self.epoch):
            for batch_features, batch_labels in self.data_iter():
                # Mini-batch gradient step: theta -= lr * X^T (g(X) - y)
                self.theta -= self.learning_rate * np.dot(
                    batch_features.T, self.sigmoid(batch_features) - batch_labels)

    def pred_prob(self, pre_X):
        pre_X = np.insert(pre_X, 0, np.ones(pre_X.shape[0]), axis=1)
        return self.sigmoid(pre_X)

    def predict(self, pre_X):
        return self.pred_prob(pre_X) >= self.threshhold_value

    def score(self, pre_y, true_y):
        # Fraction of predictions that match the true labels
        return np.mean(pre_y.flatten() == true_y.flatten())

    def param(self):
        return self.theta.flatten()


def main():
    # Load the data
    data = pd.read_excel('../bankloan.xls')
    print(data.head(3))
    X = data.iloc[:, :-1].values
    y = data.iloc[:, -1].values

    logit = LogisticReg(X, y)
    logit.fit()
    y_pre = logit.predict(X)
    print(f'Positive-class probability, first five rows:\n{logit.pred_prob(X)[:5]}')
    print(f'Accuracy: {logit.score(y_pre, y)}')
    print(f'Parameter vector: {logit.param()}')

    # Compare against sklearn on the same data
    skl_LogReg = LogisticRegression(max_iter=1000).fit(X, y)
    print(f'sklearn accuracy: {skl_LogReg.score(X, y)}')


if __name__ == '__main__':
    main()

	Age	Education	Years employed	Address	Income	Debt ratio	Credit card debt	Other debt	Default
0	41	3	17	12	176	9.3	11.359392	5.008608	1
1	27	1	10	6	31	17.3	1.362202	4.000798	0
2	40	1	15	14	55	5.5	0.856075	2.168925	0

Positive-class probability, first five rows:
[[2.71836750e-19]
 [2.88481260e-09]
 [3.29344356e-61]
 [7.11139973e-53]
 [1.00000000e+00]]
Accuracy: 0.8042857142857143
Parameter vector: [-0.08272148 -1.48932592  0.15983636 -6.65078674 -2.42456947  0.43262372
  4.64195568  3.20970401  0.85901102]
sklearn accuracy: 0.8085714285714286
