Python: what coef_ and intercept_ actually mean in sklearn's Logistic Regression

Reposted from the original article, 2020-01-07 16:03. Tag: python

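The sklearn library makes it easy to implement all kinds of basic machine learning algorithms, such as logistic regression, today's topic. After finishing my implementation, perhaps because I had been buried in code for too long, I had forgotten the basic theory and suddenly could not recall what coef_ and intercept_ actually stand for, that is, which symbols in the formula they correspond to, even though I knew in general that they are model parameters.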

Main text

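We use one of the official sklearn examples to illustrate; its source code can be downloaded from the scikit-learn site. Below is a short excerpt of it with a few modifications: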

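import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Construct some data points: three blobs in 2-D
centers = [[-5, 0], [0, 1.5], [5, -1]]
X, y = make_blobs(n_samples=1000, centers=centers, random_state=40)
# Apply a linear transformation to the points
transformation = [[0.4, 0.2], [-0.4, 1.2]]
X = np.dot(X, transformation)

# The original snippet left multi_class undefined; 'ovr' (one-vs-rest) is assumed
# here because it matches the sigmoid model discussed below. Note that the
# multi_class parameter is deprecated in scikit-learn 1.5 and later.
multi_class = 'ovr'
clf = LogisticRegression(solver='sag', max_iter=100, random_state=42, multi_class=multi_class).fit(X, y)

print(clf.coef_)       # weights, shape (n_classes, n_features)
print(clf.intercept_)  # bias terms, shape (n_classes,)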

Output (shown as a screenshot in the original post):

As you can see, clf.coef_ is a 3×2 matrix of shape (n_classes, n_features), and clf.intercept_ is a vector of length 3 (one entry per class). So what exactly do they mean?
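To make the shapes concrete, here is a minimal sketch that fits the model on a three-class and a two-class toy problem and prints the shapes of both attributes. The variable names (`clf3`, `clf2`), the reduced sample size, the two-class centers, and the use of the default solver are assumptions made for illustration; they are not taken from the original example.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Three-class problem with 2 features (same centers as in the example above)
centers3 = [[-5, 0], [0, 1.5], [5, -1]]
X3, y3 = make_blobs(n_samples=300, centers=centers3, random_state=40)
clf3 = LogisticRegression(max_iter=1000).fit(X3, y3)
print(clf3.coef_.shape)       # (3, 2): one row of weights per class
print(clf3.intercept_.shape)  # (3,): one bias term per class

# Two-class problem (hypothetical centers): a single weight row is stored
X2, y2 = make_blobs(n_samples=300, centers=[[-2, 0], [2, 1]], random_state=40)
clf2 = LogisticRegression(max_iter=1000).fit(X2, y2)
print(clf2.coef_.shape)       # (1, 2)
print(clf2.intercept_.shape)  # (1,)
```

In the binary case only one row is kept, because the parameters of a single sigmoid are enough to describe both classes.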

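Let's review the logistic regression model:

\[h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}\]

where $\theta$ denotes the model parameters. $\theta^T x$ is just a linear expression, and its result is then mapped into the interval (0, 1) by the logistic function.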
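The formula can be written out directly in a few lines. The sketch below uses only the standard library; the function names `sigmoid` and `hypothesis` and the sample values of `theta` and `x` are hypothetical and chosen purely for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x) for two equal-length lists of numbers."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# Hypothetical parameters and input: theta^T x = 0.5*2.0 + (-1.0)*1.0 = 0
theta = [0.5, -1.0]
x = [2.0, 1.0]
print(hypothesis(theta, x))  # 0.5, since sigmoid(0) = 0.5
```

Large positive values of $\theta^T x$ push the output toward 1 and large negative values toward 0, which is why the output can be read as a probability.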

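Knowing this, it becomes clear what clf.coef_ and clf.intercept_ are: together they make up $\theta$, with clf.coef_ holding the weights on the features and clf.intercept_ the bias term, one set per class. Let's verify this: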

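i = 100
x_i = X[i].reshape(1, -1)
# Per-class sigmoid of theta^T x, using the fitted weights and bias
p = 1 / (1 + np.exp(-(np.dot(x_i, clf.coef_.T) + clf.intercept_)))
# In one-vs-rest mode, predict_proba rescales these so each row sums to 1
print(p / p.sum(axis=1, keepdims=True))
# The correct class
print(y[i])
print(clf.predict_proba(x_i))
print(clf.predict_log_proba(x_i))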

Output (shown as a screenshot in the original post):

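The manually computed values agree with predict_proba (in one-vs-rest mode, after the per-class sigmoid outputs are rescaled to sum to 1), which confirms our guess.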
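The same check is easiest to see in the two-class case, where no rescaling is involved and the sigmoid formula applies as written. The following is a minimal sketch under assumed settings: the toy data, the default solver, and the names `p_manual` and `p_sklearn` are illustrative and not part of the original post.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Two-class toy data (hypothetical centers, for illustration only)
X, y = make_blobs(n_samples=200, centers=[[-2, 0], [2, 1]], random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# theta^T x written out with the fitted parameters:
# coef_ has shape (1, 2) and intercept_ has shape (1,) in the binary case
z = X @ clf.coef_.ravel() + clf.intercept_[0]
p_manual = 1.0 / (1.0 + np.exp(-z))

# Column 1 of predict_proba is P(y = clf.classes_[1] | x)
p_sklearn = clf.predict_proba(X)[:, 1]
print(np.allclose(p_manual, p_sklearn))  # True
```

Here `z` is also what `clf.decision_function(X)` returns, so coef_ and intercept_ fully determine the predicted probabilities.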

Original link: https://blog.csdn.net/u010099080/article/details/52933430
