Machine Learning with sklearn (12): Feature Engineering (3), Feature Combination and Crossing (1): Polynomial Features

This article is a repost. 2021-06-19 17:19

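In machine learning, it is often useful to increase a model's complexity by adding nonlinear features of the input data. A simple and widely used approach is polynomial features, which supply the higher-order terms of each feature and the interaction terms between features. This is implemented in PolynomialFeatures: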

>>> import numpy as np
>>> from sklearn.preprocessing import PolynomialFeatures
>>> X = np.arange(6).reshape(3, 2)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> poly = PolynomialFeatures(2)
>>> poly.fit_transform(X)
array([[  1.,   0.,   1.,   0.,   0.,   1.],
       [  1.,   2.,   3.,   4.,   6.,   9.],
       [  1.,   4.,   5.,  16.,  20.,  25.]])

The features of X have been transformed from (X1, X2) to (1, X1, X2, X1^2, X1X2, X2^2).

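In some cases only the interaction terms among features are needed. These can be obtained by setting interaction_only=True:

>>> X = np.arange(9).reshape(3, 3)
>>> X
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> poly = PolynomialFeatures(degree=3, interaction_only=True)
>>> poly.fit_transform(X)
array([[   1.,    0.,    1.,    2.,    0.,    0.,    2.,    0.],
       [   1.,    3.,    4.,    5.,   12.,   15.,   20.,   60.],
       [   1.,    6.,    7.,    8.,   42.,   48.,   56.,  336.]])

The features of X have been transformed from (X1, X2, X3) to (1, X1, X2, X3, X1X2, X1X3, X2X3, X1X2X3).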

Note that polynomial features are used implicitly in kernel methods (for example, sklearn.svm.SVC and sklearn.decomposition.KernelPCA) when a polynomial kernel function is used.
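
To make the "implicit" part concrete, here is a minimal sketch that compares a degree-2 polynomial kernel with an explicit expansion. It assumes gamma=1 and coef0=1, and the column weights are chosen by hand for that specific case; they are an illustration, not something PolynomialFeatures produces on its own.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6, dtype=float).reshape(3, 2)

# Implicit route: K[i, j] = (gamma * <x_i, x_j> + coef0) ** degree
K_implicit = polynomial_kernel(X, degree=2, gamma=1.0, coef0=1.0)

# Explicit route: expand to [1, a, b, a^2, ab, b^2], then rescale columns
# so that a plain dot product reproduces the kernel value.
# These weights are an assumption tied to degree=2, gamma=1, coef0=1.
weights = np.array([1.0, np.sqrt(2), np.sqrt(2), 1.0, np.sqrt(2), 1.0])
Phi = PolynomialFeatures(degree=2).fit_transform(X) * weights
K_explicit = Phi @ Phi.T

kernels_match = np.allclose(K_implicit, K_explicit)
```

The kernel never builds the expanded matrix, which is why kernel methods can use high degrees without the feature count growing.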

For an example of Ridge regression using polynomial features, see Polynomial interpolation.
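
The following is a small sketch of the same idea, not the referenced example itself. The target function, the sample grid, and the alpha value are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical noise-free target: a cubic in one variable.
x = np.linspace(-1.0, 1.0, 30)
y = 0.5 * x**3 - x + 2.0
X = x.reshape(-1, 1)

# Ridge fits its own intercept, so the bias column is left out here.
model = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    Ridge(alpha=1e-6),
)
model.fit(X, y)
r2 = model.score(X, y)  # R^2 on the training data
```

Putting both steps in one pipeline means the expansion learned in fit is reapplied automatically at prediction time.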

class sklearn.preprocessing.PolynomialFeatures(degree=2, *, interaction_only=False, include_bias=True, order='C')

Generate polynomial and interaction features.

Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].
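
As a quick check of the [1, a, b, a^2, ab, b^2] layout described above, the sketch below builds those columns by hand for a small made-up matrix and compares them with the transformer's output.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Made-up sample data: two rows of the form [a, b].
X = np.array([[2.0, 3.0],
              [4.0, 5.0]])
a, b = X[:, 0], X[:, 1]

# Columns in the order [1, a, b, a^2, ab, b^2]
manual = np.column_stack([np.ones_like(a), a, b, a**2, a * b, b**2])
expanded = PolynomialFeatures(degree=2).fit_transform(X)

same = np.allclose(manual, expanded)
```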

Parameters

degree int, default=2: The degree of the polynomial features.
interaction_only bool, default=False: If true, only interaction features are produced: features that are products of at most degree distinct input features (so not x[1] ** 2, x[0] * x[2] ** 3, etc.).
include_bias bool, default=True: If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model).
order {'C', 'F'}, default='C': Order of output array in the dense case. 'F' order is faster to compute, but may slow down subsequent estimators. New in version 0.21.
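
A short sketch of how these parameters change the output, using a 3-feature input chosen only for illustration. The variable names are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(9).reshape(3, 3)  # 3 samples, 3 features

full = PolynomialFeatures(degree=2).fit_transform(X)
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
inter = PolynomialFeatures(degree=2, interaction_only=True).fit_transform(X)
fortran = PolynomialFeatures(degree=2, order="F").fit_transform(X)

# Column counts: 1 bias + 3 linear + 6 quadratic terms for the full case;
# dropping the bias removes one column; interaction_only drops the 3 squares.
n_cols = (full.shape[1], no_bias.shape[1], inter.shape[1])

# order only affects memory layout, so the values are unchanged.
same_values = np.array_equal(full, fortran)
```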

Attributes

powers_ ndarray of shape (n_output_features, n_input_features): powers_[i, j] is the exponent of the jth input in the ith output.
n_input_features_ int: The total number of input features.
n_output_features_ int: The total number of polynomial output features. The number of output features is computed by iterating over all suitably sized combinations of input features.
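
One way to read powers_ is to rebuild the output from it. The sketch below does this with NumPy broadcasting; it is an illustration of what the attribute means, not how the library computes the transform internally.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.arange(6).reshape(3, 2)
poly = PolynomialFeatures(degree=2).fit(X)

powers = poly.powers_  # shape (n_output_features, n_input_features)

# Output column i is the product over j of X[:, j] ** powers[i, j].
rebuilt = np.prod(X[:, None, :] ** powers[None, :, :], axis=2)

matches = np.array_equal(rebuilt, poly.transform(X))
```

Each row of powers here is one output feature: [0, 0] is the bias column, [1, 1] is the interaction term ab, and [0, 2] is b^2.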

Methods

`fit`(X[, y])	Compute number of output features.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_feature_names`([input_features])	Return feature names for output features
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X)	Transform data to polynomial features
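
A brief usage sketch of these methods follows. It assumes that later scikit-learn releases expose the feature names through get_feature_names_out rather than get_feature_names, so the code checks for both; the data is made up.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_train = np.arange(6).reshape(3, 2)
X_new = np.array([[1, 2]])

poly = PolynomialFeatures(degree=2)
poly.fit(X_train)             # learns the input width and output layout
out = poly.transform(X_new)   # [1, a, b, a^2, ab, b^2] for a=1, b=2

# The name of this method depends on the installed version, so try both.
if hasattr(poly, "get_feature_names_out"):
    names = list(poly.get_feature_names_out())
else:
    names = list(poly.get_feature_names())

poly.set_params(degree=3)     # takes effect on the next call to fit
degree_now = poly.get_params()["degree"]
```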

Examples

>>> import numpy as np
>>> from sklearn.preprocessing import PolynomialFeatures
>>> X = np.arange(6).reshape(3, 2)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> poly = PolynomialFeatures(2)
>>> poly.fit_transform(X)
array([[ 1.,  0.,  1.,  0.,  0.,  1.],
       [ 1.,  2.,  3.,  4.,  6.,  9.],
       [ 1.,  4.,  5., 16., 20., 25.]])
>>> poly = PolynomialFeatures(interaction_only=True)
>>> poly.fit_transform(X)
array([[ 1.,  0.,  1.,  0.],
       [ 1.,  2.,  3.,  6.],
       [ 1.,  4.,  5., 20.]])
