The make_blobs() function in the scikit-learn library

Reposted article, 2019-12-19 16:23. Tags: python, scikit-learn

sklearn.datasets.make_blobs() creates multi-class, single-label datasets. It generates one isotropic, normally distributed cluster of points (a "blob") per class.

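sklearn.datasets.make_blobs(
    n_samples=100,             # total number of samples to generate
    n_features=2,              # number of features per sample
    centers=3,                 # number of centers (classes) to generate, or the fixed center locations
    cluster_std=1.0,           # standard deviation of each cluster
    center_box=(-10.0, 10.0),  # bounding box for each cluster center when centers are generated at random
    shuffle=True,              # whether to shuffle the samples
    random_state=None)         # seed for the random number generator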

Parameter descriptions (from the English documentation):

n_samples: int, optional (default=100)
The total number of points equally divided among clusters.
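
As a small illustration of "equally divided", the sketch below asks for a sample count that is not a multiple of the number of centers and counts the labels. The specific numbers (10 samples, 3 centers) are chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs

# 10 samples cannot be split evenly over 3 centers,
# so the cluster sizes end up differing by at most one.
X, y = make_blobs(n_samples=10, centers=3, random_state=0)

counts = np.bincount(y)  # number of samples carrying each label
print(counts)
```

The counts always sum to n_samples; only the per-cluster split varies.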

n_features: int, optional (default=2)
The number of features for each sample.

centers: int or array of shape [n_centers, n_features], optional (default=3)
The number of centers to generate, or the fixed center locations.
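
A minimal sketch of the second form, passing fixed center locations as an array. The coordinates are arbitrary illustrative values; with an array, the number of features follows from its second dimension.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Three hand-picked 2-D centers (illustrative values)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])

X, y = make_blobs(n_samples=300, centers=centers, cluster_std=0.5, random_state=0)

# The empirical mean of each cluster should sit close to its requested center
means = np.array([X[y == k].mean(axis=0) for k in range(len(centers))])
print(np.round(means, 2))
```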
 
cluster_std: float or sequence of floats, optional (default=1.0)
The standard deviation of the clusters.
For example, to generate two classes where one has a larger variance than the other, set cluster_std to [1.0, 3.0].
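
The cluster_std=[1.0, 3.0] case can be sketched as follows. The sample count is an assumption, picked large enough that the measured spread is stable.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Two clusters: the first tight (std 1.0), the second wide (std 3.0)
X, y = make_blobs(n_samples=2000, centers=2, cluster_std=[1.0, 3.0], random_state=0)

# Average per-feature standard deviation inside each cluster
spread = [X[y == k].std(axis=0).mean() for k in (0, 1)]
print(spread)
```

The two measured values should come out near 1.0 and 3.0, in the same order as the list passed to cluster_std.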


center_box: pair of floats (min, max), optional (default=(-10.0, 10.0))
The bounding box for each cluster center when centers are generated at random.
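
To see that center_box constrains the centers rather than the samples, the sketch below uses a very small cluster_std so that each cluster mean is a good stand-in for its randomly drawn center. The box (0.0, 1.0) and the tiny standard deviation are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Centers are drawn at random inside the box [0, 1] along every feature
X, y = make_blobs(n_samples=300, centers=3, cluster_std=0.01,
                  center_box=(0.0, 1.0), random_state=0)

# With such tight clusters, each mean approximates the drawn center
means = np.array([X[y == k].mean(axis=0) for k in range(3)])
print(np.round(means, 3))
```

With a larger cluster_std, individual samples can fall well outside the box, because the box only limits where the centers are placed.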


shuffle: boolean, optional (default=True)
Shuffle the samples.
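
A quick way to see what shuffle does is to turn it off. The sketch below assumes the behavior of the scikit-learn versions I am aware of, in which unshuffled samples come out grouped cluster by cluster.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Without shuffling, samples are emitted one cluster after another
X, y = make_blobs(n_samples=9, centers=3, shuffle=False, random_state=0)
print(y)

# Labels never decrease when the output is grouped by cluster
is_grouped = bool((np.diff(y) >= 0).all())
```

Leaving shuffle=True is usually what you want before splitting into training and test sets, since a grouped ordering would put whole clusters into one split.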


random_state: int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
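
The effect of an integer random_state can be checked directly: the same seed reproduces the same data, while a different seed gives different data. The seed values below are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_blobs

X1, y1 = make_blobs(n_samples=50, random_state=42)
X2, y2 = make_blobs(n_samples=50, random_state=42)  # same seed
X3, y3 = make_blobs(n_samples=50, random_state=7)   # different seed

same = np.array_equal(X1, X2) and np.array_equal(y1, y2)
different = not np.array_equal(X1, X3)
print(same, different)
```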


Return values

X : array of shape [n_samples, n_features]
The generated samples, one row per sample.

y : array of shape [n_samples]
The integer labels for cluster membership of each sample, running from 0 to n_centers - 1.
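
The shapes of the two return values can be inspected with a short sketch like this one. The sizes (120 samples, 4 features, 3 centers) are illustrative.

```python
import numpy as np
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=120, n_features=4, centers=3, random_state=0)

print(X.shape)  # (n_samples, n_features)
print(y.shape)  # (n_samples,)

labels = np.unique(y)  # the distinct integer labels, one per center
print(labels)
```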

Example:

# Import the required modules
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Create a simulated clustering dataset
X, y = make_blobs(n_samples=150, n_features=2, centers=3, cluster_std=0.5, shuffle=True, random_state=0)

# Draw a scatter plot
plt.figure('百里希文', facecolor='lightyellow')
plt.scatter(X[:, 0], X[:, 1], c='w', edgecolor='k', marker='o', s=50)
plt.grid()
plt.show()
