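qcut picks the bin edges from the sample quantiles of the values, so the bins are generally not equal in width.
Each bin ends up holding the same number of values (or nearly the same, when the count does not divide evenly or there are ties).
pd.qcut(factors, 3).value_counts() # count the values in each bin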
(-2.113, -0.158] 3
(-0.158, 1.525] 3
(1.525, 2.154] 3
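To make the equal-frequency behaviour easy to reproduce, here is a minimal sketch. The array `sample` is made up for illustration (it is not the `factors` array used above) and contains one large outlier, so the bins come out very unequal in width while the counts stay equal.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not the `factors` array above):
# eight small values and one large outlier.
sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100], dtype=float)

# Ask for 3 quantile-based bins; the inner edges sit at the 1/3 and 2/3 quantiles.
quantile_bins = pd.qcut(sample, 3)

# Count how many values landed in each bin.
qcut_counts = quantile_bins.value_counts()
print(qcut_counts)
```

Each of the three bins should receive three values, even though the last bin stretches all the way to 100.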
The retbins parameter
pd.qcut(factors, 3, retbins=True) # returns the bin for each value, plus the bins themselves, i.e. the array of edge values
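A small sketch of what retbins=True gives back, again on made-up data (`sample` and `new_values` are assumptions for illustration). One plausible use of the returned edges is to bin later data with the same boundaries through pd.cut.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the `factors` array above.
sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100], dtype=float)

# With retbins=True the call returns a tuple: (binned values, edge array).
labels, edges = pd.qcut(sample, 3, retbins=True)
print(edges)  # 4 edges delimit 3 bins

# Reuse the same edges on new (hypothetical) data.
# include_lowest=True keeps a value equal to the first edge inside the first bin.
new_values = np.array([2.0, 5.0, 50.0])
reused = pd.cut(new_values, bins=edges, include_lowest=True)
print(reused.codes)  # integer position of the bin each new value fell into
```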
pd.cut()
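cut picks the bin edges from the range of the values themselves, so every bin has the same width; the number of values per bin can differ.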
pd.cut(factors, 3) # returns the bin for each value
[(0.732, 2.154], (-0.69, 0.732], (0.732, 2.154], (-0.69, 0.732], (-2.117, -0.69], (0.732, 2.154], (-0.69, 0.732], (-0.69, 0.732], (0.732, 2.154]]
Categories (3, interval[float64]): [(-2.117, -0.69] < (-0.69, 0.732] < (0.732, 2.154]]
pd.cut(factors, bins=[-3,-2,-1,0,1,2,3]) # explicit bin edges instead of a bin count
[(2, 3], (0, 1], (1, 2], (-1, 0], (-3, -2], (2, 3], (-1, 0], (0, 1], (1, 2]]
Categories (6, interval[int64]): [(-3, -2] < (-2, -1] < (-1, 0] < (0, 1] < (1, 2] < (2, 3]]
pd.cut(factors, 3).value_counts() # count the values in each bin
(-2.117, -0.69] 1
(-0.69, 0.732] 4
(0.732, 2.154] 4
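For contrast with qcut, a minimal sketch of both forms of pd.cut on made-up data (`sample` and the explicit edge list are assumptions for illustration, not the `factors` array above). With one large outlier, equal-width bins leave almost everything in the first bin.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: eight small values and one large outlier.
sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100], dtype=float)

# Three equal-width bins spanning min..max of the data.
width_bins = pd.cut(sample, 3)
cut_counts = width_bins.value_counts()
print(cut_counts)  # counts are uneven; empty bins are still listed

# Explicit edges (chosen arbitrarily here): intervals are right-closed by default.
explicit_bins = pd.cut(sample, bins=[0, 5, 10, 100])
explicit_counts = explicit_bins.value_counts()
print(explicit_counts)
```

On this data the equal-width call puts eight values in the first bin, none in the second and one in the third, whereas qcut on the same data would give three per bin.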
Source: https://blog.csdn.net/starter_____/article/details/79327997