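qcut chooses bin edges from the sample quantiles of the values, so the bins are generally not equally wide.
Each bin holds the same number of values (or as close to that as ties and divisibility allow).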
pd.qcut(factors, 3).value_counts() # count the values that fall into each bin
(-2.113, -0.158] 3
(-0.158, 1.525] 3
(1.525, 2.154] 3
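The equal-count behaviour can be checked with a small self-contained sketch. The original factors array is not shown here, so the values array below is hypothetical sample data chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (the original `factors` array is not shown).
values = np.array([2.1, 0.5, 1.6, -0.4, -2.1, 2.0, -0.1, 0.3, 1.9])

# Split into 3 quantile-based bins: edges sit at the 1/3 and 2/3 quantiles.
binned = pd.qcut(values, 3)

# Count how many values landed in each bin, listed in bin order.
counts = binned.value_counts()
print(counts)
```

With 9 distinct values and 3 bins, each bin receives exactly 3 values, while the bin widths differ.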
The retbins parameter
pd.qcut(factors, 3, retbins=True) # returns the bin assigned to each value, plus the bins array holding the edge values
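A minimal sketch of how the extra return value might be used, again with hypothetical sample data: with retbins=True the call returns a tuple, and the edges can be passed to pd.cut to bin other data on the same boundaries.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (the original `factors` array is not shown).
values = np.array([2.1, 0.5, 1.6, -0.4, -2.1, 2.0, -0.1, 0.3, 1.9])

# retbins=True makes qcut return (binned values, array of bin edges).
binned, edges = pd.qcut(values, 3, retbins=True)
print(edges)  # 4 edges delimit 3 bins

# Reuse the same edges on another array (here the same data, for comparison).
# include_lowest=True keeps the minimum value inside the first bin.
rebinned = pd.cut(values, bins=edges, include_lowest=True)
print(rebinned.codes)
```

Values that fall outside the stored edges would come back as NaN from pd.cut, which is worth keeping in mind when the edges are applied to new data.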
pd.cut()
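cut chooses bin edges from the range of the values themselves, so every bin has the same width (apart from a small extension of the range so that the minimum value falls inside the first bin).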
pd.cut(factors, 3) # returns the bin assigned to each value
[(0.732, 2.154], (-0.69, 0.732], (0.732, 2.154], (-0.69, 0.732], (-2.117, -0.69], (0.732, 2.154], (-0.69, 0.732], (-0.69, 0.732], (0.732, 2.154]]
Categories (3, interval[float64]): [(-2.117, -0.69] < (-0.69, 0.732] < (0.732, 2.154]]
pd.cut(factors, bins=[-3,-2,-1,0,1,2,3])
[(2, 3], (0, 1], (1, 2], (-1, 0], (-3, -2], (2, 3], (-1, 0], (0, 1], (1, 2]]
Categories (6, interval[int64]): [(-3, -2] < (-2, -1] < (-1, 0] < (0, 1] < (1, 2] < (2, 3]]
pd.cut(factors, 3).value_counts() # count the values that fall into each bin
(-2.117, -0.69] 1
(-0.69, 0.732] 4
(0.732, 2.154] 4
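To see the contrast with qcut, the following sketch bins hypothetical sample data (not the original factors array) with pd.cut and inspects the bin widths and counts.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (the original `factors` array is not shown).
values = np.array([2.1, 0.5, 1.6, -0.4, -2.1, 2.0, -0.1, 0.3, 1.9])

# Split the range [min, max] into 3 bins of equal width.
binned, edges = pd.cut(values, 3, retbins=True)

# Width of each bin; the first is marginally wider because pandas
# extends the lowest edge so the minimum value is included.
widths = np.diff(edges)
print(widths)

# Counts per bin are generally unequal, unlike with qcut.
counts = binned.value_counts()
print(counts)
```

The bins have (nearly) identical widths but hold different numbers of values, which is the opposite trade-off from qcut.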
Shared from: https://blog.csdn.net/starter_____/article/details/79327997