pd.qcut() and pd.cut()


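qcut chooses its bin edges from the sample quantiles of the values, so the bins are spaced evenly by frequency rather than by width.

Each bin therefore contains the same number of values (or as close to that as the data allows).
pd.qcut(factors, 3).value_counts()  # count how many values fall in each bin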
(-2.113, -0.158]    3
(-0.158, 1.525]     3
(1.525, 2.154]      3
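
The original never shows how `factors` is defined, so the following is only a minimal sketch with a hypothetical nine-value array standing in for it. It illustrates that qcut with 3 bins places the same number of values in each bin.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `factors` (the source does not show its definition)
factors = np.array([2.1, 0.5, 1.6, -0.3, -2.1, 2.0, -0.2, 0.7, 1.9])

# Three quantile-based bins: edges are placed at the 1/3 and 2/3 sample quantiles
quantile_bins = pd.qcut(factors, 3)

# One count per bin, listed in bin order
counts = quantile_bins.value_counts()
print(counts)
```

With nine distinct values and three bins, each bin ends up holding three values, matching the pattern in the output above.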

The retbins parameter
pd.qcut(factors, 3, retbins=True)  # returns the bin assigned to each value, plus bins, the array of bin edge values
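
Because retbins=True makes the call return two objects, it is convenient to unpack them. A minimal sketch, again assuming a hypothetical `factors` array (the names `assigned` and `edges` are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `factors`
factors = np.array([2.1, 0.5, 1.6, -0.3, -2.1, 2.0, -0.2, 0.7, 1.9])

# retbins=True returns a tuple: (bin assigned to each value, array of bin edges)
assigned, edges = pd.qcut(factors, 3, retbins=True)

print(assigned)  # one interval per input value
print(edges)     # q + 1 edges for q bins, in increasing order
```

The edges array can then be passed as the bins argument of pd.cut to apply the same boundaries to another data set.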
pd.cut()

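cut chooses its bin edges from the values themselves: it splits the range between the minimum and the maximum into evenly spaced intervals, so every bin has the same width.
pd.cut(factors, 3)  # returns the bin assigned to each value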
[(0.732, 2.154], (-0.69, 0.732], (0.732, 2.154], (-0.69, 0.732], (-2.117, -0.69], (0.732, 2.154], (-0.69, 0.732], (-0.69, 0.732], (0.732, 2.154]]
Categories (3, interval[float64]): [(-2.117, -0.69] < (-0.69, 0.732] < (0.732, 2.154]]

pd.cut(factors, bins=[-3,-2,-1,0,1,2,3])
[(2, 3], (0, 1], (1, 2], (-1, 0], (-3, -2], (2, 3], (-1, 0], (0, 1], (1, 2]]
Categories (6, interval[int64]): [(-3, -2] < (-2, -1] < (-1, 0] < (0, 1] < (1, 2] < (2, 3]]
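
When bins is given as a list, cut uses those edges as they are, and the intervals are closed on the right by default. The sketch below, using a hypothetical `factors` array, shows that a bin with no values in it is still kept as a category with a count of 0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `factors`
factors = np.array([2.1, 0.5, 1.6, -0.3, -2.1, 2.0, -0.2, 0.7, 1.9])

# Explicit edges: six unit-width bins, each closed on the right
explicit_bins = pd.cut(factors, bins=[-3, -2, -1, 0, 1, 2, 3])

# Counts are listed in bin order and include empty bins
counts = explicit_bins.value_counts()
print(counts)
```

Here the value 2.0 lands in (1, 2] rather than (2, 3], and (-2, -1] is reported with a count of 0.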

pd.cut(factors, 3).value_counts()  # count how many values fall in each bin
(-2.117, -0.69]    1
(-0.69, 0.732]     4
(0.732, 2.154]     4
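
The difference between the two functions is easiest to see on skewed data. The following is a small sketch on a made-up array with one outlier (the data and variable names are assumptions for illustration, not from the original post):

```python
import numpy as np
import pandas as pd

# Made-up skewed sample: eight small values and one large outlier
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100])

# cut: three equal-width bins over the range of the data
width_counts = pd.cut(data, 3).value_counts()

# qcut: three bins with edges at the 1/3 and 2/3 sample quantiles
quantile_counts = pd.qcut(data, 3).value_counts()

print(width_counts)
print(quantile_counts)
```

With cut, the outlier stretches the range, so almost all values fall into the first bin and the middle bin stays empty. With qcut, every bin still receives three values.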

Shared from: https://blog.csdn.net/starter_____/article/details/79327997

