1. Group size and group sorting

The size() method of a GroupBy object returns the number of rows in each group:

>>> df.groupby(['class']).size()
class
A    3
B    4
C    2
dtype: int64

>>> df.groupby(['class','sex']).size()
class  sex
A      female    1
       male      2
B      female    2
       male      2
C      male      2
dtype: int64
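
The article never shows how df is built. As a minimal sketch, the frame below is reconstructed from the rows printed in the per-group listings in section 2, so the construction itself is an assumption rather than the author's original code. It reproduces both size() calls.

```python
import pandas as pd

# Assumed construction: rows recovered from the group printouts in this article.
df = pd.DataFrame({
    'class': ['A', 'A', 'B', 'C', 'B', 'B', 'C', 'A', 'B'],
    'sex': ['male', 'female', 'female', 'male', 'female',
            'male', 'male', 'male', 'male'],
    'score_math': [95, 96, 85, 93, 84, 88, 59, 88, 89],
    'score_music': [79, 90, 85, 92, 90, 70, 89, 86, 74],
})

# One key: a Series indexed by class.
sizes = df.groupby('class').size()

# Two keys: a Series with a (class, sex) MultiIndex.
# Combinations that never occur, such as ('C', 'female'), are absent.
pair_sizes = df.groupby(['class', 'sex']).size()

print(sizes)
print(pair_sizes)
```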

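By default, the group keys of the aggregated result are sorted, which can slow things down. When the dataset is large, you can pass sort=False to skip the sorting and speed up grouping: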

>>> # numeric_only=True skips the non-numeric 'sex' column (needed on pandas 2.0+)
>>> df.groupby('class', sort=False).mean(numeric_only=True)
       score_math  score_music
class
A            93.0        85.00
B            86.5        79.75
C            76.0        90.50
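
In the article's data the classes already first appear in the order A, B, C, so the output above looks the same with or without sorting. The sketch below uses a small hypothetical frame (the names demo, key and val are made up for illustration) to show that sort=False keeps groups in order of first appearance.

```python
import pandas as pd

# Hypothetical data: keys first appear in the order b, a, c.
demo = pd.DataFrame({'key': ['b', 'a', 'c', 'a', 'b'],
                     'val': [1, 2, 3, 4, 5]})

# Default (sort=True): group keys come back sorted.
sorted_keys = list(demo.groupby('key')['val'].sum().index)

# sort=False: group keys keep their order of first appearance.
unsorted_keys = list(demo.groupby('key', sort=False)['val'].sum().index)

print(sorted_keys)    # ['a', 'b', 'c']
print(unsorted_keys)  # ['b', 'a', 'c']
```

Only the order of the groups changes; the aggregated values for each key are identical in both cases.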

2. Iterating over groups

A GroupBy object is iterable, so you can loop over it to get each group's name and data.

For example, to get the name and data of each class group (when grouping by several keys, the group name becomes a tuple):

>>> grouped = df.groupby('class')
>>> for name,group in grouped:
...     print(name)
...     print(group)
...     print('-'*40)

A
  class     sex  score_math  score_music
0     A    male          95           79
1     A  female          96           90
7     A    male          88           86
----------------------------------------
B
  class     sex  score_math  score_music
2     B  female          85           85
4     B  female          84           90
5     B    male          88           70
8     B    male          89           74
----------------------------------------
C
  class   sex  score_math  score_music
3     C  male          93           92
6     C  male          59           89
----------------------------------------
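
The multi-key case mentioned above can be sketched as follows. The frame is reconstructed from the rows printed in this section (an assumed construction), and the list called names is only there to make the result easy to inspect.

```python
import pandas as pd

# Assumed construction: rows recovered from the group printouts above.
df = pd.DataFrame({
    'class': ['A', 'A', 'B', 'C', 'B', 'B', 'C', 'A', 'B'],
    'sex': ['male', 'female', 'female', 'male', 'female',
            'male', 'male', 'male', 'male'],
    'score_math': [95, 96, 85, 93, 84, 88, 59, 88, 89],
    'score_music': [79, 90, 85, 92, 90, 70, 89, 86, 74],
})

names = []
for name, group in df.groupby(['class', 'sex']):
    # With two grouping keys, name is a (class, sex) tuple.
    names.append(name)
    print(name, len(group))
```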

3. Selecting specific groups or columns

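(1) Pack the group names and their data into a dictionary, so that a specific group's data can be looked up later.

Note that a GroupBy object cannot be passed to dict() directly. It must first be converted into a list of (name, data) tuples, which dict() can then turn into a dictionary.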

>>> grouped = df.groupby('class')
>>> pieces = dict(list(grouped))

>>> len(pieces)
3

>>> pieces.keys()
dict_keys(['A', 'B', 'C'])

>>> pieces['A']
  class     sex  score_math  score_music
0     A    male          95           79
1     A  female          96           90
7     A    male          88           86

(2) The get_group() method of the GroupBy object achieves the same result and is more direct:

>>> grouped.get_group('A')
  class     sex  score_math  score_music
0     A    male          95           79
1     A  female          96           90
7     A    male          88           86
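
To check that approaches (1) and (2) return the same rows, here is a small sketch on a frame reconstructed from the printed output (an assumed construction). The variable names from_dict and from_get_group are illustrative.

```python
import pandas as pd

# Assumed construction: rows recovered from the group printouts in this article.
df = pd.DataFrame({
    'class': ['A', 'A', 'B', 'C', 'B', 'B', 'C', 'A', 'B'],
    'sex': ['male', 'female', 'female', 'male', 'female',
            'male', 'male', 'male', 'male'],
    'score_math': [95, 96, 85, 93, 84, 88, 59, 88, 89],
    'score_music': [79, 90, 85, 92, 90, 70, 89, 86, 74],
})

grouped = df.groupby('class')

# (1) Materialise every group into a dict keyed by group name.
pieces = dict(list(grouped))
from_dict = pieces['A']

# (2) Fetch a single group without building the other groups.
from_get_group = grouped.get_group('A')

# Both selections keep the original row labels of class A.
print(list(from_dict.index))
print(list(from_get_group.index))
```

The dictionary is convenient when several groups will be looked up repeatedly, while get_group() avoids materialising groups that are never used.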

(3) To run the GroupBy operation on specific columns only, select those columns right after groupby():

>>> df.groupby('class')['score_math'].mean()
class
A    93.0
B    86.5
C    76.0
Name: score_math, dtype: float64

>>> # Several columns must be passed as a list (double brackets)
>>> df.groupby('class')[['score_math','score_music']].mean()
       score_math  score_music
class
A            93.0        85.00
B            86.5        79.75
C            76.0        90.50
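
The shape of the result depends on how the columns are selected: a single label gives a Series, while a list of labels gives a DataFrame. A minimal sketch, again on a frame reconstructed from the printed rows (an assumed construction):

```python
import pandas as pd

# Assumed construction: rows recovered from the group printouts in this article.
df = pd.DataFrame({
    'class': ['A', 'A', 'B', 'C', 'B', 'B', 'C', 'A', 'B'],
    'sex': ['male', 'female', 'female', 'male', 'female',
            'male', 'male', 'male', 'male'],
    'score_math': [95, 96, 85, 93, 84, 88, 59, 88, 89],
    'score_music': [79, 90, 85, 92, 90, 70, 89, 86, 74],
})

# A single column label: the aggregate is a Series.
math_mean = df.groupby('class')['score_math'].mean()

# A list of column labels: the aggregate is a DataFrame.
both_mean = df.groupby('class')[['score_math', 'score_music']].mean()

print(type(math_mean).__name__)
print(type(both_mean).__name__)
print(both_mean)
```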
