Series indexing works much like NumPy array indexing, except that the index values of a Series are not limited to integers. For example:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

obj = Series(np.arange(4), index=['a', 'b', 'c', 'd'])
obj
Out[10]:
a    0
b    1
c    2
d    3
dtype: int32
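obj['b']
Out[11]: 1

# An integer key on a string-labelled index is treated as a position here;
# newer pandas versions deprecate this fallback in favor of obj.iloc[1]
obj[1]
Out[12]: 1

obj[2:4]
Out[13]:
c    2
d    3
dtype: int32

obj[['b', 'a', 'd']]
Out[14]:
b    1
a    0
d    3
dtype: int32

obj[[1, 3]]
Out[15]:
b    1
d    3
dtype: int32

obj[obj < 2]
Out[17]:
a    0
b    1
dtype: int32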
# Slicing with labels differs from ordinary Python slicing
# because the end point is inclusive
obj['b':'c'] = 5
obj
Out[24]:
a    0
b    5
c    5
d    3
dtype: int32
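The difference between label-based and position-based access on a Series can be spelled out with the explicit .loc and .iloc accessors. The following is a minimal sketch, not part of the original session; the variable names (by_label, label_slice, and so on) are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4), index=['a', 'b', 'c', 'd'])

# .loc always interprets keys as index labels.
by_label = obj.loc['b']

# .iloc always interprets keys as integer positions.
by_position = obj.iloc[1]

# A label slice includes its end point; a positional slice does not.
label_slice = obj.loc['b':'c']    # rows labelled 'b' and 'c'
position_slice = obj.iloc[1:2]    # only the row at position 1

print(by_label, by_position)
print(list(label_slice.index), list(position_slice.index))
```

Using .loc and .iloc removes the ambiguity of a bare integer key such as obj[1], which is why they are usually the safer choice when the index might contain integers.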
Indexing a DataFrame with [] retrieves one or more columns:
Selecting columns: simply specify the column name, or a list of column names.
data = DataFrame(np.arange(16).reshape((4, 4)),
                 index=['Ohio', 'Colorado', 'Utah', 'New York'],
                 columns=['one', 'two', 'three', 'four'])
data
Out[26]:
          one  two  three  four
Ohio        0    1      2     3
Colorado    4    5      6     7
Utah        8    9     10    11
New York   12   13     14    15

data['two']
Out[27]:
Ohio         1
Colorado     5
Utah         9
New York    13
Name: two, dtype: int32

data[['three', 'one']]
Out[28]:
          three  one
Ohio          2    0
Colorado      6    4
Utah         10    8
New York     14   12
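One detail worth noting about column selection is the type of the result: a single name gives a Series, while a list of names gives a DataFrame, even when the list has only one entry. A small sketch, assuming the same data as above (the names single, as_frame and reordered are illustrative, not from the original):

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    np.arange(16).reshape((4, 4)),
    index=['Ohio', 'Colorado', 'Utah', 'New York'],
    columns=['one', 'two', 'three', 'four'],
)

single = data['two']                  # a single name returns a Series
as_frame = data[['two']]              # a one-element list returns a DataFrame
reordered = data[['three', 'one']]    # the list order sets the column order

print(type(single).__name__, type(as_frame).__name__)
print(list(reordered.columns))
```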
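Selecting rows:
(1) with a slice or a boolean array;
(2) with a boolean DataFrame (this selects or assigns individual values rather than whole rows);
(3) by label on the rows, using the indexing field ix, which selects a subset of rows and columns from a DataFrame using NumPy-style notation together with axis labels. Note that ix was deprecated in pandas 0.20 and removed in pandas 1.0; on current versions use loc (labels) and iloc (integer positions) instead.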
# Select rows with a slice
data[:2]
Out[29]:
          one  two  three  four
Ohio        0    1      2     3
Colorado    4    5      6     7

# Select rows with a boolean array
data[data['three'] > 5]
Out[30]:
          one  two  three  four
Colorado    4    5      6     7
Utah        8    9     10    11
New York   12   13     14    15

# Index with a boolean DataFrame
data < 5
Out[31]:
            one    two  three   four
Ohio       True   True   True   True
Colorado   True  False  False  False
Utah      False  False  False  False
New York  False  False  False  False

# Set every value of data that is less than 5 to 0
data[data < 5] = 0
data
Out[33]:
          one  two  three  four
Ohio        0    0      0     0
Colorado    0    5      6     7
Utah        8    9     10    11
New York   12   13     14    15

# Label indexing on the rows with the indexing field ix
# (ix was removed in pandas 1.0; use loc for labels and iloc for positions there)
data.ix['Colorado', ['two', 'three']]
Out[34]:
two      5
three    6
Name: Colorado, dtype: int32

data.ix[['Colorado', 'Utah'], [3, 0, 1]]
Out[35]:
          four  one  two
Colorado     7    0    5
Utah        11    8    9

# Selects the row at integer position 2, which is the row Utah
data.ix[2]
Out[36]:
one       8
two       9
three    10
four     11
Name: Utah, dtype: int32

data.ix[:'Utah', 'two']
Out[37]:
Ohio        0
Colorado    5
Utah        9
Name: two, dtype: int32

# Select the rows where data.three > 5, and the first three columns
data.ix[data.three > 5, :3]
Out[38]:
          one  two  three
Colorado    0    5      6
Utah        8    9     10
New York   12   13     14
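The two kinds of boolean indexing in the session behave differently: a boolean Series filters whole rows, while a boolean DataFrame acts cell by cell. The sketch below illustrates this on a fresh copy of the data. It uses DataFrame.mask as a non-mutating alternative to the in-place assignment data[data < 5] = 0; that substitution is a choice made here, not something the original session does.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    np.arange(16).reshape((4, 4)),
    index=['Ohio', 'Colorado', 'Utah', 'New York'],
    columns=['one', 'two', 'three', 'four'],
)

# Boolean Series: one True/False per row, so whole rows are kept or dropped.
row_mask = data['three'] > 5
selected_rows = data[row_mask]

# Boolean DataFrame: one True/False per cell.
cell_mask = data < 5

# mask() replaces the cells where the condition is True and returns a new
# DataFrame, leaving `data` itself unchanged.
zeroed = data.mask(cell_mask, 0)

print(list(selected_rows.index))
print(zeroed)
```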
Indexing options for DataFrame
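# Select a single column or a group of columns from the DataFrame
obj[val]
# Select a single row or a group of rows
obj.ix[val]
# Select a single column or a subset of columns
obj.ix[:, val]
# Select rows and columns at the same time
obj.ix[val1, val2]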
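Since ix is not available on pandas 1.0 and later, the options above can be rewritten with loc and iloc. The following is a rough sketch of one possible translation, run on the original (unmodified) data; the variable names are illustrative, and the mixed label/position case is handled here by turning the column positions into labels with data.columns, which is only one of several ways to do it.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    np.arange(16).reshape((4, 4)),
    index=['Ohio', 'Colorado', 'Utah', 'New York'],
    columns=['one', 'two', 'three', 'four'],
)

# obj.ix[val] with a row label, plus a list of column labels
by_label = data.loc['Colorado', ['two', 'three']]

# obj.ix[val] with an integer position
by_position = data.iloc[2]

# obj.ix[:'Utah', 'two']: a label slice, end point included
label_slice = data.loc[:'Utah', 'two']

# obj.ix[rows, [3, 0, 1]]: row labels mixed with column positions.
# loc accepts only labels, so the positions are converted to labels first.
mixed = data.loc[['Colorado', 'Utah'], data.columns[[3, 0, 1]]]

# obj.ix[mask, :3]: boolean row mask plus the first three columns
filtered = data.loc[data['three'] > 5, data.columns[:3]]

print(by_label.tolist(), by_position.name, label_slice.tolist())
print(mixed)
print(filtered)
```

Keeping labels in loc and positions in iloc makes each expression unambiguous, which was the main source of confusion with ix when an index contained integers.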