I. The set_index() function
1 The main thing is to understand the drop and append parameters; note how they differ from the parameters of reset_index().

import pandas as pd

df = pd.DataFrame({'a': range(4),
                   'b': range(4, 0, -1),
                   'c': ['one', 'one', 'two', 'two'],
                   'd': ['a', 'b', 'c', 'd']})
print(df)
#    a  b    c  d
# 0  0  4  one  a
# 1  1  3  one  b
# 2  2  2  two  c
# 3  3  1  two  d

# drop defaults to True in set_index(): once the ordinary column c has become
# the index, the original column c is removed from the columns.
# This differs from drop in reset_index(), which defaults to False and, when
# set to True, discards the old index instead of turning it into a column.
df.set_index(['c'], inplace=True)
print(df)
#      a  b  d
# c
# one  0  4  a
# one  1  3  b
# two  2  2  c
# two  3  1  d

# append=True keeps the existing index and adds the new one as an extra level.
# append=False (the default) replaces the existing index with the new one,
# so the old index is discarded, much like drop=True in reset_index().
df.set_index(['b'], inplace=True, append=True)
print(df)
#        a  d
# c   b
# one 4  0  a
#     3  1  b
# two 2  2  c
#     1  3  d
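The example above only shows the default drop=True. As a minimal sketch of the other case, the snippet below uses a small made-up frame (the names `kept` and `dropped` are illustrative only) to compare the columns that remain with drop=False and with the default.

```python
import pandas as pd

# Small illustrative frame; the data is made up for this sketch.
df = pd.DataFrame({'a': range(4), 'c': ['one', 'one', 'two', 'two']})

# drop=False: 'c' becomes the index and also stays as an ordinary column.
kept = df.set_index('c', drop=False)

# Default drop=True: 'c' becomes the index and is removed from the columns.
dropped = df.set_index('c')

print(list(kept.columns))     # ['a', 'c']
print(list(dropped.columns))  # ['a']
print(kept.index.name)        # c
```

Neither call uses inplace=True, so `df` itself keeps its original columns and default index.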
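II. The reset_index() function
1 When the index is reset, drop defaults to False, so the old index is kept as an ordinary column; set drop=True to discard it. reset_index() returns a new object, so either assign the result or pass inplace=True. If you do neither, the original data is left unchanged and drop appears to have no effect.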
all_user_repay.reset_index(drop=True,inplace=True)

df1 = pd.DataFrame({'A': ['A0', 'A1'],
                    'B': ['B0', 'B1'],
                    'C': ['C0', 'C1'],
                    'D': ['D0', 'D1']})
df2 = pd.DataFrame({'A': ['A4', 'A5'],
                    'B': ['B4', 'B5'],
                    'C': ['C4', 'C5'],
                    'D': ['D4', 'D5']})
frames = [df1, df2]
result = pd.concat(frames)

print(result.reset_index())
#    index   A   B   C   D
# 0      0  A0  B0  C0  D0
# 1      1  A1  B1  C1  D1
# 2      0  A4  B4  C4  D4
# 3      1  A5  B5  C5  D5

print(result.reset_index(drop=True))
#     A   B   C   D
# 0  A0  B0  C0  D0
# 1  A1  B1  C1  D1
# 2  A4  B4  C4  D4
# 3  A5  B5  C5  D5
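The point about assignment and inplace can be checked directly. The following is a small sketch with a made-up frame (the variable names are illustrative only): the first call discards its result, so the original index survives; the other two show the two ways to keep the change.

```python
import pandas as pd

# Made-up frame with a non-default index, for illustration only.
df = pd.DataFrame({'A': ['A0', 'A1']}, index=[10, 20])

# Result is discarded: df still has its old index.
df.reset_index(drop=True)
index_after_discard = list(df.index)
print(index_after_discard)   # [10, 20]

# Option 1: assign the returned object.
renumbered = df.reset_index(drop=True)
print(list(renumbered.index))  # [0, 1]

# Option 2: modify df itself with inplace=True.
df.reset_index(drop=True, inplace=True)
print(list(df.index))        # [0, 1]
```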
Series.reset_index()
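Note that with level left at its default, all levels of the original index are removed, i.e. every index level becomes an ordinary column.
If level is given, only the specified index levels become ordinary columns; the rest stay in the index.
Reference: http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reset_index.html?highlight=reset_index#pandas.Series.reset_index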

import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz']),
          np.array(['one', 'two', 'one', 'two'])]
s2 = pd.Series(range(4), name='foo',
               index=pd.MultiIndex.from_arrays(arrays, names=['a', 'b']))
print(s2)
# a    b
# bar  one    0
#      two    1
# baz  one    2
#      two    3
# Name: foo, dtype: int64

# Without drop=True the result is a DataFrame, so inplace cannot be used here;
# to keep the change, assign the result to another name.
print(s2.reset_index(level='a'))
#        a  foo
# b
# one  bar    0
# two  bar    1
# one  baz    2
# two  baz    3

print(s2.reset_index())
#      a    b  foo
# 0  bar  one    0
# 1  bar  two    1
# 2  baz  one    2
# 3  baz  two    3

# s2 itself is unchanged and is still a Series.
print(type(s2))
# <class 'pandas.core.series.Series'>
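To make the return types explicit, here is a short sketch along the same lines (the names `as_frame`, `partial` and `values_only` are made up for illustration). It assigns each result to a new name and checks what kind of object comes back.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [['bar', 'bar', 'baz', 'baz'], ['one', 'two', 'one', 'two']],
    names=['a', 'b'])
s = pd.Series(range(4), name='foo', index=index)

# All index levels become columns: the result is a DataFrame.
as_frame = s.reset_index()

# Only level 'a' becomes a column; 'b' stays as the index.
partial = s.reset_index(level='a')

# drop=True throws the index away, so the result is still a Series.
values_only = s.reset_index(drop=True)

print(type(as_frame).__name__)     # DataFrame
print(list(as_frame.columns))      # ['a', 'b', 'foo']
print(partial.index.name)          # b
print(list(partial.columns))       # ['a', 'foo']
print(type(values_only).__name__)  # Series
print(type(s).__name__)            # Series (the original is untouched)
```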
2 Setting a column as the index
df.set_index('column_name', inplace=True)
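As a rough illustration of how the two functions fit together, the sketch below (with made-up column names `key` and `value`) sets a column as the index, looks a row up by label, and then uses reset_index() to turn the index back into an ordinary column.

```python
import pandas as pd

# Made-up data; 'key' and 'value' are illustrative column names.
df = pd.DataFrame({'key': ['x', 'y', 'z'], 'value': [1, 2, 3]})

# Use 'key' as the index so rows can be looked up by label.
indexed = df.set_index('key')
print(indexed.loc['y', 'value'])  # 2

# reset_index() moves 'key' back into the columns and restores 0..n-1.
restored = indexed.reset_index()
print(list(restored.columns))     # ['key', 'value']
```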