1. Conditional query:
result = df.query("(a == 1 and b == 'x') or c / d < 3")
print(result)
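To check what the query string does, it can help to compare it with the equivalent boolean mask. The following is a minimal sketch; the sample values are made up for illustration, and only the column names a, b, c, d come from the query above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the query above.
df = pd.DataFrame({
    "a": [1, 1, 2, 3],
    "b": ["x", "y", "x", "z"],
    "c": [10.0, 2.0, 9.0, 8.0],
    "d": [2.0, 1.0, 1.0, 4.0],
})

# String-expression form: single quotes inside the double-quoted string.
by_query = df.query("(a == 1 and b == 'x') or c / d < 3")

# Equivalent boolean-mask form: use & and | with explicit parentheses.
mask = ((df["a"] == 1) & (df["b"] == "x")) | (df["c"] / df["d"] < 3)
by_mask = df[mask]

print(by_query)
```

Both forms should select the same rows; the query string is usually easier to read, while the mask form works when column names are not valid Python identifiers.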
2. Iteration
a) Iterate by index label
for idx in df.index:
    dd = df.loc[idx]
    print(dd)
b) Iterate by row position
for i in range(len(df)):
    dd = df.iloc[i]
    print(dd)
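The two loops above look up one row per step. pandas also provides iterrows() and itertuples(), which yield the rows directly. A small sketch, assuming a two-row frame whose values are borrowed from the sample data used later in these notes:

```python
import pandas as pd

# Small illustrative frame; values borrowed from the sample table in section 5.
df = pd.DataFrame(
    {"code": ["HK.02018", "HK.00151"], "last_price": [53.70, 6.17]},
    index=[42, 15],
)

# iterrows() yields (index label, row as Series) pairs.
labels = []
for idx, row in df.iterrows():
    labels.append(idx)
    print(idx, row["code"], row["last_price"])

# itertuples() yields namedtuples; columns become attributes.
prices = [t.last_price for t in df.itertuples()]
print(prices)
```

itertuples() is generally the faster of the two, since it does not build a Series for every row.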
3. Mean of a column
# Compute the mean of the "volume" column
result = df["volume"].mean()
print(result)
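One detail worth knowing: mean() skips NaN values by default. The sketch below uses made-up volume numbers to show the difference between the default and skipna=False.

```python
import pandas as pd

# Hypothetical volumes, with one missing value.
df = pd.DataFrame({"volume": [100.0, 200.0, None]})

# Default: NaN is ignored, so the mean is taken over 100 and 200.
mean_default = df["volume"].mean()

# skipna=False: any NaN makes the result NaN.
mean_strict = df["volume"].mean(skipna=False)

print(mean_default, mean_strict)
```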
4. Sort by a given column
result_df = df.sort_values(by="sales", ascending=False)
print(result_df)
Note: the sort above is not in place; it returns a new DataFrame and leaves df unchanged.
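The difference between the two modes can be seen in a short sketch. The sales figures here are invented for illustration.

```python
import pandas as pd

# Hypothetical sales figures.
df = pd.DataFrame({"sales": [3, 10, 7]})

# Default mode: a sorted copy is returned, df keeps its original order.
result_df = df.sort_values(by="sales", ascending=False)
original_order = df["sales"].tolist()
sorted_order = result_df["sales"].tolist()

# In-place mode: df itself is reordered and the call returns None.
df.sort_values(by="sales", ascending=False, inplace=True)
inplace_order = df["sales"].tolist()

print(original_order, sorted_order, inplace_order)
```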
5. Extract specific rows/columns
Suppose we have the following data:
        code          update_time  last_price  open_price  ...  option_gamma  option_vega  option_theta  option_rho
42  HK.02018  2019-04-26 16:08:05       53.70       52.70  ...           NaN          NaN           NaN         NaN
15  HK.00151  2019-04-26 16:08:33        6.17        6.21  ...           NaN          NaN           NaN         NaN
14  HK.00101  2019-04-26 16:08:05       18.22       18.26  ...           NaN          NaN           NaN         NaN
a) Extract by index label
Extract the row with index 42, all columns:
result = df.loc[42, :]
print(result)
result:
        code          update_time  last_price  open_price  ...  option_gamma  option_vega  option_theta  option_rho
42  HK.02018  2019-04-26 16:08:05       53.70       52.70  ...           NaN          NaN           NaN         NaN
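Extract the rows with index 15 and 42, keeping only the code and update_time columns:
# loc selects by label, so the columns are given by name, not by position
result = df.loc[[15, 42], ["code", "update_time"]]
print(result)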
result:
        code          update_time
15  HK.00151  2019-04-26 16:08:33
42  HK.02018  2019-04-26 16:08:05
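A point that is easy to miss with loc: a single label returns a Series (printed vertically, one field per line), while a list of labels returns a DataFrame. The sketch below rebuilds a three-column version of the sample data; the choice of columns is only for brevity.

```python
import pandas as pd

# Cut-down version of the sample data above (three columns only).
df = pd.DataFrame(
    {
        "code": ["HK.02018", "HK.00151", "HK.00101"],
        "update_time": ["2019-04-26 16:08:05", "2019-04-26 16:08:33", "2019-04-26 16:08:05"],
        "last_price": [53.70, 6.17, 18.22],
    },
    index=[42, 15, 14],
)

row = df.loc[42]        # scalar label -> Series
frame = df.loc[[42]]    # one-element list -> one-row DataFrame

# Rows come back in the order the labels are listed.
subset = df.loc[[15, 42], ["code", "update_time"]]
print(subset)
```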
b) Extract by row position
Extract the 2nd row, all columns:
result = df.iloc[1, :]
print(result)
result:
        code          update_time  last_price  open_price  ...  option_gamma  option_vega  option_theta  option_rho
15  HK.00151  2019-04-26 16:08:33        6.17        6.21  ...           NaN          NaN           NaN         NaN
Extract the first 2 rows, all columns:
result = df.iloc[0:2, :]
print(result)
result:
        code          update_time  last_price  open_price  ...  option_gamma  option_vega  option_theta  option_rho
42  HK.02018  2019-04-26 16:08:05       53.70       52.70  ...           NaN          NaN           NaN         NaN
15  HK.00151  2019-04-26 16:08:33        6.17        6.21  ...           NaN          NaN           NaN         NaN
Extract the 1st and 3rd rows, keeping only the code and update_time columns:
result = df.iloc[[0, 2], 0:2]
print(result)
result:
        code          update_time
42  HK.02018  2019-04-26 16:08:05
14  HK.00101  2019-04-26 16:08:05
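Because the index labels here (42, 15, 14) do not match the row positions (0, 1, 2), mixing up iloc and loc is a common mistake. A short sketch on a cut-down version of the sample data illustrates the difference:

```python
import pandas as pd

# Cut-down version of the sample data above (two columns only).
df = pd.DataFrame(
    {
        "code": ["HK.02018", "HK.00151", "HK.00101"],
        "update_time": ["2019-04-26 16:08:05", "2019-04-26 16:08:33", "2019-04-26 16:08:05"],
    },
    index=[42, 15, 14],
)

# iloc counts positions from 0, so position 1 is the row labelled 15.
second = df.iloc[1]

# iloc slices exclude the end position: 0:2 gives positions 0 and 1.
first_two = df.iloc[0:2]

# loc looks up labels; there is no label 1, so this raises KeyError.
try:
    df.loc[1]
    raised = False
except KeyError:
    raised = True

print(second.name, list(first_two.index), raised)
```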
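6. Copy/derive columns
Put the sum of col1 and col2 into a new column col:
df['col'] = df['col1'] + df['col2']
Divide col1 by col2, add 1, and store the result in a new column newcol:
df['newcol'] = df['col1'] / df['col2'] + 1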
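The assignments above modify df directly. If the original frame should stay untouched, assign() returns a new frame with the extra column instead. A minimal sketch with made-up numbers; the column name ratio is hypothetical:

```python
import pandas as pd

# Hypothetical values for col1 and col2.
df = pd.DataFrame({"col1": [1.0, 4.0, 9.0], "col2": [2.0, 2.0, 3.0]})

# Direct assignment adds the columns to df itself.
df["col"] = df["col1"] + df["col2"]
df["newcol"] = df["col1"] / df["col2"] + 1

# assign() leaves df alone and returns a copy with the new column.
df2 = df.assign(ratio=df["col1"] / df["col2"])

print(df2)
```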
7. Rename columns
new_df = df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'})
print(new_df)
# In-place mode
df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'}, inplace=True)
print(df)
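By default, rename() silently ignores keys that do not match any column, which can hide typos. Passing errors="raise" turns that into a KeyError. A small sketch; the key "missing" is a deliberately non-existent column name used for illustration:

```python
import pandas as pd

df = pd.DataFrame({"oldName1": [1], "oldName2": [2]})

# "missing" is not a column; by default it is ignored without warning.
renamed = df.rename(columns={"oldName1": "newName1", "missing": "x"})

# With errors="raise", an unknown key raises KeyError instead.
try:
    df.rename(columns={"missing": "x"}, errors="raise")
    raised = False
except KeyError:
    raised = True

print(list(renamed.columns), raised)
```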