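This post covers three ways to find and replace values in pandas: replacing them one at a time by hand, using replace, and using map.
You can also build a separate lookup DataFrame and match against it with merge or join (the same idea as VLOOKUP).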
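The merge-based lookup is not demonstrated in the rest of this post, so here is a minimal sketch of how it might look. The names `tasks`, `lookup`, and `status_cn` are illustrative assumptions, and the lookup table deliberately omits "cancel" to show what happens to an unmatched code.

```python
import pandas as pd

# Task table with English status codes (same shape as the example data in this post).
tasks = pd.DataFrame(
    [["a", "wait"], ["b", "doing"], ["c", "done"], ["d", "closed"], ["e", "cancel"]],
    columns=["task", "status"],
)

# Hypothetical lookup table: one row per status code, deliberately missing "cancel".
lookup = pd.DataFrame(
    {
        "status": ["wait", "doing", "done", "closed"],
        "status_cn": ["未开始", "进行中", "已完成", "已关闭"],
    }
)

# A left join keeps every task row; codes absent from the lookup get NaN in status_cn.
merged = tasks.merge(lookup, on="status", how="left")
print(merged)
```

A left join is used here so that no task rows are dropped; an inner join would silently discard the "cancel" row instead.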
First, create the data:
import pandas as pd

df = pd.DataFrame(
    [["a", "wait"], ["b", "doing"], ["c", "done"], ["d", "closed"], ["e", "cancel"]],
    columns=["task", "status"],
)
df
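This gives a five-row table with the columns task and status.
The goal is to change the English status values to Chinese.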
1. Replace values one at a time with df.loc; values that do not match are left unchanged
# Map each English status to its Chinese label, one value at a time
df.loc[df['status'] == 'wait', 'status'] = '未开始'
df.loc[df['status'] == 'doing', 'status'] = '进行中'
df.loc[df['status'] == 'done', 'status'] = '已完成'
df.loc[df['status'] == 'closed', 'status'] = '已关闭'
df.loc[df['status'] == 'cancel', 'status'] = '取消'
df.head()
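The repeated df.loc assignments above could also be driven by a dictionary. The following is only a sketch of that idea; `df_loop`, `status_cn`, and `original` are names introduced here for illustration.

```python
import pandas as pd

df_loop = pd.DataFrame(
    [["a", "wait"], ["b", "doing"], ["c", "done"], ["d", "closed"], ["e", "cancel"]],
    columns=["task", "status"],
)

# Illustrative mapping from English status to Chinese label.
status_cn = {
    "wait": "未开始",
    "doing": "进行中",
    "done": "已完成",
    "closed": "已关闭",
    "cancel": "取消",
}

# Build every mask from a snapshot of the original column, so a value that was
# just written can never be matched again by a later key.
original = df_loop["status"].copy()
for english, chinese in status_cn.items():
    df_loop.loc[original == english, "status"] = chinese

print(df_loop)
```

Rows whose status is not a key in the dictionary are never selected by any mask, so they keep their original value, just as in the hand-written version.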
2. Replace with replace; values that do not match are left unchanged
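# Method 2: replace, in two equivalent forms
# Assumes df still holds the original English statuses; re-create it if method 1 was run
# Use copies so that assigning to a temp frame does not modify df itself
df_temp01 = df.copy()
df_temp02 = df.copy()
# Form 1: parallel lists of old and new values
df_temp01['status'] = df['status'].replace(["wait", "doing"], ["未开始", "进行中"])
# Form 2: a dict of old value -> new value
df_temp02['status'] = df['status'].replace({"wait": "未开始", "doing": "进行中"})
df_temp02.sample(5)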
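replace can also be called on the whole DataFrame with a nested dictionary, where the outer key names the column and the inner dictionary maps old values to new ones. A small sketch, with `df_demo` and `replaced` as illustrative names:

```python
import pandas as pd

df_demo = pd.DataFrame(
    [["a", "wait"], ["b", "doing"], ["c", "done"], ["d", "closed"], ["e", "cancel"]],
    columns=["task", "status"],
)

# Nested dict: {column name: {old value: new value}}.
# Only the "status" column is searched; unlisted values are left as they are.
replaced = df_demo.replace({"status": {"wait": "未开始", "doing": "进行中"}})

print(replaced)
```

Because replace returns a new object by default, `df_demo` itself is unchanged after this call.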
3. Replace with map; any value that has no mapping becomes NaN
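# Method 3: map
# Drawback: the mapping must cover every value; anything it does not cover becomes NaN
# Use copies so each example starts from the statuses in df and leaves df untouched
df_temp03 = df.copy()
map_series = pd.Series(["未开始", "进行中", "已完成", "已关闭", '取消'],
                       index=["wait", "doing", "done", "closed", "cancel"])
df_temp03['status'] = df_temp03['status'].map(map_series)
df_temp03.head()

# Partial mapping: "done" and "cancel" are not covered, so they become NaN
df_temp04 = df.copy()
map_dict = {"wait": "未开始", "doing": "进行中", 'closed': '已关闭'}
df_temp04['status'] = df_temp04['status'].map(map_dict)
df_temp04.head()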
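If the NaN behaviour of a partial map is not wanted, one common workaround is to fill the gaps with the original values. This is a sketch of that pattern rather than part of the methods above; `df_demo`, `partial`, `mapped`, and `filled` are illustrative names.

```python
import pandas as pd

df_demo = pd.DataFrame(
    [["a", "wait"], ["b", "doing"], ["c", "done"], ["d", "closed"], ["e", "cancel"]],
    columns=["task", "status"],
)

# Partial mapping: "done" and "cancel" are intentionally missing.
partial = {"wait": "未开始", "doing": "进行中", "closed": "已关闭"}

# map alone leaves NaN wherever the dictionary has no entry.
mapped = df_demo["status"].map(partial)

# fillna with the original column restores the unmapped values, aligned by index.
filled = mapped.fillna(df_demo["status"])

print(pd.DataFrame({"mapped": mapped, "filled": filled}))
```

With the fallback in place, the result matches what replace would give for the same dictionary.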