Cleaning a class's grade data in Python (using pandas)


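Given the dataset below, apply appropriate data cleaning: drop an unneeded row, remove duplicates, check for null values, rename the columns, and add a total-score column.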

import pandas as pd
# Build the sample grade table; zhangfei's Math score is missing (None becomes NaN)
data = {'Chinese': [66, 95, 93, 90, 80, 80], 'English': [65, 85, 92, 88, 90, 90],
        'Math': [None, 98, 96, 77, 90, 90]}
df = pd.DataFrame(data, index=['zhangfei', 'guanyu', 'zhaoyun', 'huangzhong', 'dianwei', 'dianwei'],
                  columns=['English', 'Math', 'Chinese'])
print('Constructed data:\n', df)
# Data cleaning
# Drop an unneeded row
df = df.drop(index=['guanyu'])
print('Data after dropping the row:\n', df)
# Remove duplicate rows
df = df.drop_duplicates()
print('Data after removing duplicates:\n', df)
# Change the data type. astype() returns a new Series and does not modify df;
# the result is kept separately so that df['Math'] stays numeric for the total below
math_as_str = df['Math'].astype('str')
# Check which columns contain null values
print('Columns containing null values:\n', df.isnull().any())
# Rename the columns
df.rename(columns={'English': 'yingyu', 'Math': 'shuxue', 'Chinese': 'yuwen'}, inplace=True)
print('Data after renaming:\n', df)
# Add a total-score column
df['sum1'] = df['yingyu'] + df['shuxue'] + df['yuwen']
print('Data with a total-score column added:\n', df)

Results:

Constructed data:
             English  Math  Chinese
zhangfei         65   NaN       66
guanyu           85  98.0       95
zhaoyun          92  96.0       93
huangzhong       88  77.0       90
dianwei          90  90.0       80
dianwei          90  90.0       80
Data after dropping the row:
             English  Math  Chinese
zhangfei         65   NaN       66
zhaoyun          92  96.0       93
huangzhong       88  77.0       90
dianwei          90  90.0       80
dianwei          90  90.0       80
Data after removing duplicates:
             English  Math  Chinese
zhangfei         65   NaN       66
zhaoyun          92  96.0       93
huangzhong       88  77.0       90
dianwei          90  90.0       80
Columns containing null values:
 English    False
Math        True
Chinese    False
dtype: bool
Data after renaming:
             yingyu  shuxue  yuwen
zhangfei        65     NaN     66
zhaoyun         92    96.0     93
huangzhong      88    77.0     90
dianwei         90    90.0     80
Data with a total-score column added:
             yingyu  shuxue  yuwen   sum1
zhangfei        65     NaN     66    NaN
zhaoyun         92    96.0     93  281.0
huangzhong      88    77.0     90  255.0
dianwei         90    90.0     80  260.0
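
In the last table, zhangfei's total is NaN because the Math score is missing, and adding NaN to a number gives NaN. The original script stops there. The sketch below shows two possible ways to get a usable total, assuming it is acceptable either to skip the missing score or to fill it with the column mean (both are assumptions, not part of the original). The names `subjects`, `sum_skipna`, `filled` and `sum_filled` are illustrative only.

```python
import pandas as pd

# Standalone copy of the cleaned table from the results above.
df = pd.DataFrame(
    {'yingyu': [65, 92, 88, 90],
     'shuxue': [None, 96, 77, 90],   # None is stored as NaN
     'yuwen': [66, 93, 90, 80]},
    index=['zhangfei', 'zhaoyun', 'huangzhong', 'dianwei'],
)

subjects = ['yingyu', 'shuxue', 'yuwen']

# Option 1 (assumption: a missing score counts as absent):
# DataFrame.sum skips NaN by default (skipna=True).
sum_skipna = df[subjects].sum(axis=1)

# Option 2 (assumption: the column mean is a fair stand-in):
# fill the missing Math score with the mean of the known Math scores.
filled = df[subjects].fillna({'shuxue': df['shuxue'].mean()})
sum_filled = filled.sum(axis=1)

print('Total, skipping missing values:\n', sum_skipna)
print('Total, after filling with the column mean:\n', sum_filled)
```

Which option fits depends on what the total is used for: skipping the value understates zhangfei's total relative to classmates, while filling with the mean invents a score that was never recorded.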

 

