Data Visualization Basics (17): 120 Pandas Exercises (Part 2): Questions 1-20


1-20

import pandas as pd
import numpy as np

1. Create a DataFrame from the dictionary below

data = {"grammer":["Python","C","Java","GO",np.nan,"SQL","PHP","Python"],
       "score":[1,2,np.nan,4,5,6,7,10]}
df = pd.DataFrame(data)
df

2. Extract the rows that contain the string "Python"

# Method 1: exact match
df[df['grammer'] == 'Python']
# Method 2: substring match; na=False treats missing values as non-matches
results = df['grammer'].str.contains("Python", na=False)
df[results]
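The two methods are not equivalent in general: the first is an exact comparison, the second a substring match. A minimal sketch of the difference, using a small made-up Series (the value "Python3" is an illustrative assumption, not part of the exercise data):

```python
import numpy as np
import pandas as pd

# Illustrative data only; "Python3" is added to show the difference.
s = pd.Series(["Python", "C", np.nan, "Python3"])

# Exact match: True only where the value equals "Python"; NaN compares as False.
exact = s == "Python"

# Substring match: na=False makes missing values count as non-matches.
partial = s.str.contains("Python", na=False)

print(exact.tolist())    # [True, False, False, False]
print(partial.tolist())  # [True, False, False, True]
```

On the exercise data both methods select the same two rows, because no value merely contains "Python" as part of a longer string.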

 

3. Print all column names of df

print(df.columns)

4. Rename the second column to 'popularity'

df.rename(columns={'score':'popularity'}, inplace = True)
df

 

 

5. Count how many times each programming language appears in the grammer column

df['grammer'].value_counts()
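By default value_counts() leaves missing values out of the tally. A short sketch on the same grammer values, showing the dropna parameter:

```python
import numpy as np
import pandas as pd

grammer = pd.Series(["Python", "C", "Java", "GO", np.nan, "SQL", "PHP", "Python"])

# Default: NaN is excluded, so the counts add up to 7, not 8.
counts = grammer.value_counts()

# dropna=False adds one entry for the missing value.
counts_with_nan = grammer.value_counts(dropna=False)

print(counts["Python"])       # 2
print(counts.sum())           # 7
print(counts_with_nan.sum())  # 8
```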

6. Fill missing values with the average of the values above and below

df['popularity'] = df['popularity'].fillna(df['popularity'].interpolate())
df
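The solution works because linear interpolation of a single isolated gap yields exactly the mean of its two neighbours. A small check of that claim on the same score values (the names `filled` and `manual` are illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 4, 5, 6, 7, 10], dtype="float64")

# Default method is linear interpolation between the surrounding values.
filled = s.interpolate()

# Manual version: mean of the previous and next value at each position.
manual = (s.shift(1) + s.shift(-1)) / 2

print(filled[2])  # 3.0
print(manual[2])  # 3.0
```

For runs of several consecutive missing values, interpolate() spreads the values evenly across the gap, which is no longer the plain mean of the two neighbours.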

 

 

7. Extract the rows where popularity is greater than 3

df[df['popularity'] > 3]

8. Remove duplicate rows based on the grammer column

df.drop_duplicates(['grammer'])
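drop_duplicates keeps the first occurrence of each value unless told otherwise. A sketch of the keep parameter on a reduced, made-up frame:

```python
import pandas as pd

# Reduced illustrative data: "Python" appears twice.
df = pd.DataFrame({"grammer": ["Python", "C", "Python"],
                   "popularity": [1.0, 2.0, 10.0]})

# Default keep="first": the later duplicate (index 2) is dropped.
first = df.drop_duplicates(subset="grammer")

# keep="last": the earlier duplicate (index 0) is dropped instead.
last = df.drop_duplicates(subset="grammer", keep="last")

print(first.index.tolist())  # [0, 1]
print(last.index.tolist())   # [1, 2]
```

Note that the call returns a new DataFrame; the original df is unchanged unless the result is assigned back.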

 

 

9. Compute the mean of the popularity column

df['popularity'].mean()

 

 

10. Convert the grammer column to a list

df['grammer'].to_list()

 

 

11. Save the DataFrame as an Excel file

df.to_excel('test.xlsx')

12. Check the number of rows and columns

df.shape

 

 

13. Extract the rows where popularity is greater than 3 and less than 7

df[(df['popularity'] > 3) & (df['popularity'] < 7)]
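The same range filter can also be written with Series.between. A sketch under the assumption of pandas 1.3 or later, where the inclusive argument accepts the string "neither"; the values are illustrative:

```python
import pandas as pd

popularity = pd.Series([1.0, 3.0, 4.0, 6.6, 7.0, 10.0])

# Two comparisons combined with &, as in the solution above.
mask_ops = (popularity > 3) & (popularity < 7)

# Equivalent single call; "neither" excludes both endpoints.
mask_between = popularity.between(3, 7, inclusive="neither")

print(mask_ops.tolist())      # [False, False, True, True, False, False]
print(mask_between.tolist())  # [False, False, True, True, False, False]
```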

14. Swap the positions of the two columns

'''
Method 1: remove the column, then re-insert it at position 0
'''
temp = df['popularity']
df.drop(labels=['popularity'], axis=1, inplace=True)
df.insert(0, 'popularity', temp)
df
'''
Method 2: reorder by column position
cols = df.columns[[1,0]]
df = df[cols]
df
'''

15. Extract the row(s) holding the maximum popularity value

df[df['popularity'] == df['popularity'].max()]
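The comparison against max() returns every row tied for the maximum, whereas idxmax() points to the first one only. A sketch with made-up data containing a tie:

```python
import pandas as pd

# Illustrative data with two rows sharing the maximum.
df = pd.DataFrame({"grammer": ["Python", "C", "Java"],
                   "popularity": [10.0, 2.0, 10.0]})

# Boolean mask: all rows equal to the maximum are kept.
all_max = df[df["popularity"] == df["popularity"].max()]

# idxmax returns the label of the first maximum; wrap it in a list
# so that .loc returns a one-row DataFrame instead of a Series.
first_max = df.loc[[df["popularity"].idxmax()]]

print(all_max.index.tolist())    # [0, 2]
print(first_max.index.tolist())  # [0]
```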

 

 

16. View the last 5 rows

df.tail()

 

 

17. Delete the last row

df.drop([len(df)-1],inplace=True)
df
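Dropping the label len(df)-1 relies on the default 0..n-1 index. If the frame has been filtered, sorted, or otherwise re-indexed, that label may not be the last row or may not exist at all. A sketch of two index-independent alternatives, using a made-up index:

```python
import pandas as pd

# Illustrative frame whose index is not 0..n-1.
df = pd.DataFrame({"grammer": ["Python", "C", "Java"],
                   "popularity": [1.0, 2.0, 3.0]},
                  index=[10, 20, 30])

# Positional slice: everything except the last row.
by_position = df.iloc[:-1]

# Label-based: look up whatever label the last row actually has.
by_label = df.drop(df.index[-1])

print(by_position.index.tolist())  # [10, 20]
print(by_label.index.tolist())     # [10, 20]
```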

 

 

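18. Add the row ['Perl', 6.6]

row = {'grammer': 'Perl', 'popularity': 6.6}
# DataFrame.append was removed in pandas 2.0; use pd.concat instead
df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)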
df
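DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on recent versions a row is added with pd.concat or by assigning to a new label. A minimal sketch of both on a reduced, made-up frame:

```python
import pandas as pd

df = pd.DataFrame({"grammer": ["Python", "C"],
                   "popularity": [1.0, 2.0]})
row = {"grammer": "Perl", "popularity": 6.6}

# Option 1: wrap the row in a one-row DataFrame and concatenate.
appended = pd.concat([df, pd.DataFrame([row])], ignore_index=True)

# Option 2: assign to the next integer label (assumes a default 0..n-1 index).
in_place = df.copy()
in_place.loc[len(in_place)] = ["Perl", 6.6]

print(appended.shape)                # (3, 2)
print(in_place.iloc[-1].tolist())    # ['Perl', 6.6]
```

Both are fine for a single row; for many rows it is usually cheaper to collect them in a list and build or concatenate once.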

19. Sort the data by the values in the "popularity" column

df.sort_values("popularity",inplace=True)
df

 

 

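20. Compute the length of each string in the grammer column

# Replace the missing value with a placeholder string so that len() can be applied
df['grammer'] = df['grammer'].fillna('R')
df['len_str'] = df['grammer'].map(lambda x: len(x))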
df
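If the missing value should stay missing rather than be replaced by a placeholder, the vectorized .str.len() accessor is an alternative: it skips NaN instead of raising. A sketch on illustrative values:

```python
import numpy as np
import pandas as pd

grammer = pd.Series(["Python", "C", np.nan, "SQL"])

# .str.len() returns the length per element and propagates missing values.
lengths = grammer.str.len()

print(lengths[0])               # length of "Python" is 6
print(lengths.isna().tolist())  # [False, False, True, False]
```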

 

