Python pandas DataFrame operations


1. Create a DataFrame from a dictionary

>>> import pandas as pd
>>> dict1 = {'col1':[1,2,5,7],'col2':['a','b','c','d']}
>>> df = pd.DataFrame(dict1)
>>> df
   col1 col2
0     1    a
1     2    b
2     5    c
3     7    d

 

2. Create a DataFrame from lists (first combine the lists into a dictionary, then convert the dictionary to a DataFrame)

>>> lista = [1,2,5,7]
>>> listb = ['a','b','c','d']
>>> df = pd.DataFrame({'col1':lista,'col2':listb})
>>> df
   col1 col2
0     1    a
1     2    b
2     5    c
3     7    d

 

3. Create a DataFrame from lists, specifying data and columns

>>> a = ['001','zhangsan','M']
>>> b = ['002','lisi','F']
>>> c = ['003','wangwu','M']
>>> df = pd.DataFrame(data=[a,b,c],columns=['id','name','sex'])
>>> df
    id      name sex
0  001  zhangsan   M
1  002      lisi   F
2  003    wangwu   M

 

4. Rename the columns from ['id','name','sex'] to ['Id','Name','Sex']

>>> df.columns = ['Id','Name','Sex']
>>> df
    Id      Name Sex
0  001  zhangsan   M
1  002      lisi   F
2  003    wangwu   M
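
Assigning to df.columns replaces every column name at once. When only some columns need new names, DataFrame.rename with a mapping is an alternative. A minimal sketch, reusing the sample data from item 3; the variable name renamed is only illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    data=[['001', 'zhangsan', 'M'], ['002', 'lisi', 'F'], ['003', 'wangwu', 'M']],
    columns=['id', 'name', 'sex'],
)

# rename() returns a new DataFrame; columns missing from the mapping keep their names
renamed = df.rename(columns={'id': 'Id', 'name': 'Name'})
print(list(renamed.columns))  # ['Id', 'Name', 'sex']
```

The original df is left unchanged, so the result has to be assigned to a variable to be kept.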

 

5. Reorder DataFrame columns and make the numbering start from 1

http://www.cnblogs.com/huahuayu/p/8324755.html 
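
The linked post is not reproduced here, so the following is only a sketch of one common way to do both steps, not necessarily the approach it describes. It assumes that "numbering" refers to the row index, and the chosen column order is arbitrary:

```python
import pandas as pd

df = pd.DataFrame(
    data=[['001', 'zhangsan', 'M'], ['002', 'lisi', 'F'], ['003', 'wangwu', 'M']],
    columns=['Id', 'Name', 'Sex'],
)

# Reorder columns by selecting them in the desired order (order chosen for illustration)
reordered = df[['Name', 'Id', 'Sex']].copy()

# Assumption: "numbering" means the row index; replace 0..n-1 with 1..n
reordered.index = range(1, len(reordered) + 1)
print(list(reordered.columns))  # ['Name', 'Id', 'Sex']
print(list(reordered.index))    # [1, 2, 3]
```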

 

6. Generate a DataFrame of random int data with 10 rows and 4 columns

>>> import pandas
>>> import numpy
>>> df = pandas.DataFrame(numpy.random.randint(0,100,size=(10, 4)), columns=list('ABCD')) # 0,100 sets the range of the random integers (0 inclusive, 100 exclusive); size=(10, 4) gives 10 rows and 4 columns; columns sets the column names
>>> df
    A   B   C   D
0  67  28  37  66
1  21  27  43  37
2  73  54  98  85
3  40  78   4  93
4  99  60  63  16
5  48  46  24  61
6  59  52  62  28
7  20  74  36  64
8  14  13  46  60
9  18  44  70  36
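
The values above differ on every run. If repeatable data is wanted, for example in tests, a seeded NumPy generator can be used instead. A minimal sketch; the seed value 0 is an arbitrary assumption:

```python
import numpy as np
import pandas as pd

# Assumption: seed 0 is arbitrary; any fixed seed makes the data repeatable
rng = np.random.default_rng(0)

# integers(low, high) draws from [low, high), the same range convention as randint(0, 100)
df = pd.DataFrame(rng.integers(0, 100, size=(10, 4)), columns=list('ABCD'))
print(df.shape)  # (10, 4)
```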

 

7. Use a time series as the index

>>> df # the index was originally auto-generated as 0 to 9
    A   B   C   D
0  31  25  45  67
1  62  12  61  88
2  79  36  20  97
3  26  57  50  44
4  24  12  50   1
5   4  61  99  62
6  40  47  52  27
7  83  66  71   4
8  58  59  25  62
9  38  81  60   8
>>> import pandas
>>> dates = pandas.date_range('20180121',periods=10)
>>> dates # 10 days, starting from 20180121
DatetimeIndex(['2018-01-21', '2018-01-22', '2018-01-23', '2018-01-24',
               '2018-01-25', '2018-01-26', '2018-01-27', '2018-01-28',
               '2018-01-29', '2018-01-30'],
              dtype='datetime64[ns]', freq='D')
>>> df.index = dates # assign dates to the index
>>> df
             A   B   C   D
2018-01-21  31  25  45  67
2018-01-22  62  12  61  88
2018-01-23  79  36  20  97
2018-01-24  26  57  50  44
2018-01-25  24  12  50   1
2018-01-26   4  61  99  62
2018-01-27  40  47  52  27
2018-01-28  83  66  71   4
2018-01-29  58  59  25  62
2018-01-30  38  81  60   8
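
The date index can also be supplied when the DataFrame is created, and rows can then be selected by date strings. A minimal sketch under the same setup as above; the name subset and the chosen date range are illustrative:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20180121', periods=10)

# Pass the DatetimeIndex at construction instead of assigning df.index afterwards
df = pd.DataFrame(
    np.random.randint(0, 100, size=(10, 4)),
    index=dates,
    columns=list('ABCD'),
)

# Label-based slicing on a DatetimeIndex accepts date strings; both ends are included
subset = df.loc['2018-01-22':'2018-01-24']
print(subset.shape)  # (3, 4)
```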

 

8. SQL-like operations on a DataFrame

pandas official documentation: Comparison with SQL

https://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html

[Python in Practice] Pandas: do data analysis as if you were writing SQL (Part 1)

https://www.cnblogs.com/en-heng/category/778194.html 
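
To give a rough idea of what these references cover, here is a small sketch that maps a few common SQL statements to pandas calls, using the sample data from item 3. The SQL in the comments is only an informal analogy, and the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    data=[['001', 'zhangsan', 'M'], ['002', 'lisi', 'F'], ['003', 'wangwu', 'M']],
    columns=['id', 'name', 'sex'],
)

# SELECT id, name FROM df WHERE sex = 'M'
males = df.loc[df['sex'] == 'M', ['id', 'name']]

# SELECT sex, COUNT(*) FROM df GROUP BY sex
counts = df.groupby('sex').size()

# SELECT * FROM df ORDER BY name LIMIT 2
first_two = df.sort_values('name').head(2)

print(males)
print(counts)
print(first_two)
```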

