import pandas as pd
import numpy as np
help(pd.DataFrame.iterrows)
Help on function iterrows in module pandas.core.frame:

iterrows(self)
    Iterate over DataFrame rows as (index, Series) pairs.

    Notes
    -----

    1. Because ``iterrows`` returns a Series for each row,
       it does **not** preserve dtypes across the rows (dtypes are
       preserved across columns for DataFrames). For example,

       >>> df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])
       >>> row = next(df.iterrows())[1]
       >>> row
       int      1.0
       float    1.5
       Name: 0, dtype: float64
       >>> print(row['int'].dtype)
       float64
       >>> print(df['int'].dtype)
       int64

       To preserve dtypes while iterating over the rows, it is better
       to use :meth:`itertuples` which returns namedtuples of the values
       and which is generally faster than ``iterrows``.

    2. You should **never modify** something you are iterating over.
       This is not guaranteed to work in all cases. Depending on the
       data types, the iterator returns a copy and not a view, and writing
       to it will have no effect.

    Returns
    -------
    it : generator
        A generator that iterates over the rows of the frame.

    See also
    --------
    itertuples : Iterate over DataFrame rows as namedtuples of the values.
    iteritems : Iterate over (column name, Series) pairs.
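The dtype note in the help text can be checked directly. The following is a minimal sketch that reuses the one-row frame from the docstring and compares what `iterrows` and `itertuples` return for the same row. The variable names (`iterrows_value`, `itertuples_value`) are made up for illustration.

```python
import pandas as pd

# One-row frame taken from the docstring: an integer column and a float column.
df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])

# iterrows packs each row into a single Series, so the integer is upcast
# to the common dtype of the row (float64 here).
_, row = next(df.iterrows())
iterrows_value = row['int']

# itertuples returns a namedtuple, so each field keeps its column's own type.
# index=False leaves the index out of the tuple; position 0 is the 'int' column.
tup = next(df.itertuples(index=False))
itertuples_value = tup[0]

print(row.dtype)                 # dtype of the row Series built by iterrows
print(type(itertuples_value))    # type of the value seen through itertuples
```

If the loop only needs to read values, `itertuples` is usually the safer choice for mixed-dtype frames, as the docstring itself suggests.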
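Using iterrows()
iterrows() yields (index, row) pairs, where index is the row's index label and row is a Series holding that row's data (a Series, not an iterator). With this method you can fill in a column that has custom requirements one row at a time, provided the column has been initialized first.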
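To make the shape of those pairs concrete, here is a small sketch on a made-up frame (`frame`, with hypothetical columns `a` and `b`). It also illustrates the warning from the help text: for a frame with mixed dtypes, the row Series is a copy, so assigning to it does not change the frame.

```python
import pandas as pd

# Hypothetical frame with mixed dtypes (int and float columns).
frame = pd.DataFrame({'a': [1, 2], 'b': [0.5, 1.5]})

labels = []
for index, row in frame.iterrows():
    labels.append(index)   # index is the row's label
    row['a'] = 100         # changes only the temporary Series, not the frame

print(labels)
print(frame['a'].tolist())  # the frame still holds its original values
```

This is why a loop that changes `row` has to write the result back explicitly, for example with `frame.loc[index] = row`.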
xx=np.random.randint(9,size=(6,3))
tests=pd.DataFrame(xx,columns=['one','two','three']);tests
| | one | two | three |
|---|---|---|---|
| 0 | 3 | 0 | 4 |
| 1 | 1 | 0 | 3 |
| 2 | 1 | 4 | 4 |
| 3 | 7 | 3 | 2 |
| 4 | 7 | 5 | 0 |
| 5 | 5 | 8 | 8 |
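Now we want to add a column defined as follows: if the sum 'one' + 'two' + 'three' for a row is odd, the row is labeled as odd; if the sum is even, the row is labeled as even.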
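tests['special'] = 'ini'  # initialize the new column before the loop
for index, row in tests.iterrows():
    # Sum every value except the last one (the 'special' placeholder)
    num = row.values[:-1].sum()
    if num % 2:
        row['special'] = 'odd'
    else:
        row['special'] = 'even'
    # row is a copy, so write the modified Series back to row `index` of tests
    tests.loc[index] = row
tests
| | one | two | three | special |
|---|---|---|---|---|
| 0 | 3 | 0 | 4 | odd |
| 1 | 1 | 0 | 3 | even |
| 2 | 1 | 4 | 4 | odd |
| 3 | 7 | 3 | 2 | even |
| 4 | 7 | 5 | 0 | even |
| 5 | 5 | 8 | 8 | odd |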
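For comparison, the same column can probably be built without a row loop. The sketch below is an alternative to the iterrows approach, not the method used in this post: it hard-codes the six rows from the table so the result is reproducible, sums across columns, and labels the parity with `numpy.where`. The name `row_sums` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Same values as the example table, hard-coded instead of randomly generated.
tests = pd.DataFrame(
    [[3, 0, 4], [1, 0, 3], [1, 4, 4], [7, 3, 2], [7, 5, 0], [5, 8, 8]],
    columns=['one', 'two', 'three'],
)

# Sum the three columns row by row (axis=1), then label each sum by parity.
row_sums = tests[['one', 'two', 'three']].sum(axis=1)
tests['special'] = np.where(row_sums % 2 == 1, 'odd', 'even')

print(tests)
```

Because nothing is written back row by row, the numeric columns keep their integer dtype, and this form tends to be faster on large frames.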