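1 The object dtype is a special data type in a pandas DataFrame. When a column holds two or more kinds of values (numbers, strings, special characters, datetime values), pandas stores it as object. Even after the different kinds of values have been separated or cleaned out, the column still keeps the object dtype until it is converted explicitly.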
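As in the code below, replace() only swaps out the offending values; the column usually has to be converted once more with astype() (or pd.to_datetime() for dates) before it stops being object. In some cases, depending on the data and the pandas version, the column has already been converted after replace() and the extra step is not needed.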
import pandas as pd

# `train` is assumed to be a DataFrame that has already been loaded.
# info() prints the summary itself and returns None, hence the trailing "None" in the output.
print(train.info())

# "\\N" (a literal backslash followed by N) marks missing values in the raw data.
train['repay_date'] = train['repay_date'].replace("\\N", '2020-01-01')
train['repay_date'] = pd.to_datetime(train['repay_date'])   # object -> datetime64[ns]
train['repay_amt'] = train['repay_amt'].replace("\\N", 0)
train['repay_amt'] = train['repay_amt'].astype(float)       # object -> float64

print(train.info())

# Output before the conversion:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 1000000 entries, 0 to 999999
# Data columns (total 7 columns):
# user_id       1000000 non-null int64
# listing_id    1000000 non-null int64
# due_date      1000000 non-null datetime64[ns]
# due_amt       1000000 non-null float64
# repay_date    1000000 non-null object
# repay_amt     1000000 non-null object
# order_id      1000000 non-null int64
# dtypes: datetime64[ns](1), float64(1), int64(3), object(2)
# memory usage: 53.4+ MB
# None

# Output after the conversion:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 1000000 entries, 0 to 999999
# Data columns (total 7 columns):
# user_id       1000000 non-null int64
# listing_id    1000000 non-null int64
# due_date      1000000 non-null datetime64[ns]
# due_amt       1000000 non-null float64
# repay_date    1000000 non-null datetime64[ns]
# repay_amt     1000000 non-null float64
# order_id      1000000 non-null int64
# dtypes: datetime64[ns](2), float64(2), int64(3)
# memory usage: 53.4 MB
# None
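
The same two-step pattern can be tried on a small scale without the original dataset. The sketch below is only an illustration: the three-row DataFrame is made-up toy data, and the column is created with dtype=object on purpose so that it starts out the way repay_date and repay_amt do above. It records the dtypes after replace() and again after the explicit conversion, so the two can be compared.

```python
import pandas as pd

# Hypothetical toy data: "\\N" (backslash + N) marks a missing value,
# mimicking the repay_date / repay_amt columns above.
df = pd.DataFrame(
    {
        "repay_date": ["2019-03-01", "\\N", "2019-04-15"],
        "repay_amt": ["100.5", "\\N", "230.0"],
    },
    dtype=object,  # assumption: force object so the columns start out as in the post
)

# Step 1: replace the sentinel. The values change; the dtype is recorded for comparison.
df["repay_date"] = df["repay_date"].replace("\\N", "2020-01-01")
df["repay_amt"] = df["repay_amt"].replace("\\N", 0)
dtypes_after_replace = df.dtypes.copy()

# Step 2: convert explicitly to a datetime and a float column.
df["repay_date"] = pd.to_datetime(df["repay_date"])
df["repay_amt"] = df["repay_amt"].astype(float)
dtypes_after_convert = df.dtypes.copy()

print(dtypes_after_replace)
print(dtypes_after_convert)
```

After step 1 the repay_amt column mixes strings with the integer 0, so it is still object; only the explicit conversion in step 2 yields a numeric column.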