[Pandas] Reading Data & DataFrame Slicing

This article is a repost. 2017-09-05 10:20 | Pandas

Reading files

numpy.loadtxt()

import numpy as np

dataset_filename = "affinity_dataset.txt"

X = np.loadtxt(dataset_filename)

n_samples, n_features = X.shape
print("This dataset has {0} samples and {1} features".format(n_samples, n_features))

This dataset has 100 samples and 5 features

pandas.read_csv()

import pandas as pd

dataset_filename = "affinity_dataset.txt"

Xp = pd.read_csv(dataset_filename, delimiter=' ', names=list('abcde'))

print(Xp.shape)

(100, 5)

Check the output:

print(X[:5])
print(Xp[:5])
print(type(Xp['a'][0]))

[[ 0.  0.  1.  1.  1.]
 [ 1.  1.  0.  1.  0.]
 [ 1.  0.  1.  1.  0.]
 [ 0.  0.  1.  1.  1.]
 [ 0.  1.  0.  0.  1.]]
   a  b  c  d  e
0  0  0  1  1  1
1  1  1  0  1  0
2  1  0  1  1  0
3  0  0  1  1  1
4  0  1  0  0  1
<class 'numpy.int64'>
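
The two outputs above differ in dtype: np.loadtxt produces floats by default, while pd.read_csv infers integers. The following is a minimal sketch of that difference. It assumes an in-memory stand-in for affinity_dataset.txt; sample_text and the variable names are illustrative and not part of the original article.

```python
import io

import numpy as np
import pandas as pd

# In-memory stand-in for the first rows of affinity_dataset.txt (illustrative only).
sample_text = "0 0 1 1 1\n1 1 0 1 0\n1 0 1 1 0\n"

# np.loadtxt parses every value as float64 unless a dtype is given.
X_float = np.loadtxt(io.StringIO(sample_text))
X_int = np.loadtxt(io.StringIO(sample_text), dtype=int)

# pd.read_csv infers an integer dtype for columns holding only whole numbers.
Xp_small = pd.read_csv(io.StringIO(sample_text), sep=" ", names=list("abcde"))

print(X_float.dtype)         # float64
print(Xp_small.dtypes["a"])  # int64

# to_numpy() turns the DataFrame back into a plain array for comparison.
print(np.array_equal(X_float, Xp_small.to_numpy()))  # True
```

If integer values are wanted from NumPy as well, passing dtype=int to np.loadtxt is one option.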

DF.loc indexing

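When every column already has a name, df['a'] selects that whole column. If you know both the column names and the index labels, and both are easy to type, you can use .loc. Note that a label slice such as 0:3 includes its end label.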

print(Xp.loc[0, 'a'], '\n',
      Xp.loc[0:3, ['a', 'b']], '\n',
      Xp.loc[[1, 5], ['b', 'c']])
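
To make the behavior of these three .loc calls concrete, here is a small sketch on a hypothetical 6-row frame (df and its values are made up for illustration; only the column names follow the article).

```python
import pandas as pd

# Hypothetical 6-row frame reusing the article's column names.
df = pd.DataFrame(
    {"a": [0, 1, 1, 0, 0, 1], "b": [0, 1, 0, 0, 1, 1], "c": [1, 0, 1, 1, 0, 0]}
)

single = df.loc[0, "a"]              # scalar: row label 0, column 'a'
block = df.loc[0:3, ["a", "b"]]      # label slice: rows 0, 1, 2 and 3
picked = df.loc[[1, 5], ["b", "c"]]  # explicit list of row labels

print(single)
print(block.shape)  # (4, 2): the end label 3 is included
print(picked)
```

The (4, 2) shape is the point to notice: unlike ordinary Python slicing, .loc keeps the end label.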

DF.iloc indexing

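If the column names are too long to type comfortably, or the index is a time series that is even harder to type, you can use .iloc instead. The i stands for integer: .iloc selects purely by integer position, which makes it easy to remember. Position slices such as 0:3 exclude the end position, as in ordinary Python slicing.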

print(Xp.iloc[1, 1], '\n',
      Xp.iloc[0:3, [0, 1]], '\n',
      Xp.iloc[[0, 3, 5], 0:2])
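
The time-series case mentioned above can be sketched as follows. The date-indexed frame is hypothetical (its dates and values are invented for illustration); the three .iloc calls mirror the ones in the article.

```python
import pandas as pd

# Hypothetical frame indexed by dates, where typing the labels would be tedious.
dates = pd.date_range("2017-09-01", periods=6, freq="D")
df = pd.DataFrame(
    {"a": list(range(6)), "b": list(range(10, 16)), "c": list(range(20, 26))},
    index=dates,
)

cell = df.iloc[1, 1]            # second row, second column
head = df.iloc[0:3, [0, 1]]     # positions 0, 1, 2 (end position 3 is excluded)
rows = df.iloc[[0, 3, 5], 0:2]  # rows by position, first two columns

print(cell)        # 11
print(head.shape)  # (3, 2)
print(rows)
```

None of the calls needs to mention a date, which is the convenience the paragraph above describes.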

DF.ix indexing

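.ix is more flexible still: it lets you mix positions and labels in one selection, so it covers all of the usages above. In most cases, replacing the earlier calls with df.ix works, with one caveat. Note that .ix was deprecated in pandas 0.20.0 and removed in pandas 1.0, so on current versions .loc and .iloc are the ones to use.

In df.ix[[..1..], [..2..]], the entries inside box 1 must be of one kind, either all positions or all labels, and the same holds for box 2. By the way, box 1 selects rows and box 2 selects columns.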
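
On pandas versions without .ix, a mixed selection (rows by position, columns by name) can still be written with .iloc or .loc by translating one side first. The sketch below is one possible way to do it, not the article's method; the frame and its labels are hypothetical.

```python
import pandas as pd

# Hypothetical frame with string row labels so positions and labels differ.
df = pd.DataFrame(
    {"a": [0, 1, 1, 0], "b": [0, 1, 0, 0], "c": [1, 0, 1, 1]},
    index=["w", "x", "y", "z"],
)

# Rows by position, columns by name: translate the names to positions first.
col_positions = df.columns.get_indexer(["a", "c"])
by_position = df.iloc[[0, 2], col_positions]

# The same selection the other way round: translate row positions to labels.
by_label = df.loc[df.index[[0, 2]], ["a", "c"]]

print(by_position)
print(by_position.equals(by_label))  # True
```

Either form keeps each indexer purely positional or purely label-based, which avoids the ambiguity that .ix had to guess at.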
