Introduction to the pandas Library


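1. Introduction to the pandas library
      Among Python's scientific computing libraries, pandas is the one best suited to data science work. Together with Scikit-learn, it covers nearly all the tools a data scientist needs. pandas is an open-source, easy-to-use library that provides data structures and data analysis tools for the Python programming language. It can import, clean, process, summarize, and export data. pandas is built on top of NumPy, and it is fair to say that it was created for data analysis.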

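      According to most practitioners who build machine learning applications, if you ask which stage of a machine learning project takes the most time, the resigned answer is usually "data preprocessing." In fact, most industry teams do not invest much effort in researching brand-new machine learning models; instead they apply existing, well-established models to specific projects and specific data. As a result, most of the time goes into processing data, or even just cleaning it, especially when the data is still fairly raw. pandas was created to meet this need: it is a Python toolkit for data processing and analysis that implements a large number of functions for reading, writing, cleaning, filling, and analyzing data. This saves developers a great deal of data-preprocessing code and leaves them more energy to focus on the machine learning task itself.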
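
To make the import, clean, summarize, and export steps described above concrete, here is a minimal sketch. The sample data, column names, and the choice to fill missing ages with the column mean are all assumptions made for illustration; they do not come from the original article.

```python
import io

import pandas as pd

# Hypothetical raw data: one missing age and one exact duplicate row.
raw_csv = io.StringIO(
    "name,age,city\n"
    "Alice,30,Taipei\n"
    "Bob,,Tainan\n"
    "Bob,,Tainan\n"
    "Carol,26,Taipei\n"
    "Dave,40,Tainan\n"
)

# Import: parse the CSV text into a DataFrame.
df = pd.read_csv(raw_csv)

# Clean: drop exact duplicate rows, then fill the missing age with the column mean.
df = df.drop_duplicates().reset_index(drop=True)
df["age"] = df["age"].fillna(df["age"].mean())

# Summarize: average age per city.
mean_age_by_city = df.groupby("city")["age"].mean()

# Export: serialize the cleaned table back to CSV text.
cleaned_csv = df.to_csv(index=False)

print(mean_age_by_city)
```

Filling with the mean is only one possible strategy; depending on the data, dropping the rows or filling with a median may be a better fit.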

2. Installing pandas

pip install pandas
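
A quick way to check that the installation worked is to import the package and print its version. This is just a sanity-check sketch; the variable name is arbitrary.

```python
import pandas as pd

# If this import succeeds without an ImportError, pandas is installed.
installed_version = pd.__version__
print("pandas version:", installed_version)
```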

 

3. Using pandas

1. Function usage (input/output)
Pickling

read_pickle(path[, compression]) Load pickled pandas object (or any object) from file.
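
As an illustration of read_pickle, the sketch below writes a small DataFrame with DataFrame.to_pickle and loads it back. The file name and sample data are hypothetical. Note that pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.75, 1.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "frame.pkl")  # hypothetical file name
    df.to_pickle(path)               # write the DataFrame to disk
    restored = pd.read_pickle(path)  # load it back

# True when values, index, columns, and dtypes all survived the round trip.
print(restored.equals(df))
```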


Flat File

read_table(filepath_or_buffer[, sep, …]) Read general delimited file into DataFrame. (Marked deprecated in pandas 0.24; the deprecation was withdrawn in 0.25.)
read_csv(filepath_or_buffer[, sep, …]) Read a comma-separated values (csv) file into DataFrame.
read_fwf(filepath_or_buffer[, colspecs, …]) Read a table of fixed-width formatted lines into DataFrame.
read_msgpack(path_or_buf[, encoding, iterator]) Load msgpack pandas object from the specified file path. (Removed in pandas 1.0.)
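
The flat-file readers can be tried without touching the disk by wrapping text in io.StringIO. The sketch below parses the same hypothetical table three ways: as CSV, as tab-delimited text via read_csv with sep="\t", and as fixed-width text via read_fwf. The column positions passed to colspecs are assumptions that match this sample only.

```python
import io

import pandas as pd

csv_text = "name,qty\napple,3\npear,5\n"
tsv_text = "name\tqty\napple\t3\npear\t5\n"
# Fixed-width layout: name in characters 0-4, qty in characters 6-8.
fwf_text = (
    "name  qty\n"
    "apple   3\n"
    "pear    5\n"
)

from_csv = pd.read_csv(io.StringIO(csv_text))
from_tsv = pd.read_csv(io.StringIO(tsv_text), sep="\t")
from_fwf = pd.read_fwf(io.StringIO(fwf_text), colspecs=[(0, 5), (6, 9)])

print(from_csv)
print(from_tsv)
print(from_fwf)
```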

Clipboard

read_clipboard([sep]) Read text from clipboard and pass to read_csv.

Excel

read_excel(io[, sheet_name, header, names, …]) Read an Excel file into a pandas DataFrame.
ExcelFile.parse([sheet_name, header, names, …]) Parse specified sheet(s) into a DataFrame.
ExcelWriter(path[, engine, date_format, …]) Class for writing DataFrame objects into Excel sheets. The default engine was xlwt for xls and openpyxl for xlsx; xlwt support was removed in pandas 2.0.

JSON

read_json([path_or_buf, orient, typ, dtype, …]) Convert a JSON string to pandas object.
json_normalize(data[, record_path, meta, …]) Normalize semi-structured JSON data into a flat table.
build_table_schema(data[, index, …]) Create a Table schema from data.
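
The three JSON helpers can be sketched as follows. The sample records are hypothetical, and the example assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize. The JSON text is wrapped in io.StringIO rather than passed as a bare string.

```python
import io

import pandas as pd
from pandas.io.json import build_table_schema

# read_json: a JSON array of flat records becomes one row per record.
records_json = '[{"id": 1, "name": "Ann"}, {"id": 2, "name": "Ben"}]'
flat = pd.read_json(io.StringIO(records_json))

# json_normalize: nested dictionaries are flattened into dotted column names.
nested = [
    {"id": 1, "user": {"name": "Ann", "city": "Taipei"}},
    {"id": 2, "user": {"name": "Ben", "city": "Tainan"}},
]
normalized = pd.json_normalize(nested)

# build_table_schema: describe the columns as a Table Schema dictionary.
schema = build_table_schema(flat, index=False)
field_names = [field["name"] for field in schema["fields"]]

print(flat)
print(normalized)
print(field_names)
```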

HTML

read_html(io[, match, flavor, header, …]) Read HTML tables into a list of DataFrame objects.


HDFStore: PyTables (HDF5)

read_hdf(path_or_buf[, key, mode]) Read from the store, close it if we opened it.
HDFStore.put(key, value[, format, append]) Store object in HDFStore.
HDFStore.append(key, value[, format, …]) Append to Table in file.
HDFStore.get(key) Retrieve pandas object stored in file.
HDFStore.select(key[, where, start, stop, …]) Retrieve pandas object stored in file, optionally based on where criteria.
HDFStore.info() Print detailed information on the store.
HDFStore.keys() Return a (potentially unordered) list of the keys corresponding to the objects stored in the HDFStore.
HDFStore.groups() Return a list of all the top-level nodes (that are not themselves a pandas storage object).
HDFStore.walk([where]) Walk the pytables group hierarchy for pandas objects.

Feather

read_feather(path[, columns, use_threads]) Load a feather-format object from the file path.

Parquet

read_parquet(path[, engine, columns]) Load a parquet object from the file path, returning a DataFrame.

SAS

read_sas(filepath_or_buffer[, format, …]) Read SAS files stored as either XPORT or SAS7BDAT format files.

SQL

read_sql_table(table_name, con[, schema, …]) Read SQL database table into a DataFrame.
read_sql_query(sql, con[, index_col, …]) Read SQL query into a DataFrame.
read_sql(sql, con[, index_col, …]) Read SQL query or database table into a DataFrame.
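
A minimal sketch of the SQL readers, using an in-memory SQLite database from the standard library. The table name and sample rows are hypothetical. read_sql_table is left out here because it needs an SQLAlchemy connection rather than a plain sqlite3 connection.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")

# Create a small table to query (hypothetical data).
fruit = pd.DataFrame({"name": ["apple", "pear", "plum"], "qty": [3, 5, 8]})
fruit.to_sql("fruit", conn, index=False)

# read_sql_query: run a parameterized SELECT and get a DataFrame back.
big = pd.read_sql_query(
    "SELECT name, qty FROM fruit WHERE qty > ? ORDER BY qty",
    conn,
    params=(4,),
)

# read_sql: with a SQL string it behaves like read_sql_query.
everything = pd.read_sql("SELECT * FROM fruit", conn)

conn.close()

print(big)
print(everything)
```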

Google BigQuery

read_gbq(query[, project_id, index_col, …]) Load data from Google BigQuery. (Deprecated since pandas 2.2 in favor of pandas_gbq.read_gbq.)

STATA

read_stata(filepath_or_buffer[, …]) Read Stata file into DataFrame.
StataReader.data(**kwargs) (DEPRECATED) Reads observations from Stata file, converting them into a DataFrame.
StataReader.data_label() Returns the data label of the Stata file.
StataReader.value_labels() Returns a dict that associates each variable name with a dict, which in turn associates each value with its corresponding label.
StataReader.variable_labels() Returns variable labels as a dict, associating each variable name with its corresponding label.
StataWriter.write_file()
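
The Stata functions can be exercised with a round trip: write a DataFrame with DataFrame.to_stata, read it back with read_stata, and inspect the variable labels through the StataReader returned by read_stata(..., iterator=True). The file name, columns, and label text below are hypothetical.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"age": [30, 41], "score": [90.5, 82.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "survey.dta")  # hypothetical file name

    # Write the file, attaching a label to one variable.
    df.to_stata(path, write_index=False, variable_labels={"score": "Exam score"})

    # Read the whole file into a DataFrame.
    loaded = pd.read_stata(path)

    # With iterator=True, read_stata returns a StataReader that exposes metadata.
    with pd.read_stata(path, iterator=True) as reader:
        labels = reader.variable_labels()

print(loaded)
print(labels)
```

Integer columns may come back with a narrower integer dtype than the one written, so compare values rather than dtypes after a round trip.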
 

————————————————
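Copyright notice: This is an original article by CSDN blogger 「一個處女座的程序猿」, licensed under the CC 4.0 BY-SA license. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/qq_41185868/article/details/79781561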

