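I. Importing and Exporting Data

(1) Reading CSV files

1. Reading from a local file

import pandas as pd
#Use the path where your data file is saved
#(P.S. in a Python path string, use either / or \\ as the separator)
df = pd.read_csv('E:\\tips.csv')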
#Output:
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
5         25.29  4.71    Male     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
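The note about path separators above can be checked directly. The following is a minimal sketch rather than part of the original post: it compares the escaped-backslash spelling with a raw string, and it reads a small temporary file (a hypothetical stand-in for tips.csv) so that it runs on machines without an E: drive.

```python
import os
import tempfile

import pandas as pd

# An escaped backslash and a raw string spell the same Windows path.
escaped = 'E:\\tips.csv'
raw = r'E:\tips.csv'
same_path = (escaped == raw)

# Hypothetical stand-in for tips.csv, written to a temporary directory
# so the example does not depend on an E: drive.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, 'tips_sample.csv')
with open(csv_path, 'w', encoding='utf-8') as f:
    f.write('total_bill,tip,sex\n16.99,1.01,Female\n10.34,1.66,Male\n')

# Forward slashes are accepted as well, including on Windows.
df_sample = pd.read_csv(csv_path.replace('\\', '/'))
print(same_path)
print(df_sample.shape)
```

A single unescaped backslash is the spelling to avoid, because sequences such as \t are interpreted as escape characters inside an ordinary string.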
2. Reading from a URL

import pandas as pd
data_url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv" #URL to read from
df = pd.read_csv(data_url)
#The output is the same as above, so it is not repeated here to save space
3. read_csv in detail

Purpose: Read CSV (comma-separated) file into DataFrame
read_csv(filepath_or_buffer, sep=',', dialect=None, compression='infer', doublequote=True, escapechar=None, quotechar='"', quoting=0, skipinitialspace=False, lineterminator=None, header='infer', index_col=None, names=None, prefix=None, skiprows=None, skipfooter=None, skip_footer=0, na_values=None, true_values=None, false_values=None, delimiter=None, converters=None, dtype=None, usecols=None, engine=None, delim_whitespace=False, as_recarray=False, na_filter=True, compact_ints=False, use_unsigned=False, low_memory=True, buffer_lines=None, warn_bad_lines=True, error_bad_lines=True, keep_default_na=True, thousands=None, comment=None, decimal='.', parse_dates=False, keep_date_col=False, dayfirst=False, date_parser=None, memory_map=False, float_precision=None, nrows=None, iterator=False, chunksize=None, verbose=False, encoding=None, squeeze=False, mangle_dupe_cols=True, tupleize_cols=False, infer_datetime_format=False, skip_blank_lines=True)
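A few of the parameters in this signature are used far more often than the rest. The sketch below is an illustration under stated assumptions, not an excerpt from the pandas documentation: the sample data is made up, and an in-memory io.StringIO buffer stands in for a file so that the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical sample data: ';' is the separator and 'NA' marks a missing value.
csv_text = (
    "total_bill;tip;sex;size\n"
    "16.99;1.01;Female;2\n"
    "10.34;NA;Male;3\n"
    "21.01;3.50;Male;3\n"
)

df_params = pd.read_csv(
    io.StringIO(csv_text),                  # filepath_or_buffer accepts file-like objects
    sep=';',                                # field separator
    usecols=['total_bill', 'tip', 'size'],  # keep only these columns
    dtype={'size': 'int64'},                # force a column type
    na_values=['NA'],                       # strings to treat as missing
    nrows=2,                                # read only the first two data rows
)
print(df_params)

# With chunksize, read_csv returns an iterator of DataFrames instead of one DataFrame.
chunk_lengths = [
    len(chunk)
    for chunk in pd.read_csv(io.StringIO(csv_text), sep=';', chunksize=2)
]
print(chunk_lengths)
```

Reading in chunks is mainly useful for files that are too large to fit in memory at once.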
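Parameter details: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

Note that the signature above comes from an older pandas release. Some of the parameters it lists (for example as_recarray, compact_ints, use_unsigned and skip_footer) have since been removed, so check the documentation for the version you have installed.

(2) Reading MySQL data

Suppose the database is installed locally, the username is myusername, the password is mypassword, and we want to read data from the mydb database.

import pandas as pd
import MySQLdb  #on Python 3 this module is provided by the mysqlclient package
mysql_cn = MySQLdb.connect(host='localhost', port=3306, user='myusername', passwd='mypassword', db='mydb')
df = pd.read_sql('select * from test;', con=mysql_cn)
mysql_cn.close()  #close the connection once the data has been read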
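The code above reads all the data in the test table into df, which is a DataFrame.

P.S. MySQL tutorial: http://www.runoob.com/mysql/mysql-tutorial.html

(3) Reading Excel files

Reading Excel files also requires the xlrd module, which can be installed with pip install xlrd. (xlrd 2.0 and later read only .xls files; for .xlsx files, install openpyxl instead.)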
df = pd.read_excel('E:\\tips.xls')
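(4) Exporting data to a CSV file

#index=False leaves the row labels (the index) out of the exported file
#If the data contains Chinese characters, encoding is usually set to 'utf-8'
df.to_csv('E:\\demo.csv', encoding='utf-8', index=False)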
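To see what index=False and encoding='utf-8' do in practice, a small round trip can help. This is a minimal sketch with made-up data, and it writes to a temporary directory (an assumption made for portability) rather than to E:\\demo.csv.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data containing Chinese text.
df_out = pd.DataFrame({'name': ['张三', '李四'], 'tip': [1.5, 2.0]})

out_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
df_out.to_csv(out_path, encoding='utf-8', index=False)

# Read the raw file back: the header line holds only the column names,
# with no extra leading column for the index.
with open(out_path, encoding='utf-8') as f:
    header = f.readline().strip()

# Reading with the same encoding recovers the original text.
df_back = pd.read_csv(out_path, encoding='utf-8')
print(header)
print(df_back.equals(df_out))
```

Without index=False, the file would gain an unnamed first column holding 0, 1, ..., which then shows up as an extra column when the file is read back.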
(5) Reading and writing SQL databases

import pandas as pd
import sqlite3
con = sqlite3.connect('...')
sql = '...'
df = pd.read_sql(sql, con)

#help text
help(sqlite3.connect)
#Output
Help on built-in function connect in module _sqlite3:

connect(...)
    connect(database[, timeout, isolation_level, detect_types, factory])

    Opens a connection to the SQLite database file *database*. You can use
    ":memory:" to open a database connection to a database that resides in
    RAM instead of on disk.

#############
help(pd.read_sql)
#Output
Help on function read_sql in module pandas.io.sql:

read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)
    Read SQL query or database table into a DataFrame.
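The snippet above only covers reading. As a rough sketch of both directions, the example below writes a DataFrame with DataFrame.to_sql and reads part of it back with pd.read_sql. The table name tips_demo and the data are hypothetical, and an in-memory SQLite database is used so that nothing is written to disk.

```python
import sqlite3

import pandas as pd

# ":memory:" opens a database in RAM, as described in the help text above.
con = sqlite3.connect(':memory:')

# Hypothetical table name and data.
df_src = pd.DataFrame({'day': ['Sun', 'Sat', 'Sun'], 'tip': [1.5, 2.0, 3.5]})
df_src.to_sql('tips_demo', con, index=False)  # write: creates the table and inserts the rows

# Read: the '?' placeholder is filled in from params.
sql = 'select day, tip from tips_demo where day = ?'
df_sun = pd.read_sql(sql, con, params=('Sun',))
con.close()

print(df_sun)
```

Passing values through params, instead of formatting them into the SQL string, leaves the quoting to the database driver.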
Reposted from: https://www.cnblogs.com/zzhzhao/p/5269217.html#undefined