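Reading CSV files with Python
Python's standard library includes a module for reading and writing CSV files; a plain import csv is all you need. The module makes working with CSV files easy, and some basic usage follows.
1. Reading a file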
import csv
with open('data.file', encoding='utf-8', newline='') as f:
    csv_reader = csv.reader(f)
    for row in csv_reader:
        print(row)  # each row is a list of strings
For example, for a file with four rows of four comma-separated values, the output is:
['0.093700', '0.139771', '0.062774', '0.007698']
['-0.022711', '-0.050504', '-0.035691', '-0.065434']
['-0.090407', '0.021198', '0.208712', '0.102752']
['-0.085235', '0.009540', '-0.013228', '0.094063']
As you can see, csv_reader turns each row into a list, and every element of that list is a string.
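Because every field comes back as a string, numeric data has to be converted explicitly. The following is a minimal sketch of one way to do that; the sample text and the helper name read_floats are assumptions made for illustration, and io.StringIO stands in for a real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical sample standing in for data.file (two of the rows shown above).
sample = (
    "0.093700,0.139771,0.062774,0.007698\n"
    "-0.022711,-0.050504,-0.035691,-0.065434\n"
)

def read_floats(f):
    """Parse every CSV field as a float and return a list of rows."""
    return [[float(field) for field in row] for row in csv.reader(f)]

rows = read_floats(io.StringIO(sample))
print(rows[0])
```

With a real file, pass the object returned by open('data.file', encoding='utf-8', newline='') instead of the StringIO buffer.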
2. Writing a file
When reading, we load the CSV file into lists; when writing, the elements of a list are written out to the CSV file.
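outfile = 'outfile.csv'  # any output path
row = ['1', '2', '3', '4']
with open(outfile, 'w') as out:
    csv_writer = csv.writer(out)
    csv_writer.writerow(row)
A problem you may run into: on Windows, a file written this way has an extra blank line after every row, because text mode expands the \n of the \r\n that the csv module writes, giving \r\r\n.
The fix is as follows:
with open(outfile, 'w', newline='') as out:
    csv_writer = csv.writer(out, dialect='excel')
    csv_writer.writerow(row)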
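To see where the extra blank line comes from, the sketch below writes the same row through two in-memory text layers and compares the raw bytes. It is only an illustration: the helper written_bytes is a made-up name, and newline='\r\n' is used to imitate what a default text-mode file does on Windows, so the effect can be reproduced on any platform.

```python
import csv
import io

def written_bytes(newline):
    """Write one row through a text layer with the given newline setting
    and return the raw bytes that would end up in the file."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding='utf-8', newline=newline)
    csv.writer(text).writerow(['1', '2', '3', '4'])
    text.flush()
    data = raw.getvalue()
    return data

# Imitates open(path, 'w') on Windows: every '\n' written becomes '\r\n'.
windows_default = written_bytes('\r\n')
# Equivalent to open(path, 'w', newline=''): nothing is translated.
fixed = written_bytes('')

print(windows_default)
print(fixed)
```

The first result ends in \r\r\n, which spreadsheet programs and many editors show as a blank line; the second ends in the single \r\n that the csv module intended.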
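Reference:
I found a fairly classic explanation on Stack Overflow: Python 3 draws a strict line between str and bytes, unlike Python 2, where some functions let you mix the two. So when you call writerow in Python 3, do not open the file in wb mode; open it in w mode and pass newline=''.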
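The str/bytes point can be checked without touching the disk. In this small sketch, io.BytesIO stands in for a file opened in wb mode and io.StringIO for one opened in w mode with newline=''; both stand-ins are assumptions made so the snippet is self-contained.

```python
import csv
import io

# Stand-in for a file opened with 'wb': it only accepts bytes.
binary_target = io.BytesIO()
try:
    csv.writer(binary_target).writerow(['1', '2'])
    wb_error = None
except TypeError as exc:
    # In Python 3 the csv module writes str, which a binary file rejects.
    wb_error = exc

# Stand-in for open(path, 'w', newline=''): a text target accepts str.
text_target = io.StringIO(newline='')
csv.writer(text_target).writerow(['1', '2'])

print(type(wb_error).__name__)
print(repr(text_target.getvalue()))
```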
In Python 2.X, it was required to open the csvfile with 'b' because the csv module does its own line termination handling. In Python 3.X, the csv module still does its own line termination handling, but still needs to know an encoding for Unicode strings. The correct way to open a csv file for writing is:
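For example, the class below opens the file with 'w' and newline='', writes the dictionary keys as the header row, and then writes one row of values:
import csv


class writer():
    def __init__(self):
        # Column headers (title, link, service, dsr, shop name, price,
        # number of buyers, ships from); each value defaults to its header.
        self.dict = {
            "標題": "標題",
            "鏈接": "鏈接",
            "服務": "服務",
            "dsr": "dsr",
            "店鋪名": "店鋪名",
            "價格": "價格",
            "付款人數": "付款人數",
            "發貨地": "發貨地"
        }
        # utf-8-sig adds a BOM so that Excel detects the encoding
        self.out = open("outfile.csv", 'w', newline='', encoding='utf-8-sig')
        self.csv_writer = csv.writer(self.out, dialect='excel')
        self.csv_writer.writerow(self.dict)  # iterating a dict yields its keys

    def writer_to(self, key_value):
        self.csv_writer.writerow(key_value)


if __name__ == '__main__':
    a = writer()
    new = {"鏈接": "http://www.baidu.com", '標題': '我是標題'}
    a.dict.update(new)
    print(a.dict)
    a.writer_to(a.dict.values())
    a.out.close()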
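When rows are already dictionaries, the standard library's csv.DictWriter can take care of the header and the column order. This is an alternative technique, not what the class above does; the English field names and the helper rows_to_csv are hypothetical, and an in-memory buffer replaces the output file.

```python
import csv
import io

# Hypothetical English field names standing in for the column headers above.
FIELDNAMES = ['title', 'link', 'price']

def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text, header first."""
    out = io.StringIO(newline='')
    # restval fills in any key a row does not provide.
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, restval='')
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

csv_text = rows_to_csv([{'title': 'a title', 'link': 'http://www.baidu.com'}])
print(csv_text)
```

Compared with writing dict.values() by hand, the column order here comes from FIELDNAMES, so it does not depend on the order in which keys were inserted or updated.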
A longer example: a Selenium script that looks up the category of each item ID on dianchacha.com and saves the results to a CSV file.
import csv
import sys
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

# Shared holders; only index 0 of each list is used
driver = ['1', '2']
colspan = ['1', '2']
url = ['0', '1']

try:
    # Output file ("category.csv"); utf-8-sig so that Excel detects the encoding
    out = open('類目.csv', 'w', newline='', encoding='utf-8-sig')
except PermissionError:
    print('The file is in use by another program')
    input('')
    sys.exit(1)  # without the file there is nothing to write to
csv_writer = csv.writer(out, dialect='excel')
csv_writer.writerow(['寶貝ID', '類目'])  # header row: item ID, category


def open_chrome():
    driver[0] = webdriver.Chrome()
    driver[0].get('https://www.dianchacha.com')
    input('Log in, then press Enter: ')


def EC_located(one_group, value):
    '''
    Shortens the explicit-wait code.
    :param one_group: "one" to wait for a single element, anything else for all matches
    :param value: the CSS selector to look for
    :return: the element(s) found, or None on timeout
    '''
    wait = WebDriverWait(driver[0], 10)
    if one_group == "one":
        try:
            return wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, value)))
        except TimeoutException:
            print(value, 'element did not load, timed out')
    else:
        try:
            return wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, value)))
        except TimeoutException:
            print(value, 'element group did not load, timed out')


def operating(ID):
    # Open the item's info page
    driver[0].get('https://www.dianchacha.com/item/info/index/iid/' + ID)
    html = driver[0].page_source
    if '未能找到親的寶貝' not in html:  # the site's "item not found" message
        colspans = EC_located('group', '.colspan-1')
        # Strip the "item category: " label from the cell text
        colspan[0] = str(colspans[1].text).replace('寶貝類目: ', '')
    else:
        # Retry; an ID that is never found keeps recursing
        return operating(ID)
    print(colspan)


def writer_txt():
    csv_writer.writerow([url[0], colspan[0]])
    print('Saved', url[0], colspan[0])


def main():
    open_chrome()
    file = '寶貝ID.txt'  # input file with one item ID per line
    with open(file) as f:
        for line in f:
            url[0] = line.strip()  # drop the trailing newline
            print(url[0])
            operating(url[0])
            writer_txt()
    out.close()
    print('Done')


if __name__ == '__main__':
    main()
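The CSV part of a script like this can be tried without a browser by passing the lookup in as a function. The sketch below is one possible arrangement under that assumption: save_categories, the English column names, and the dictionary used as a fake lookup are all hypothetical and only stand in for the Selenium call.

```python
import csv
import io

def save_categories(id_lines, lookup, out):
    """Write one (item ID, category) row per non-empty input line.

    Returns the number of data rows written.
    """
    writer = csv.writer(out, dialect='excel')
    writer.writerow(['item_id', 'category'])
    count = 0
    for line in id_lines:
        item_id = line.strip()  # remove the trailing newline
        if not item_id:
            continue            # skip blank lines in the ID file
        writer.writerow([item_id, lookup(item_id)])
        count += 1
    return count

# Hypothetical stand-in for the browser lookup.
fake_lookup = {'1001': 'Shoes', '1002': 'Bags'}.get

buffer = io.StringIO(newline='')
saved = save_categories(['1001\n', '\n', '1002\n'], fake_lookup, buffer)
print(saved)
print(buffer.getvalue())
```

In real use, out would be a file opened with 'w' and newline='' inside a with block, so that it is closed even if a lookup raises an exception.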
