
Processing Excel data with Python

Reposted article. 2022-03-16 20:28. Category: utility libraries

Reading Excel spreadsheets with xlrd

Installation:

pip install xlrd

Documentation:

https://xlrd.readthedocs.io/

Overview:

xlrd is a library for developers to extract data from Microsoft Excel (tm) spreadsheet files. Note that xlrd 2.0 and later reads only the legacy .xls format; for .xlsx files use a library such as openpyxl.

Quick start with xlrd

import xlrd

book = xlrd.open_workbook("myfile.xls")
print("The number of worksheets is {0}".format(book.nsheets))
print("Worksheet name(s): {0}".format(book.sheet_names()))

# Get the first worksheet
sh = book.sheet_by_index(0)

# Number of worksheets
print(book.nsheets)

# Name, total rows and total columns of the current worksheet
print("{0} {1} {2}".format(sh.name, sh.nrows, sh.ncols))

# Value of cell D30 (indices are zero-based: row 29, column 3)
print("Cell D30 is {0}".format(sh.cell_value(rowx=29, colx=3)))

# Every row as a list of Cell objects (type and value)
for rx in range(sh.nrows):
    print(sh.row(rx))
# e.g. [text:'Camille Richardson', text:'2316 EVIAN CT', empty:'', empty:'', text:'DISTRICT HEIGHTS', text:'MD', text:'20747-1153', text:'US']

# Every row as a list of plain values
for rx in range(sh.nrows):
    print(sh.row_values(rx))
# e.g. ['Camille Richardson', '2316 EVIAN CT', '', '', 'DISTRICT HEIGHTS', 'MD', '20747-1153', 'US']
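The quick start reads cell D30 with rowx=29 and colx=3 because xlrd indices are zero-based. As a minimal sketch of that mapping, the helper below converts an A1-style reference to a zero-based (row, column) pair. The function name a1_to_index is hypothetical and not part of xlrd.

```python
import re


def a1_to_index(ref):
    """Convert an A1-style reference such as 'D30' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    # Column letters form a base-26 number with digits A=1 .. Z=26
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


row, col = a1_to_index("D30")
print(row, col)  # 29 3
```

The resulting pair can be passed to calls such as sh.cell_value(rowx=row, colx=col).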

Common methods:

Sheet name, row count, column count

Sheet name: table.name
Number of rows: table.nrows
Number of columns: table.ncols

Getting sheets

All sheet names: book.sheet_names()
Number of sheets: book.nsheets
All sheet objects: book.sheets()
Look up by sheet name: book.sheet_by_name("demo")
Look up by index: book.sheet_by_index(0)

Sheet summary data:

Sheet name: sheet1.name
Total rows: sheet1.nrows
Total columns: sheet1.ncols

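Reading cells in bulk:

Row operations:

sheet1.row_values(0)  # values of the first row; in a merged range only the first cell holds the value, the others are empty
sheet1.row(0)         # Cell objects (type and value) of the first row
sheet1.row_types(0)   # cell types of the first row

Row and column slices

sheet1.row_values(0, 6, 10)  # row index 0, column indices 6 to 9 (10 is excluded)
sheet1.col_values(0, 0, 5)   # column index 0, row indices 0 to 4 (5 is excluded)
sheet1.row_slice(2, 0, 2)    # Cell objects (type and value) of row index 2, column indices 0 to 1
sheet1.row_types(1, 0, 2)    # cell types of row index 1, column indices 0 to 1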

Reading a specific cell:

Getting the cell value:

sheet1.cell_value(1, 2)
sheet1.cell(1, 2).value
sheet1.row(1)[2].value

Getting the cell type:

sheet1.cell(1, 2).ctype
sheet1.cell_type(1, 2)
sheet1.row(1)[2].ctype

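Commonly used xlrd functions

# Open a workbook; formatting_info=True also loads cell formatting (supported for .xls files only)
book = xlrd.open_workbook("地址信息.xls", formatting_info=True)
# Get all sheets in the workbook
book.sheets()
# Open a specific sheet, method 1: by index
sheet = book.sheet_by_index(index)
# Open a specific sheet, method 2: by name
sheet = book.sheet_by_name(name)
# Cell value, method 1
sheet.cell_value(rowx=row, colx=col)
# Cell value, method 2
sheet.cell(row, col).value
# Cell value, method 3
sheet.row(row)[col].value
# Contents of the fourth row as a list
row_4 = sheet.row_values(3)
# Names of all sheets
book.sheet_names()
# Number of sheets
book.nsheets
# Number of rows in the sheet
sheet.nrows
# Number of columns in the sheet
sheet.ncols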

Writing Excel spreadsheets with xlwt

xlwt is a library dedicated to writing Excel files. It is the sibling of xlrd: one reads Excel files, the other writes them. xlwt produces the legacy .xls format.

Installation

pip install xlwt

Official sites:

https://xlwt.readthedocs.io/en/latest/

https://github.com/python-excel/xlwt

Quick start

import xlwt
from datetime import datetime

# Style: font Times New Roman, red, bold; number format #,##0.00
style0 = xlwt.easyxf('font: name Times New Roman, color-index red, bold on',
                     num_format_str='#,##0.00')
# Style: date format D-MMM-YY
style1 = xlwt.easyxf(num_format_str='D-MMM-YY')

# Create a workbook
wb = xlwt.Workbook()
# Create a worksheet
ws = wb.add_sheet('A Test Sheet')

# Write values at (row, column)
ws.write(0, 0, 1234.56, style0)
ws.write(1, 0, datetime.now(), style1)
ws.write(2, 0, 1)
ws.write(2, 1, 1)
ws.write(2, 2, xlwt.Formula("A3+B3"))

# Save the workbook
wb.save('1.xls')

Saving a list of dictionaries to CSV

A CSV file is a plain-text format with comma-separated values. It cannot store styles such as font colour, cell width or height, merged cells, or multiple worksheets, but it can be opened in Excel. The csv module makes it easy to turn data into such tables.

Common csv functions

csv.reader(f)                  reads a CSV file; f is the open file object. The result is an iterator (it implements __iter__() and __next__()).
csv.writer(f)                  writes a CSV file
csv.DictReader(f)              reads rows as dictionaries
csv.DictWriter(f, fieldnames)  writes dictionaries as rows

Export example

from csv import DictWriter

players = [
    {'dailyWinners': 3, 'dailyFreePlayed': 2, 'user': 'Player1', 'bank': 0.06},
    {'dailyWinners': 3, 'dailyFreePlayed': 2, 'user': 'Player2', 'bank': 4.0},
    {'dailyWinners': 1, 'dailyFree': 2, 'user': 'Player3', 'bank': 3.1},
    {'dailyWinners': 3, 'dailyFree': 2, 'user': 'Player4', 'bank': 0.32},
]
field_names = ('dailyWinners', 'dailyFreePlayed', 'dailyFree', 'user', 'bank')

# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('spreadsheet.csv', 'w', encoding='utf-8', newline='') as outfile:
    writer = DictWriter(outfile, field_names)
    writer.writeheader()
    writer.writerows(players)
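The dictionaries in the export example do not all share the same keys. The sketch below writes two such rows to an in-memory buffer instead of a file (an assumption made so that it runs without touching disk) and reads them back with csv.DictReader, to show that DictWriter fills missing keys with an empty field and that every value comes back as a string.

```python
import csv
import io

players = [
    {'dailyWinners': 3, 'dailyFreePlayed': 2, 'user': 'Player1', 'bank': 0.06},
    {'dailyWinners': 1, 'dailyFree': 2, 'user': 'Player3', 'bank': 3.1},
]
field_names = ('dailyWinners', 'dailyFreePlayed', 'dailyFree', 'user', 'bank')

# In-memory stand-in for the file opened with newline=''
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, field_names)
writer.writeheader()
writer.writerows(players)

# Rewind and read the rows back as dictionaries
buffer.seek(0)
rows = list(csv.DictReader(buffer))

print(repr(rows[0]['dailyFree']))  # '' (key was missing in the first dictionary)
print(repr(rows[1]['bank']))       # '3.1' (values are read back as strings)
```

Numeric columns therefore need an explicit conversion such as float(row['bank']) after reading.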

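Number formatting examples:

:.nf formats a number with n digits after the decimal point, where n is an integer

:.2f formats a number with 2 digits after the decimal point

import math

print(math.pi)
# 3.141592653589793
print(f"Pi to two decimal places: {math.pi:.2f}")
# Pi to two decimal places: 3.14

:< left-align

:> right-align

:^ center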
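A short sketch of the alignment specifiers in f-strings, combined with a field width of 8. The variable names are illustrative only.

```python
name = "xlrd"

left = f"[{name:<8}]"        # pad on the right, text starts at the left edge
right = f"[{name:>8}]"       # pad on the left, text ends at the right edge
center = f"[{name:^8}]"      # pad on both sides
number = f"[{3.14159:>8.2f}]"  # alignment and precision can be combined

print(left)    # [xlrd    ]
print(right)   # [    xlrd]
print(center)  # [  xlrd  ]
print(number)  # [    3.14]
```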

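Uploading a file with requests

import requests

url = 'https://example.com/upload'  # placeholder: the original example leaves url undefined
values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
# Open the file in a with block so that it is closed after the upload
with open('file.txt', 'rb') as f:
    r = requests.post(url, files={'upload_file': f}, data=values)

Wrapped in a function

import os
import requests

def upload_file(url, file_path):
    # Fail early if the file does not exist
    if not os.path.exists(file_path):
        raise ValueError(f"file_path: {file_path} does not exist")
    values = {'DB': 'photcat', 'OUT': 'csv', 'SHORT': 'short'}
    with open(file_path, 'rb') as f:
        return requests.post(url, files={'upload_file': f}, data=values)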

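Reading Excel with xlrd

The xlrd module reads the contents of Excel files, and the xlwt module writes Excel files.

Installation

pip install xlrd
pip install xlwt

Using the xlrd module

The examples below use an Excel file named 聯系人.xls ("contacts").

(1) Open the Excel file and list all sheets

import xlrd

# Open the Excel file for reading
data = xlrd.open_workbook('聯系人.xls')
sheet_name = data.sheet_names()  # names of all sheets
print(sheet_name)  # ['銀行2', '銀行3']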

(2) Get a sheet name by index

# Get a sheet name by index
sheet2_name = data.sheet_names()[1]
print(sheet2_name)  # '銀行3'

(3) Get a sheet by index or by name, and read its name, row count and column count

# Get a sheet by index, then read its name, column count and row count
sheet2 = data.sheet_by_index(1)
print('sheet2 name: {}\nsheet2 columns: {}\nsheet2 rows: {}'.format(sheet2.name, sheet2.ncols, sheet2.nrows))
# sheet2 name: 銀行3
# sheet2 columns: 7
# sheet2 rows: 5

# Get a sheet by name
sheet1 = data.sheet_by_name('銀行2')
print('sheet1 name: {}\nsheet1 columns: {}\nsheet1 rows: {}'.format(sheet1.name, sheet1.ncols, sheet1.nrows))
# sheet1 name: 銀行2
# sheet1 columns: 8
# sheet1 rows: 6

(4) Get the values of a whole row and a whole column

# Get the sheet by name, then read a whole row and a whole column
sheet1 = data.sheet_by_name('銀行2')
print(sheet1.row_values(3))
# ['', '張2', '開發', 'IT編碼', 999.0, 133111.0, 41463.0, 'zhang2@164.com']
# The date 2013/7/8 is returned as the float 41463.0
print(sheet1.col_values(3))
# ['', '工作職責', '', 'IT編碼', '網絡維修', '']

(5) Get the content of a specific cell

# Three equivalent ways to read the cell in row 2, column 1
print(sheet1.cell(1, 0).value)    # 機構名稱 (organisation name)
print(sheet1.cell_value(1, 0))    # 機構名稱
print(sheet1.row(1)[0].value)     # 機構名稱

(6) Get the data type of a cell

# Get the data type of a cell's content
print(sheet1.cell(1, 0).ctype)  # row 2, column 1 (機構名稱) is a string
print(sheet1.cell(3, 4).ctype)  # row 4, column 5 (999) is a number
print(sheet1.cell(3, 6).ctype)  # row 4, column 7 (2013/7/8) is a date
# Note: ctype values are 0 empty, 1 string, 2 number, 3 date, 4 boolean, 5 error
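The ctype codes listed above can drive a small clean-up step. The sketch below works on plain (ctype, value) pairs so that it runs without an Excel file; with xlrd the pairs would come from cell.ctype and cell.value. The helper name normalise_cell and the chosen conversions are assumptions for illustration, not part of xlrd.

```python
# Numeric codes as listed in the note above
CTYPE_NAMES = {0: "empty", 1: "string", 2: "number", 3: "date", 4: "boolean", 5: "error"}


def normalise_cell(ctype, value):
    """Turn a (ctype, value) pair into a friendlier Python value."""
    if ctype == 0:
        return None                      # empty cell
    if ctype == 2 and float(value).is_integer():
        return int(value)                # 999.0 -> 999
    if ctype == 4:
        return bool(value)               # booleans are stored as 0 or 1
    return value                         # strings, fractions, dates, errors unchanged


row = [(0, ''), (1, '張2'), (2, 999.0), (2, 0.5), (4, 1)]
cleaned = [normalise_cell(ctype, value) for ctype, value in row]
print(cleaned)  # [None, '張2', 999, 0.5, True]
```

Date cells (ctype 3) are left untouched here because they need the workbook's datemode to be converted.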

(7) Reading a cell that contains a date

Use xlrd's xldate_as_tuple to convert the value to a date

from datetime import datetime, date

if sheet1.cell(3, 6).ctype == 3:
    print(sheet1.cell(3, 6).value)  # 41463.0
    date_value = xlrd.xldate_as_tuple(sheet1.cell(3, 6).value, data.datemode)
    print(date_value)  # (2013, 7, 8, 0, 0, 0)
    print(date(*date_value[:3]))  # 2013-07-08
    print(date(*date_value[:3]).strftime('%Y/%m/%d'))  # 2013/07/08
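To see where a float such as 41463.0 comes from, the sketch below redoes the conversion with the standard library only. It assumes the usual Excel conventions: serial numbers count days from 1899-12-30 in the 1900 date system (datemode 0) and from 1904-01-01 in the 1904 system (datemode 1). The function name is hypothetical, and the shortcut is only valid for dates from March 1900 onward; xlrd's own xldate_as_tuple remains the safer choice.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial, datemode=0):
    """Convert an Excel serial date to a datetime (dates from March 1900 onward)."""
    # The 1899-12-30 epoch absorbs Excel's fictitious 1900-02-29
    epoch = datetime(1899, 12, 30) if datemode == 0 else datetime(1904, 1, 1)
    return epoch + timedelta(days=serial)


print(excel_serial_to_datetime(41463.0))  # 2013-07-08 00:00:00
print(excel_serial_to_datetime(41463.5))  # 2013-07-08 12:00:00
```

The fractional part of the serial number is the time of day, which is why 41463.5 lands on noon.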
                
 
             
 
            
          

(8) Reading a numeric cell (converting to int)

if sheet1.cell(3, 5).ctype == 2:
    print(sheet1.cell(3, 5).value)  # 133111.0
    num_value = int(sheet1.cell(3, 5).value)
    print(num_value)  # 133111

(9) Reading merged cells

This requires the merged_cells attribute

# Pass formatting_info=True when opening the file (the default is False);
# otherwise merged_cells may come back empty.
data = xlrd.open_workbook('聯系人.xls', formatting_info=True)
sheet1 = data.sheet_by_name('銀行2')
print(sheet1.merged_cells)  # [(0, 1, 0, 8), (2, 6, 0, 1)]
# Each tuple is (row, row_range, col, col_range). The range [row, row_range) includes row
# but not row_range; the same holds for columns. Indices start at 0.
# (0, 1, 0, 8) means columns 1 to 8 of row 1 are merged; (2, 6, 0, 1) means rows 3 to 6 of column 1 are merged.
# Read the content of the two merged ranges:
print(sheet1.cell(0, 0).value)  # 銀行2
print(sheet1.cell_value(2, 0))  # 銀行2

Rule: read the cell at the low row index and low column index of each merged_cells entry.

The following is more convenient

merge_value = []
for (row, row_range, col, col_range) in sheet1.merged_cells:
    merge_value.append((row, col))
print(merge_value)  # [(0, 0), (2, 0)]
for v in merge_value:
    print(sheet1.cell(v[0], v[1]).value)
# 銀行2
# 銀行2
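The rule above finds the value of each merged range. A related need is to look up the value for any cell inside a range, since only the top-left cell holds it. The sketch below builds such a lookup from the half-open tuples; it uses the merged_cells list printed above as literal data so that it runs without a workbook, and the function name is hypothetical.

```python
def merged_anchor_lookup(merged_cells):
    """Map every (row, col) inside a merged range to its top-left cell."""
    anchor = {}
    for row_lo, row_hi, col_lo, col_hi in merged_cells:
        # Both ranges are half-open: the upper bounds are excluded
        for r in range(row_lo, row_hi):
            for c in range(col_lo, col_hi):
                anchor[(r, c)] = (row_lo, col_lo)
    return anchor


merged = [(0, 1, 0, 8), (2, 6, 0, 1)]
anchor = merged_anchor_lookup(merged)

print(anchor[(0, 5)])       # (0, 0)
print(anchor[(4, 0)])       # (2, 0)
print(anchor.get((1, 0)))   # None: this cell is not part of a merged range
```

With a real sheet, the value of a merged cell could then be read as sheet1.cell_value(*anchor.get((r, c), (r, c))).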

The xlwt module

import xlwt
from datetime import datetime, date

def set_style(name, height, bold=False, format_str=''):
    style = xlwt.XFStyle()  # initialise the style
    font = xlwt.Font()  # create a font for the style
    font.name = name  # e.g. 'Times New Roman'
    font.bold = bold
    font.height = height
    borders = xlwt.Borders()  # create borders for the style
    borders.left = 6
    borders.right = 6
    borders.top = 6
    borders.bottom = 6
    style.font = font
    style.borders = borders
    style.num_format_str = format_str
    return style

wb = xlwt.Workbook()
ws = wb.add_sheet('A Test Sheet')  # add a sheet
ws.col(0).width = 200 * 30  # set the width of the first column
ws.write(0, 0, 1234.56, set_style('Times New Roman', 220, bold=True, format_str='#,##0.00'))
ws.write(1, 0, datetime.now(), set_style('Times New Roman', 220, bold=False, format_str='DD-MM-YYYY'))

styleOK = xlwt.easyxf('pattern: fore_colour light_blue;'
                      'font: colour green, bold True;')
pattern = xlwt.Pattern()  # a Pattern (cell fill) instance
pattern.pattern = xlwt.Pattern.SOLID_PATTERN  # solid fill
pattern.pattern_fore_colour = xlwt.Style.colour_map['red']  # background colour
styleOK.pattern = pattern

ws.write(2, 0, 1, style=styleOK)
ws.write(2, 1, 1)
ws.write(2, 2, xlwt.Formula("A3+B3"))
wb.save('example.xls')  # save as .xls
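The numbers passed to ws.col(0).width and font.height are not pixels. As far as I know, xlwt measures column width in 1/256 of the width of the character "0" in the default font, and font height in 1/20 of a point. The helpers below are a hypothetical convenience for converting to and from those units; they are plain arithmetic and do not call xlwt.

```python
def chars_to_col_width(chars):
    """Column width units for roughly `chars` characters (1 unit = 1/256 character)."""
    return int(chars * 256)


def points_to_font_height(points):
    """Font height units for a size in points (1 unit = 1/20 point)."""
    return int(points * 20)


# What the values used in the example above correspond to
print(200 * 30 / 256)  # 23.4375 -> the first column is about 23 characters wide
print(220 / 20)        # 11.0    -> the font is 11 pt

# Going the other way
print(chars_to_col_width(20))     # 5120
print(points_to_font_height(11))  # 220
```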

Contact sheet example

import xlwt

def set_style(name, height, bold=False, format_str='', align='center'):
    style = xlwt.XFStyle()  # initialise the style
    font = xlwt.Font()  # create a font for the style
    font.name = name  # e.g. 'Times New Roman'
    font.bold = bold
    font.height = height
    borders = xlwt.Borders()  # create borders for the style
    borders.left = 2
    borders.right = 2
    borders.top = 0
    borders.bottom = 2
    alignment = xlwt.Alignment()  # set the alignment
    if align == 'center':
        alignment.horz = xlwt.Alignment.HORZ_CENTER
        alignment.vert = xlwt.Alignment.VERT_CENTER
    else:
        alignment.horz = xlwt.Alignment.HORZ_LEFT
        alignment.vert = xlwt.Alignment.VERT_BOTTOM
    style.font = font
    style.borders = borders
    style.num_format_str = format_str
    style.alignment = alignment
    return style

wb = xlwt.Workbook()
ws = wb.add_sheet('聯系人', cell_overwrite_ok=True)  # add a sheet named "contacts"

# Headers: organisation, name, department, phone, start date, mobile, email
rows = ['機構名稱', '姓名', '部門', '電話', '入職日期', '手機', '郵箱']
col1 = ['王1', '王2', '王3']
col2 = ['666', '777', '888']
col3 = ['2014-08-09', '2014-08-11', '2015-08-09']

# Row 1: title, merged across columns 1 to 7
ws.write_merge(0, 0, 0, 6, '聯系人表',
               set_style('Times New Roman', 320, bold=True, format_str=''))

# Header style: yellow fill, borders, bold font
styleOK = xlwt.easyxf()
pattern = xlwt.Pattern()  # a Pattern (cell fill) instance
pattern.pattern = xlwt.Pattern.SOLID_PATTERN  # solid fill
pattern.pattern_fore_colour = xlwt.Style.colour_map['yellow']  # background colour
borders = xlwt.Borders()  # create borders for the style
borders.left = 2
borders.right = 2
borders.top = 6
borders.bottom = 2
font = xlwt.Font()  # create a font for the style
font.name = 'Times New Roman'
font.bold = True
font.height = 220
styleOK.pattern = pattern
styleOK.borders = borders
styleOK.font = font

# Row 2: column headers
for index, val in enumerate(rows):
    ws.col(index).width = 150 * 30  # column width
    ws.write(1, index, val, style=styleOK)

# Column 1, rows 3 to 5: one merged cell
ws.write_merge(2, 2 + len(col1) - 1, 0, 0, 'x機構',
               set_style('Times New Roman', 320, bold=True, format_str=''))

# Column 2 (names), starting at row 3
for index, val in enumerate(col1):
    ws.col(1).width = 150 * 30  # column width
    ws.write(index + 2, 1, val,
             style=set_style('Times New Roman', 200, bold=False, format_str='', align=''))

# Column 4 (phone), starting at row 3
for index, val in enumerate(col2):
    ws.col(3).width = 150 * 30  # column width
    ws.write(index + 2, 3, val,
             style=set_style('Times New Roman', 200, bold=False, format_str='', align=''))

# Column 5 (start date), starting at row 3
for index, val in enumerate(col3):
    ws.col(4).width = 150 * 30  # column width
    ws.write(index + 2, 4, val,
             style=set_style('Times New Roman', 200, bold=False, format_str='', align=''))

ws.write(4, 2, '技術部', style=styleOK)
ws.write(4, 5, '186777233', style=styleOK)
ws.write(4, 6, 'wang@166.com', style=styleOK)
wb.save('test.xls')  # save as .xls

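[Python] Working with data in Excel

Automated testing often involves reading data from spreadsheets or storing data in them. Besides openpyxl, pandas can also do this. These notes are a brief record of what I learned along the way; other features remain to be explored.

I. Installing pandas

  1. Installing pandas is simple. pandas relies on a separate module to read Excel files: xlrd for .xls and openpyxl for .xlsx (xlrd 2.0 and later no longer reads .xlsx). Install the one you need first, for example: pip install xlrd

  2. Then install pandas: pip install pandas

II. Reading an Excel file

The examples below read a workbook named webservice_testcase.xlsx.

1. First import the module

import pandas as pd

2. Read the data in a sheet:

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx')  # reads the first sheet of the workbook by default
data = sheet.head()  # first 5 rows by default
print("Values read:\n{0}".format(data))  # formatted output

3. A sheet can also be selected by name

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx', sheet_name='userRegister')
data = sheet.head()  # first 5 rows by default
print("Values read:\n{0}".format(data))  # formatted output

4. A sheet can be selected by index (0 is the first sheet), names and indices can be mixed, and several sheets can be selected at once. When a list is passed, read_excel returns a dict of DataFrames keyed by the list entries. The variants are listed below

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx', sheet_name=['sendMCode', 'userRegister'])  # several sheets by name
# sheet = pd.read_excel('test_data\\webservice_testcase.xlsx', sheet_name=0)  # one sheet by index
# sheet = pd.read_excel('test_data\\webservice_testcase.xlsx', sheet_name=['sendMCode', 1])  # names and indices mixed
# sheet = pd.read_excel('test_data\\webservice_testcase.xlsx', sheet_name=[1, 2])  # several sheets by index
data = sheet['sendMCode'].values  # sheet is a dict here, so pick a DataFrame by key first; head() cannot be called on the dict itself
print("Values read:\n{0}".format(data))  # formatted output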

二、操作Excel中的行列

1.讀取制定的某一行數據:

sheet=pd.read_excel('webservice_testcase.xlsx')#這個會直接默認讀取到這個Excel的第一個表單
data=sheet.ix[0].values#0表示第一行 這里讀取數據並不包含表頭
print("讀取指定行的數據：\n{0}".format(data))

得到了如下結果：

2.讀取指定的多行：

sheet=pd.read_excel('test_data\\webservice_testcase.xlsx')#這個會直接默認讀取到這個Excel的第一個表單
data=sheet.ix[[0,1]].values#0表示第一行 這里讀取數據並不包含表頭
print("讀取指定行的數據：\n{0}".format(data))

得到了如下的結果：

3.讀取指定行列的數據：

sheet=pd.read_excel('test_data\\webservice_testcase.xlsx')#這個會直接默認讀取到這個Excel的第一個表單
data=sheet.ix[0,1]#讀取第一行第二列的值
print("讀取指定行的數據：\n{0}".format(data))

得到了如下結果：

4.讀取指定的多行多列的值：

sheet=pd.read_excel('test_data\\webservice_testcase.xlsx')
data=sheet.ix[[1,2],['method','description']].values#讀取第二行第三行的method以及description列的值，這里需要嵌套列表
print("讀取指定行的數據：\n{0}".format(data))

得到了如下的結果：

5. Read the values of specific columns for all rows:

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx')
# method and description columns of every row; the column selector must be a list
data = sheet.loc[:, ['method', 'description']].values
print("Values read:\n{0}".format(data))
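The row and column selections in steps 1 to 5 can be tried without a workbook. The sketch below builds a small DataFrame in memory (only the column names come from the article; the rows are invented) and contrasts position-based `iloc` with label-based `loc`.

```python
import pandas as pd

# Invented rows standing in for the worksheet; only the column names come from the article.
sheet = pd.DataFrame({
    'id': [1, 2, 3],
    'method': ['sendMCode', 'sendMCode', 'userRegister'],
    'description': ['normal flow', 'empty phone', 'normal flow'],
})

first_row = sheet.iloc[0].values                       # one row, selected by position
cell = sheet.iloc[0, 1]                                # row 0, column 1, both by position
block = sheet.loc[[1, 2], ['method', 'description']]   # rows by index label, columns by name
columns = sheet.loc[:, ['method', 'description']]      # every row, selected columns

print(cell)
print(block.values)
```

With the default integer index that `read_excel` produces, row labels and row positions coincide, so `loc[1]` and `iloc[1]` pick the same row. They diverge once the DataFrame is filtered or re-indexed.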

6. Print the row index:

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx')
print("Row index list", sheet.index.values)

7. Print the column names:

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx')
print("Column headers", sheet.columns.values)

8. Get a given number of rows (sample() picks them at random):

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx')
print("Values", sheet.sample(2).values)
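Because `sample()` draws rows at random, two runs usually return different rows. A minimal sketch, using invented data, of how the `random_state` argument makes the draw repeatable:

```python
import pandas as pd

sheet = pd.DataFrame({'id': range(1, 11)})  # invented data for illustration

# The same random_state yields the same rows on every call.
a = sheet.sample(2, random_state=0)
b = sheet.sample(2, random_state=0)

print(a.values)
print(a.equals(b))
```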

9. Get the values of a specific column:

sheet = pd.read_excel('test_data\\webservice_testcase.xlsx')
print("Values", sheet['description'].values)

III. Converting each row of the Excel data to a dictionary and collecting the dictionaries in a list

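import pandas as pd

file_name = 'test_data\\webservice_testcase.xlsx'  # workbook used in the earlier examples
key = 'sendMCode'  # name of the sheet to read
test_data = []
sheet = pd.read_excel(file_name, sheet_name=key)
for i in sheet.index.values:  # iterate over the row index
    # select the listed columns of row i and convert them to a dict with to_dict()
    row_data = sheet.loc[i, ['id', 'method', 'description', 'url', 'param', 'ExpectedResult']].to_dict()
    test_data.append(row_data)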
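The row-by-row loop can also be written as a single call. The sketch below is an alternative, not the article's method: it uses `DataFrame.to_dict(orient='records')` on an invented DataFrame with a shortened column list to produce the same list-of-dictionaries shape.

```python
import pandas as pd

# Shortened column list and invented rows, for illustration only.
columns = ['id', 'method', 'description']
sheet = pd.DataFrame({
    'id': [1, 2],
    'method': ['sendMCode', 'sendMCode'],
    'description': ['normal flow', 'empty phone'],
    'extra': ['x', 'y'],  # a column that should not end up in the test data
})

# Select the wanted columns, then turn every row into one dict.
test_data = sheet[columns].to_dict(orient='records')
print(test_data)
```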

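In addition, settings for the test cases can be kept in a configuration file, so that the data is read according to what the file specifies.

The configuration file is as follows: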

 
[CASECONFIG]
sheet_list={'sendMCode':'all',
             'userRegister':'all',
             'verifyUserAuth':'all',
             'bindBankCard':[]
             }

The config-handling .py file (imported later as common.read_config) contains the following code:

import configparser
class ReadConfig:
    def read_config(self,file_path,section,option):
        cf=configparser.ConfigParser()
        cf.read(file_path,encoding="utf-8")
        value=cf.get(section,option)
        return value
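`cf.get` returns the option as a string, so the `sheet_list` value still has to be turned into a dict. A minimal sketch of that step, assuming the value is a plain Python literal: it writes a sample config (built from values that appear in this article) to a temporary file and parses the option with `ast.literal_eval`, which accepts literals only and does not execute code.

```python
import ast
import configparser
import os
import tempfile

# Sample config; continuation lines must be indented for configparser to join them.
config_text = """[CASECONFIG]
sheet_list={'sendMCode':[1,3,5],
             'userRegister':'all'
             }
"""

# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.config', delete=False, encoding='utf-8') as f:
    f.write(config_text)
    path = f.name

cf = configparser.ConfigParser()
cf.read(path, encoding='utf-8')
raw = cf.get('CASECONFIG', 'sheet_list')  # the option value comes back as a string
sheet_list = ast.literal_eval(raw)        # parse the dict literal without running code
os.remove(path)

print(sheet_list)
```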

The code in project_path.py is as follows:

import os
# Project root: the directory two levels above this file
Project_path=os.path.split(os.path.split(os.path.realpath(__file__))[0])[0]
# Path to the configuration file
case_config_path=os.path.join(Project_path,'config','case.config')
# Path to the test cases
test_cases_path=os.path.join(Project_path,'test_data','webservice_testcase.xlsx')
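The nested `os.path.split` calls step up two directory levels from the file. The sketch below shows an equivalent formulation with `pathlib`, offered as an alternative rather than the article's code. `project_root` is a hypothetical helper, and the location of project_path.py is assumed to be a `common` folder under the current directory.

```python
import os
from pathlib import Path


def project_root(file_path):
    """Return the directory two levels above file_path (the parent of its folder)."""
    return Path(file_path).resolve().parent.parent


# Hypothetical location of project_path.py, used only for illustration.
here = os.path.join(os.getcwd(), 'common', 'project_path.py')

root = project_root(here)
case_config_path = root / 'config' / 'case.config'
test_cases_path = root / 'test_data' / 'webservice_testcase.xlsx'

# The os.path form used in project_path.py resolves to the same directory.
legacy_root = os.path.split(os.path.split(os.path.realpath(here))[0])[0]
print(root == Path(legacy_root))
```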

Next, the Excel-reading logic is wrapped in a class. Sample code:

import ast

import pandas as pd

from common import project_path
from common.read_config import ReadConfig as RC


class DoExcel:
    def __init__(self, file_name):
        self.file_name = file_name
        # The option is a dict literal stored as text; literal_eval parses it without running code
        self.sheet_list = ast.literal_eval(RC().read_config(project_path.case_config_path, 'CASECONFIG', 'sheet_list'))

    def do_excel(self):
        test_data = []
        for key in self.sheet_list:
            if self.sheet_list[key] == 'all':  # read every test case in the sheet
                sheet = pd.read_excel(self.file_name, sheet_name=key)
                for i in sheet.index.values:  # iterate over the row index
                    # select the listed columns of row i and convert them to a dict with to_dict()
                    row_data = sheet.loc[i, ['id', 'method', 'description', 'url', 'param', 'ExpectedResult']].to_dict()
                    test_data.append(row_data)
            else:
                sheet = pd.read_excel(self.file_name, sheet_name=key)
                for i in self.sheet_list[key]:  # read only the test cases whose numbers are in the list
                    # test case numbers start at 1, row labels at 0
                    row_data = sheet.loc[i - 1, ['id', 'method', 'description', 'url', 'param', 'ExpectedResult']].to_dict()
                    test_data.append(row_data)
        return test_data


if __name__ == '__main__':
    test_data = DoExcel(project_path.test_cases_path).do_excel()
    print(test_data)

If the configuration is changed to the following:

[CASECONFIG]
sheet_list={'sendMCode':[1,3,5],
             'userRegister':[],
             'verifyUserAuth':[],
             'bindBankCard':[]
             }

running the code produces this result:

[{'id': 1, 'method': 'sendMCode', 'description': '正常流程', 'url': 'http://120.24.235.105:9010/sms-service-war-1.0/ws/smsFacade.ws?wsdl', 'param': '{"client_ip":"172.16.168.202","tmpl_id":1,"mobile":"${tel}"}', 'ExpectedResult': '(result){\n   retCode = "0"\n   retInfo = "ok"\n }'}, 
{'id': 3, 'method': 'sendMCode', 'description': '手機號為空', 'url': 'http://120.24.235.105:9010/sms-service-war-1.0/ws/smsFacade.ws?wsdl', 'param': '{"client_ip":"172.16.168.202","tmpl_id":1,"mobile":""}', 'ExpectedResult': "Server raised fault: '手機號碼錯誤'"}, 
{'id': 5, 'method': 'sendMCode', 'description': 'ip地址為空', 'url': 'http://120.24.235.105:9010/sms-service-war-1.0/ws/smsFacade.ws?wsdl', 'param': '{"client_ip":"","tmpl_id":1,"mobile":"${tel}"}', 'ExpectedResult': "Server raised fault: '用戶IP不能為空'"}]

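At this point, reading the test-case data from Excel into the form [{key1:value1},{key2:value2},...,{keyn:valuen}] is complete. Much still remains to be done, such as substituting parameters in the test cases and writing the results back to the corresponding cells in Excel after a test run.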
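The parameter substitution mentioned above is not implemented in this article. One possible approach, sketched here under the assumption that placeholders always take the `${name}` form seen in the sample output, is `string.Template` from the standard library. `fill_params` is a hypothetical helper and the phone number is made up.

```python
from string import Template

# 'param' value copied from the sample output above.
param = '{"client_ip":"172.16.168.202","tmpl_id":1,"mobile":"${tel}"}'


def fill_params(text, values):
    """Replace ${name} placeholders, leaving unknown ones untouched."""
    return Template(text).safe_substitute(values)


filled = fill_params(param, {'tel': '13800000000'})  # made-up phone number
print(filled)
```

`safe_substitute` is used instead of `substitute` so that a test case with a placeholder that has no value yet does not raise an exception.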

Python: extracting hyperlinks and text from Word/docx

import zipfile

from bs4 import BeautifulSoup


def get_linked_text(soup):
    links = []

    # This kind of link has a corresponding URL in the _rel file.
    for tag in soup.find_all("hyperlink"):
        # try/except because some hyperlinks have no id.
        try:
            links.append({"id": tag["r:id"], "text": tag.text})
        except KeyError:
            pass

    # This kind does not.
    for tag in soup.find_all("instrText"):
        # They're identified by the word HYPERLINK
        if "HYPERLINK" in tag.text:
            # Get the URL. Probably.
            url = tag.text.split('"')[1]

            # The actual linked text is stored in nearby tags.
            # Loop through the siblings starting here.
            temp = tag.parent.next_sibling
            text = ""
            while temp is not None:
                # Text comes in <t> tags.
                maybe_text = temp.find("t")
                if maybe_text is not None:
                    # Ones that have text in them.
                    if maybe_text.text.strip() != "":
                        text += maybe_text.text.strip()

                # Links end with <w:fldChar w:fldCharType="end" />.
                # find() takes a tag name, so the attribute is checked separately.
                maybe_end = temp.find("fldChar")
                if maybe_end is not None and maybe_end.get("w:fldCharType") == "end":
                    break

                temp = temp.next_sibling

            links.append({"id": None, "href": url, "text": text})

    return links


if __name__ == '__main__':
    file_name = "xx.docx"
    archive = zipfile.ZipFile(file_name, "r")
    file_data = archive.read("word/document.xml")
    doc_soup = BeautifulSoup(file_data, "xml")
    linked_text = get_linked_text(doc_soup)
    print(linked_text)
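For the first kind of link, the code above collects only the relationship id. The URL itself sits in the relationships part of the archive, `word/_rels/document.xml.rels`. The sketch below shows one way the ids could be mapped to URLs with the standard library. `read_hyperlink_targets` is a hypothetical helper, and the archive is built in memory with one invented relationship so that the example runs without a real .docx file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace of the package relationships part.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def read_hyperlink_targets(archive):
    """Map relationship ids (such as 'rId4') to hyperlink URLs."""
    root = ET.fromstring(archive.read("word/_rels/document.xml.rels"))
    return {
        rel.get("Id"): rel.get("Target")
        for rel in root.iter(REL_NS + "Relationship")
        if rel.get("Type", "").endswith("/hyperlink")
    }


# Minimal in-memory archive with one invented relationship, for illustration.
rels_xml = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId4" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" '
    'Target="https://example.com/" TargetMode="External"/>'
    '</Relationships>'
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("word/_rels/document.xml.rels", rels_xml)

with zipfile.ZipFile(buffer) as archive:
    targets = read_hyperlink_targets(archive)
print(targets)
```

Each `id` returned by `get_linked_text` could then be looked up in this mapping to fill in the missing `href`.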

Reading images from a Word file and converting them to links

import re
from os.path import basename

from docx import Document


def upload_image(image_data):
    # Placeholder: upload image_data somewhere and return the resulting link
    image_url = "image link"
    return image_url


file_name = "/Users/在文檔頂部.docx"
doc = Document(file_name)
a = list()
pattern = re.compile(r'rId\d+')
for graph in doc.paragraphs:
    b = list()
    for run in graph.runs:
        if run.text != '':
            b.append(run.text)
        else:
            # A run without text may reference an image through a relationship id
            match = pattern.search(run.element.xml)
            if match is None:
                continue
            content_id = match.group(0)
            try:
                content_type = doc.part.related_parts[content_id].content_type
            except KeyError as e:
                print(e)
                continue
            if not content_type.startswith('image'):
                continue
            img_name = basename(doc.part.related_parts[content_id].partname)  # file name of the image part (not used below)
            img_data = doc.part.related_parts[content_id].blob
            b.append(upload_image(img_data))
    b_str = ''.join(b)
    a.append(b_str)
print(a)

Python re module: removing brackets and their contents

Full-width brackets

Text: 寄令狐綯（一本無綯字）相公

import re

text = "寄令狐綯（一本無綯字）相公"
text = re.sub("\\（.*?）|\\{.*?}|\\[.*?]|\\【.*?】", "", text)
print(text)
# Output: 寄令狐綯相公

Half-width brackets

import re

text = "寄令狐綯(一本無綯字)相公"
text = re.sub("\\(.*?\\)|\\{.*?}|\\[.*?]", "", text)
print(text)
# Output: 寄令狐綯相公
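The two patterns above can be folded into one helper. The sketch below builds a single pattern from a list of bracket pairs and lets `re.escape` handle the escaping. `PAIRS` and `strip_brackets` are names chosen for illustration.

```python
import re

# Bracket pairs taken from the two patterns above.
PAIRS = [("（", "）"), ("(", ")"), ("{", "}"), ("[", "]"), ("【", "】")]

# One alternative per pair: opening bracket, shortest possible content, closing bracket.
PATTERN = re.compile("|".join(
    re.escape(left) + ".*?" + re.escape(right) for left, right in PAIRS
))


def strip_brackets(text):
    """Remove each bracketed span, together with its brackets."""
    return PATTERN.sub("", text)


print(strip_brackets("寄令狐綯（一本無綯字）相公"))
print(strip_brackets("寄令狐綯(一本無綯字)相公"))
```

Like the original patterns, this matches non-greedily and does not handle nested brackets: for nested input, the match ends at the first closing bracket.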
