Reading CSV files with Python's csv module often fails on files saved in an unexpected encoding. The error looks like this:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 0: invalid start byte
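A minimal sketch of how this kind of error arises, assuming the file was actually saved as GBK: the same bytes decode cleanly as GBK but are rejected by the strict UTF-8 decoder. The sample text and the helper name try_decode are made up for illustration.

```python
# Sample bytes standing in for the start of a GBK-encoded CSV file (an assumption).
raw = "编码,值\r\n".encode("gbk")


def try_decode(data, encoding):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as err:
        return err


result_utf8 = try_decode(raw, "utf-8")  # strict UTF-8 rejects GBK bytes
result_gbk = try_decode(raw, "gbk")     # the matching codec decodes them

print(repr(result_utf8))
print(repr(result_gbk))
```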
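So I wrote the small helper below, which tries a list of encodings in turn until the CSV file decodes; more encodings can be appended to the list as they turn up: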
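import csv


def read_csv(filename):
    # Strict codecs go first: a GBK-family codec can "successfully" decode
    # UTF-8 bytes into garbage, so trying gbk first may hide the real encoding.
    # utf-8-sig also strips a leading BOM so it does not end up in the header.
    encodings = ['utf-8-sig', 'utf-8', 'gbk', 'GB2312', 'gb18030']
    for e in encodings:
        data = []
        try:
            # newline='' is what the csv docs recommend for files passed to csv.reader
            with open(filename, encoding=e, newline='') as f:
                reader = csv.reader(f)
                header = next(reader, None)  # skip the header row, if there is one
                # print(header)
                for row in reader:
                    data.append(row)
            print(filename, e)  # the encoding that worked
            return data
        except UnicodeDecodeError:
            print(filename, e, 'failed')  # wrong encoding, try the next one
    print(filename, "==================")  # none of the encodings matched
    return False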