[Python] Extracting the string that follows a specified string in a file (part 2)


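An improvement on the previous version: https://www.cnblogs.com/FiaFia/p/9361225.html

This way the required fields can be pulled out of each line of the file one at a time and written to a CSV file, with values that are not present left empty.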
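As a minimal sketch of the extraction step, the snippet below runs the same kind of pattern on a single line. The sample line and its values are made up for illustration; the real .tip format may differ.

```python
import re

# Hypothetical sample line; the tags mirror the ones used in the script,
# but the values are invented for this example.
line = "BDt;i309;Ex43;SYmDEMO;NAmDemo Corp;"

# Each pattern captures the text between ";TAG" and the next ";".
tradable_id = re.findall(r";i(.+?);", line)
symbol = re.findall(r";SYm(.+?);", line)
isin = re.findall(r";ISn(.+?);", line)  # this tag is absent from the sample

print(tradable_id)  # ['309']
print(symbol)       # ['DEMO']
print(isin)         # []
```

Note that re.findall always returns a list, and that the list is empty when the tag does not occur in the line.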

 

The problem that is still unsolved lies in this line: newsitem = [tradableid[0],exchange[0],symbol[0],name[0],isin[0],currency,instrumentSubType,country,securityType[0],lotSize[0]]

Fields without [0] show up in the CSV as a list such as ['309'], while fields with [0] show up as the bare value, which looks nicer.
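The difference can be reproduced in isolation. This is a small sketch that writes to an in-memory buffer instead of a file; the variable names are only for illustration. csv.writer converts non-string cells with str(), which is why a list ends up in the file with its brackets and quotes.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')

matches = ['309']  # what re.findall returns for a single hit

# First cell: the whole list. Second cell: its first element.
writer.writerow([matches, matches[0]])

csv_text = buffer.getvalue()
print(csv_text)  # ['309'],309
```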

But if every field is given [0], any line that lacks one of these values raises 'IndexError: list index out of range'.

Next I need to see whether a lambda function can be added that checks whether the result is empty and, if so, sets it to ''.
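One possible way to do this is sketched below. The helper name first_or_empty and the sample line are assumptions made for this example, not part of the script that follows. A plain function is used here; the equivalent lambda is shown in a comment.

```python
import re


def first_or_empty(matches):
    """Return the first match, or '' when the list of matches is empty."""
    return matches[0] if matches else ''

# Equivalent lambda form:
# first_or_empty = lambda matches: matches[0] if matches else ''

# Hypothetical sample line that has a symbol but no ISIN.
line = "BDt;i309;SYmDEMO;"

symbol = first_or_empty(re.findall(r";SYm(.+?);", line))
isin = first_or_empty(re.findall(r";ISn(.+?);", line))

print(repr(symbol))  # 'DEMO'
print(repr(isin))    # ''
```

Wrapping every re.findall call this way would let all ten fields be written as plain values without risking an IndexError on lines where a tag is missing.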

import csv
import re


def extract_all(filename):
    result = []

    print('Extracting data----')

    # Read every line up front; the with-block closes the file afterwards.
    with open(filename, 'r', encoding='utf8', errors='ignore') as f:
        lines = f.readlines()
    
    for line in lines:
        if "BDt;" in line:
            # Each field is the text between its ";TAG" marker and the next ";".
            # re.findall returns a list, which is empty when the tag is absent.
            tradableid = re.findall(r";i(.+?);", line)
            exchange = re.findall(r";Ex(.+?);", line)
            symbol = re.findall(r";SYm(.+?);", line)
            name = re.findall(r";NAm(.+?);", line)
            isin = re.findall(r";ISn(.+?);", line)
            currency = re.findall(r";CUt(.+?);", line)
            instrumentSubType = re.findall(r";INt(.+?);", line)
            country = re.findall(r";CNy(.+?);", line)
            securityType = re.findall(r";STy(.+?);", line)
            lotSize = re.findall(r";LSz(.+?);", line)

            # Fields indexed with [0] are written as plain values but raise
            # IndexError when the tag is missing; the unindexed ones are
            # written as lists, e.g. ['309'].
            newsitem = [tradableid[0], exchange[0], symbol[0], name[0], isin[0],
                        currency, instrumentSubType, country,
                        securityType[0], lotSize[0]]
            result.append(newsitem)
        
    
    return result


def writetocsv(newsitems, reportfile):
    
    print('Start writing to csv')
    
    if newsitems:
        with open(reportfile, mode='w', encoding='utf8', errors='ignore') as csvfile:
            writer = csv.writer(csvfile, lineterminator='\n')
            # Header row, in the same order as the fields in each newsitem.
            writer.writerow(['TradableID', 'Exchange', 'symbol', 'Name', 'isin', 'currency', 'instrumentSubType', 'country', 'securityType', 'lotSize'])

            # Write all extracted rows in one call.
            writer.writerows(newsitems)
   
    print('csv written done')
    return


if __name__ == '__main__':
    filename = 'XSTO_Stock_BasicData0725'
    data_file = filename  +'.tip'
    reportcsv = 'BDt_' + filename + '.csv'
    
    newsitemsum = extract_all(data_file)
    writetocsv(newsitemsum, reportcsv)

 

