Python: extracting rows of interest from a large CSV file


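I was helping a friend process a CSV file of 2.x GB. A file that size should not be read into memory in one go. Instead, iterate over the file object returned by open, which yields one line at a time.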

with open(filename, 'r') as file:
    # read line by line
    for line in file:
        process(line)  # placeholder for whatever per-line handling you need

Or, more simply (note that this form does not close the file explicitly):

for line in open('myfile.txt','r'):
     pass
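
As a small illustration of the line-by-line pattern above, here is a minimal sketch that counts the lines of a file while reading it incrementally. The function name count_lines and the encoding parameter are assumptions made for this example, not part of the original code.

```python
def count_lines(path, encoding="utf-8"):
    """Count lines by iterating over the file object instead of calling read()."""
    count = 0
    with open(path, "r", encoding=encoding) as f:
        # The file object is its own iterator and yields one line per step,
        # so the whole file is never loaded into memory at once.
        for _ in f:
            count += 1
    return count
```

The same loop shape works for any per-line task: replace the counter with whatever check or transformation is needed.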

The requirement: extract the rows whose timestamp falls within a given time window and save them to a separate file.

The full code is as follows:

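def is_between_time(line, start, end):
    """
    Check whether the time of day in a data line lies strictly between start and end.

    :param line: a line in data file :  8684496663,粵BC5948,2016-01-01 22:01:56,114.083448,22.531582,225,0,0,0,114075022530,114070022530,114.078316,22.534267,1463910,2016-01-01 22:25:59.772000
    :param start: start point for example: 21:57:00
    :param end: end point for example: 22:03:00
    :return: True if start < time < end, otherwise False
    """
    fields = line.split(',')
    timestamp = fields[2]                  # e.g. "2016-01-01 22:01:56"
    time_of_day = timestamp.split(' ')[1]  # e.g. "22:01:56"
    # Zero-padded HH:MM:SS strings sort chronologically, so string comparison is enough
    return start < time_of_day < end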


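file_to_read_path = "E:/P_CZCGPS_20160101.csv"
file_to_write_path = "E:/result.csv"

# read the input line by line and copy matching lines to the output file;
# the with statement closes both files even if an error occurs
with open(file_to_read_path, 'r') as file, open(file_to_write_path, 'w') as file_to_write:
    for line in file:
        if is_between_time(line, "21:57:00", "22:03:00"):
            print(line, end='')  # line already ends with a newline
            file_to_write.write(line)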
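
The script above splits each line on commas and compares time strings directly, which works for this data because the fields contain no quoted commas and the times are zero-padded. As a hedged alternative, here is a sketch that uses the standard csv and datetime modules and skips rows it cannot parse (for example a header line). The names parse_clock and filter_rows_by_time, the time_col parameter, and the utf-8 default encoding are assumptions for illustration. The actual encoding of the data file is not stated in the original.

```python
import csv
from datetime import datetime


def parse_clock(text):
    """Parse an 'HH:MM:SS' string into a datetime.time object."""
    return datetime.strptime(text, "%H:%M:%S").time()


def filter_rows_by_time(src_path, dst_path, start, end, time_col=2, encoding="utf-8"):
    """Copy rows whose time of day lies strictly between start and end.

    Returns the number of rows written.
    """
    start_t, end_t = parse_clock(start), parse_clock(end)
    written = 0
    with open(src_path, "r", newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):  # still reads one row at a time
            try:
                # The column is expected to look like '2016-01-01 22:01:56'
                stamp = datetime.strptime(row[time_col], "%Y-%m-%d %H:%M:%S")
            except (IndexError, ValueError):
                continue  # skip headers, blank lines, and malformed rows
            if start_t < stamp.time() < end_t:
                writer.writerow(row)
                written += 1
    return written
```

With the paths from the script above, a call would look like filter_rows_by_time("E:/P_CZCGPS_20160101.csv", "E:/result.csv", "21:57:00", "22:03:00"). Parsing every timestamp is slower than a plain string comparison, so for a multi-gigabyte file the original approach may still be preferable when the format is known to be clean.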

Happy 1024 (Programmers' Day)!

