Splitting a Large File with Python


The Python code is as follows:

import os

kilobytes = 1024
megabytes = kilobytes * 1024  # 1 MB = 1024 KB
chunksize = int(200 * megabytes)  # default chunk size: 200 MB

def split(fromfile, todir, chunksize=chunksize):
    """Split fromfile into numbered part files in todir; return the part count."""
    if not os.path.exists(todir):  # create the output directory if it is missing
        os.mkdir(todir)
    else:  # otherwise delete the files left over from an earlier run
        for fname in os.listdir(todir):
            os.remove(os.path.join(todir, fname))
    partnum = 0
    with open(fromfile, 'rb') as inputfile:  # read the source file as raw bytes
        while True:
            chunk = inputfile.read(chunksize)
            if not chunk:  # an empty read means end of file
                break
            partnum += 1
            filename = os.path.join(todir, 'data%04d' % partnum)
            with open(filename, 'wb') as fileobj:  # create the part file
                fileobj.write(chunk)  # write this chunk into it
    return partnum

if __name__ == '__main__':
    fromfile = input('File to be split? ')
    todir = input('Directory to store part files? ')
    chunksize = int(input('Chunk size in bytes? '))
    absfrom, absto = map(os.path.abspath, [fromfile, todir])
    print('Splitting', absfrom, 'to', absto, 'by', chunksize)
    try:
        parts = split(fromfile, todir, chunksize)
    except Exception as exc:  # report the failure instead of a raw traceback
        print('Error during split:')
        print(type(exc), exc)
    else:
        print('split finished:', parts, 'parts are in', absto)
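One way to check that a split lost nothing is to put the parts back together and compare the result with the source. The sketch below is not part of the original post: join_parts and its blocksize parameter are hypothetical names, and it assumes the part directory contains only the data0001, data0002, ... files written by split, so that sorting the names gives the original order.

```python
import os

def join_parts(fromdir, tofile, blocksize=1024 * 1024):
    """Concatenate the part files in fromdir, in name order, into tofile."""
    # Zero-padded names (data0001, data0002, ...) sort into the original order.
    parts = sorted(os.listdir(fromdir))
    with open(tofile, 'wb') as output:
        for name in parts:
            with open(os.path.join(fromdir, name), 'rb') as part:
                while True:
                    block = part.read(blocksize)  # copy in small blocks to limit memory use
                    if not block:
                        break
                    output.write(block)
    return len(parts)
```

Comparing the size (or a checksum) of the rejoined file with the source then confirms the round trip. Note that the four-digit numbering keeps the sort order correct only up to 9999 parts.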

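Take data.txt as an example. This file is a dataset of random numbers generated with Python and is 1.1 GB in size. Here it is split into 128 MB sub-files. The split is only approximately equal, because the last part holds whatever remains: 1.1 GB / 128 MB ≈ 8.8, so roughly nine part files are expected, the last one smaller than the rest.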
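The script asks for the chunk size in bytes, so a 128 MB chunk would be entered as 134217728 (128 * 1024 * 1024). The following is a minimal sketch of the part-count arithmetic, run on a scaled-down stand-in (1100 KB split into 128 KB chunks) so that it finishes quickly. The names expected_parts and split_file are hypothetical helpers introduced for illustration; split_file only mirrors the naming scheme of the split function and does not clear an existing directory.

```python
import math
import os
import tempfile

def expected_parts(filesize, chunksize):
    """Number of part files a fixed-size split produces (the last may be short)."""
    return math.ceil(filesize / chunksize)

def split_file(fromfile, todir, chunksize):
    """Hypothetical compact splitter using the same data%04d naming scheme."""
    os.makedirs(todir, exist_ok=True)
    partnum = 0
    with open(fromfile, 'rb') as src:
        while True:
            chunk = src.read(chunksize)
            if not chunk:  # end of file
                break
            partnum += 1
            with open(os.path.join(todir, 'data%04d' % partnum), 'wb') as part:
                part.write(chunk)
    return partnum

if __name__ == '__main__':
    # Assumption: random bytes stand in for the article's data.txt.
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, 'data.txt')
        with open(source, 'wb') as f:
            f.write(os.urandom(1100 * 1024))
        parts_dir = os.path.join(workdir, 'parts')
        count = split_file(source, parts_dir, 128 * 1024)
        sizes = [os.path.getsize(os.path.join(parts_dir, name))
                 for name in sorted(os.listdir(parts_dir))]
        print('parts written:', count)
        print('parts expected:', expected_parts(1100 * 1024, 128 * 1024))
        print('part sizes in bytes:', sizes)
```

Every part except the last should come out at exactly the chunk size, which is why the split is equal only when the file size is a multiple of the chunk size.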

 

