Multi-threaded FTP Downloads in Python: Downloading a File in Blocks with Multiple Threads


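Python's ftplib module handles FTP operations such as downloading and uploading. Downloading a large file from an FTP server with Python is often slow, so how can the download be sped up? This is where multithreading comes in. In this post I share my multi-threaded download code. The main Python modules it needs are ftplib and threading.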

Let's start with the overall approach, outlined below:

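1. Split the file into blocks. For example, if we plan to download one file with 20 threads, the file is treated as binary data and divided evenly into 20 blocks (the last block also takes any remainder), and one thread is started for each block: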

import threading  # at module level

def setupThreads(self, filePath, localFilePath, threadNumber=20):
    """
    Set up the threads that will download the file, one per block.
    Returns the list of unstarted threads on success, else None.
    """
    try:
        # some servers reject SIZE in ASCII mode, so switch to binary first
        self.ftp.voidcmd('TYPE I')
        # the reply looks like "213 <size in bytes>"
        temp = self.ftp.sendcmd('SIZE ' + filePath)
        remoteFileSize = int(temp.split()[1])
        # never use more threads than there are bytes, so no block is empty
        threadNumber = max(1, min(threadNumber, remoteFileSize))
        # integer division; the remainder is given to the last block below
        blockSize = remoteFileSize // threadNumber
        rest = None
        threads = []
        for i in range(threadNumber - 1):
            beginPoint = blockSize * i
            subThread = threading.Thread(target=self.downloadFileMultiThreads,
                                         args=(i, filePath, localFilePath, beginPoint, blockSize, rest))
            threads.append(subThread)

        # the last block runs from its start offset to the end of the file
        beginPoint = blockSize * (threadNumber - 1)
        lastBlockSize = remoteFileSize - beginPoint
        subThread = threading.Thread(target=self.downloadFileMultiThreads,
                                     args=(threadNumber - 1, filePath, localFilePath, beginPoint, lastBlockSize, rest))
        threads.append(subThread)
        return threads
    except Exception as diag:
        self.recordLog(str(diag), 'error')
        return None
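The block arithmetic can also be looked at in isolation. The following is a minimal sketch, not part of the original class: `split_into_blocks` is a hypothetical helper name, and it assumes the same rule as above, namely equal blocks from integer division with the last block absorbing the remainder.

```python
def split_into_blocks(total_size, block_count):
    """Return a list of (begin_offset, size) pairs that cover total_size bytes."""
    if block_count < 1:
        raise ValueError("block_count must be at least 1")
    base = total_size // block_count
    blocks = []
    for i in range(block_count):
        begin = base * i
        # The last block absorbs the bytes left over by the integer division.
        size = total_size - begin if i == block_count - 1 else base
        blocks.append((begin, size))
    return blocks
```

For a 103-byte file and 4 blocks this gives `[(0, 25), (25, 25), (50, 25), (75, 28)]`: three blocks of 25 bytes and a final block of 28 bytes.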

The downloadFileMultiThreads function that each thread runs is as follows:

import os                 # at module level
from ftplib import FTP    # at module level

def downloadFileMultiThreads(self, threadIndex, remoteFilePath, localFilePath,
                             beginPoint, blockSize, rest=None):
    """
    Worker run by one thread: downloads blockSize bytes of the remote file,
    starting at beginPoint, into localFilePath.part.<threadIndex>.
    """
    try:
        # each thread needs its own connection to the FTP server
        myFtp = FTP(self.host, self.user, self.passwd)
        myFtp.cwd(os.path.dirname(remoteFilePath))
        myFtp.voidcmd('TYPE I')    # binary mode

        # transfercmd sends REST itself, after the data connection is set up
        # and right before RETR, so pass the offset instead of sending REST by hand
        if rest is None:
            rest = beginPoint
        beginToDownload = 'RETR ' + os.path.basename(remoteFilePath)
        connection = myFtp.transfercmd(beginToDownload, rest)

        finishedSize = 0
        # temporary local file for this block
        with open(localFilePath + '.part.' + str(threadIndex), 'wb') as fp:
            while True:
                readSize = self.fixBlockSize
                if blockSize > 0:
                    # never read past the end of this block
                    readSize = min(self.fixBlockSize, blockSize - finishedSize)
                data = connection.recv(readSize)
                if not data:
                    break
                fp.write(data)
                finishedSize += len(data)
                if finishedSize == blockSize:
                    break
        connection.close()
        try:
            # closing the data connection early can make the server send an error reply
            myFtp.quit()
        except Exception:
            myFtp.close()
        # a short read means the block is incomplete
        return blockSize <= 0 or finishedSize == blockSize
    except Exception as diag:
        self.recordLog(str(diag), 'error')
        return False
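The threads returned by setupThreads still have to be started and waited for, and `threading.Thread` discards the return value of its target, so the True/False result of each worker is lost. One possible way to keep those results is sketched below. `run_all` is a hypothetical helper, not part of the original code; it assumes each task is a zero-argument callable.

```python
import threading


def run_all(tasks):
    """Run each zero-argument callable in its own thread; return results by index."""
    results = [None] * len(tasks)

    def worker(index, task):
        try:
            results[index] = task()
        except Exception:
            # Treat an uncaught exception in a task as a failed block.
            results[index] = False

    threads = [threading.Thread(target=worker, args=(i, task))
               for i, task in enumerate(tasks)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every block has finished
    return results
```

With the class from this article, each task could be a `functools.partial` wrapping downloadFileMultiThreads with the arguments for one block. The download is complete only if every entry in the returned list is True.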

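2. Once all downloads have finished, the blocks must be merged into a single file. The merge process is covered in part two of this series, "Python之FTP多線程下載文件之分塊多線程文件合並" (merging the blocks of a multi-threaded FTP download).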
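As a rough illustration only (this is not the code from part two), the part files written by the workers, named `<localFilePath>.part.<index>`, could be concatenated in index order. `merge_parts` and its `remove_parts` flag are hypothetical names introduced for this sketch.

```python
import os
import shutil


def merge_parts(local_path, part_count, remove_parts=True):
    """Concatenate local_path.part.0 .. local_path.part.<part_count - 1> into local_path."""
    with open(local_path, "wb") as merged:
        for index in range(part_count):
            part_path = "%s.part.%d" % (local_path, index)
            with open(part_path, "rb") as part:
                # Stream the part into the target without loading it all into memory.
                shutil.copyfileobj(part, merged)
            if remove_parts:
                os.remove(part_path)
    return os.path.getsize(local_path)
```

Comparing the returned size with the size reported by the server is a cheap check that no block is missing or truncated.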

 

Thanks for reading. I hope this helps!


