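When Django serves a file download and the file is small, the simplest approach is to build the entire content in memory and then pass it to the HttpResponse object in one go: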
```python
from django.http import HttpResponse


def simple_file_download(request):
    # do something...
    with open("simplefile", "rb") as f:
        content = f.read()  # the whole file is held in memory
    return HttpResponse(content)
```
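When the file is very large, the simplest solution is to let a static file server such as Apache or Nginx handle the download. Sometimes, however, we need to check the user's permissions, or do not want to expose the file's real location, or the large content is generated on the fly (for example, by merging several files). In those cases a static file server alone is not enough.
The Django documentation mentions that an iterator can be passed to HttpResponse so that data is streamed to the client. Note that since Django 1.5, HttpResponse consumes the iterator immediately, so real streaming requires StreamingHttpResponse.
To write the iterator yourself, you can use yield: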
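```python
from django.http import StreamingHttpResponse


def read_file(filename, buf_size=8192):
    with open(filename, "rb") as f:
        while True:
            content = f.read(buf_size)
            if content:
                yield content
            else:
                break


def big_file_download(request):
    filename = "filename"
    # Since Django 1.5 a plain HttpResponse consumes the iterator at once,
    # so StreamingHttpResponse is used to actually stream the chunks.
    response = StreamingHttpResponse(read_file(filename))
    return response
```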
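To see what such a generator actually hands to the response, the following minimal sketch runs the same chunked-read loop against a temporary file, outside Django. The names `iter_file_chunks` and `demo_chunk_sizes` and the 20000-byte test file are made up for illustration only.

```python
import os
import tempfile


def iter_file_chunks(filename, buf_size=8192):
    """Yield the file's bytes in pieces of at most buf_size bytes."""
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(buf_size)
            if not chunk:
                break
            yield chunk


def demo_chunk_sizes(total=20000, buf_size=8192):
    """Write `total` bytes to a temp file and return the sizes of the chunks read back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"x" * total)
        return [len(chunk) for chunk in iter_file_chunks(path, buf_size)]
    finally:
        os.remove(path)
```

Only one chunk is alive at a time, so memory use stays near `buf_size` regardless of the file size.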
Alternatively, use a generator expression. The following is the Django documentation's example of streaming a large CSV file:
```python
import csv

from django.http import StreamingHttpResponse


class Echo(object):
    """An object that implements just the write method of the file-like
    interface.
    """

    def write(self, value):
        """Write the value by returning it, instead of storing in a buffer."""
        return value


def some_streaming_csv_view(request):
    """A view that streams a large CSV file."""
    # Generate a sequence of rows. The range is based on the maximum number of
    # rows that can be handled by a single sheet in most spreadsheet
    # applications.
    rows = (["Row {0}".format(idx), str(idx)] for idx in range(65536))
    pseudo_buffer = Echo()
    writer = csv.writer(pseudo_buffer)
    response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                     content_type="text/csv")
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response
```
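The trick in this example is that `csv.writer.writerow()` returns whatever the underlying object's `write()` returns, so a `write()` that echoes its argument turns the writer into a row formatter. A small sketch without Django, where `iter_csv_lines` is a hypothetical helper name:

```python
import csv


class Echo:
    """File-like object whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


def iter_csv_lines(rows):
    """Lazily turn an iterable of rows into CSV-formatted strings."""
    writer = csv.writer(Echo())
    # Each row is formatted only when the generator is advanced.
    return (writer.writerow(row) for row in rows)
```

Because nothing is buffered, the rows can come from a database cursor or any other lazy source.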
Python also provides a file wrapper (wsgiref.util.FileWrapper) that wraps a file-like object as an iterator:
```python
class FileWrapper:
    """Wrapper to convert file-like objects to iterables"""

    def __init__(self, filelike, blksize=8192):
        self.filelike = filelike
        self.blksize = blksize
        if hasattr(filelike, 'close'):
            self.close = filelike.close

    def __getitem__(self, key):
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise IndexError

    def __iter__(self):
        return self

    # Named `next` in the Python 2 source; Python 3 iterators use `__next__`.
    def __next__(self):
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise StopIteration
```
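The wrapper's behaviour can be checked with an in-memory file. This is a minimal sketch using the standard library's `wsgiref.util.FileWrapper`; `wrap_bytes` is a hypothetical helper and the tiny block size is chosen only to make the chunking visible.

```python
import io
from wsgiref.util import FileWrapper


def wrap_bytes(data, blksize=8192):
    """Wrap an in-memory file so that iterating it yields blocks of at most blksize bytes."""
    return FileWrapper(io.BytesIO(data), blksize)
```

Iterating `wrap_bytes(b"a" * 10, 4)` yields two 4-byte blocks followed by a 2-byte block.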
Usage:
```python
import os
# django.core.servers.basehttp.FileWrapper is gone from newer Django releases;
# the standard library provides the same class.
from wsgiref.util import FileWrapper

from django.http import StreamingHttpResponse


def file_download(request, filename):
    wrapper = FileWrapper(open(filename, 'rb'))
    # Since Django 1.5 a plain HttpResponse would read the whole iterator up front.
    response = StreamingHttpResponse(wrapper, content_type='application/octet-stream')
    response['Content-Length'] = os.path.getsize(filename)
    response['Content-Disposition'] = 'attachment; filename=%s' % filename
    return response
```
Django also provides the StreamingHttpResponse class, which is used in place of HttpResponse when streaming data.
Downloading files compressed into a zip archive:
```python
import tempfile
import zipfile
from wsgiref.util import FileWrapper

from django.http import StreamingHttpResponse


def send_zipfile(request):
    """
    Create a ZIP file on disk and transmit it in chunks of 8KB,
    without loading the whole file into memory. A similar approach can
    be used for large dynamic PDF files.
    """
    temp = tempfile.TemporaryFile()
    archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
    for index in range(10):
        filename = __file__  # Select your files here.
        archive.write(filename, 'file%d.txt' % index)
    archive.close()
    size = temp.tell()  # the position after closing the archive is its total size
    temp.seek(0)  # rewind before anything starts reading from the file
    wrapper = FileWrapper(temp)
    response = StreamingHttpResponse(wrapper, content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=test.zip'
    response['Content-Length'] = size
    return response
```
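The order of operations matters here: the size has to be read and the file rewound before the chunks are consumed. The following sketch shows the same steps without Django, assuming the archive members are given as a `{name: bytes}` dict; `build_zip_stream` is a hypothetical helper name.

```python
import tempfile
import zipfile
from wsgiref.util import FileWrapper


def build_zip_stream(members, blksize=8192):
    """Return (size, chunk_iterator) for a ZIP archive built from {name: bytes}."""
    temp = tempfile.TemporaryFile()
    with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    size = temp.tell()  # file position after the archive is closed
    temp.seek(0)        # rewind so the wrapper reads from the start
    return size, FileWrapper(temp, blksize)
```

The returned size is what would go into Content-Length, and the iterator is what would be handed to the response.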
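In any case, using Django itself to serve large file downloads is not a good idea. The best approach is to let Django do the permission check and then have the static file server handle the download.
This relies on the sendfile mechanism: "A traditional web server handling a file download always reads the file content into application memory first and then sends that memory to the client browser. Under the heavy load of today's websites this consumes more server resources. sendfile is a high-performance network I/O method supported by modern operating systems: the kernel's sendfile call can push file content directly into the network card's buffer, which spares the web server the cost of reading and writing the file and achieves 'zero copy'."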
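Python exposes the same mechanism through `socket.sendfile()`, which uses the operating system's sendfile call where available and falls back to ordinary sends otherwise. The sketch below is only an illustration over a local socket pair, not how Nginx or Apache are implemented; `send_with_sendfile` is a made-up name, and the payload is assumed to be small enough to fit in the socket buffer because nothing reads until sending has finished.

```python
import socket
import tempfile


def send_with_sendfile(payload):
    """Push `payload` from a temp file through a local socket pair using socket.sendfile()."""
    sender, receiver = socket.socketpair()
    with sender, receiver, tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        f.seek(0)
        sent = sender.sendfile(f)        # file data is not copied through a Python buffer
        sender.shutdown(socket.SHUT_WR)  # signal end of data to the receiver
        received = b""
        while True:
            part = receiver.recv(65536)
            if not part:
                break
            received += part
    return sent, received
```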
On Apache this requires the mod_xsendfile module, while Nginx implements it through a feature called X-Accel-Redirect.
Nginx configuration file:
```nginx
# Will serve /var/www/files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
    internal;
    alias /var/www/files;
}
```
Or:
```nginx
# Will serve /var/www/protected_files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
    internal;
    root /var/www;
}
```
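Note the difference between alias and root: alias replaces the matched location prefix with the given path, whereas root appends the full request URI to the given path.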
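The two mappings can be modelled with a few lines of Python. This is a simplified sketch of prefix locations only, not of Nginx's full path resolution, and both function names are invented for illustration.

```python
def resolve_with_root(root, uri):
    """root: the full request URI is appended to the root directory."""
    return root.rstrip("/") + uri


def resolve_with_alias(location, alias, uri):
    """alias: the matched location prefix is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]
```

With the paths from the two configurations, the same URI `/protected_files/myfile.tar.gz` maps to `/var/www/protected_files/myfile.tar.gz` under root and to `/var/www/files/myfile.tar.gz` under alias.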
In Django:
```python
response['X-Accel-Redirect'] = '/protected_files/%s' % filename
```
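This way, when a request reaches the Django view, Django checks the user's permissions or does whatever else is needed, and then returns a response whose X-Accel-Redirect header points at /protected_files/filename. Nginx picks up that header and serves the file itself, /var/www/protected_files/filename with the root configuration: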
```python
import os

from django.contrib.auth.decorators import login_required
from django.http import HttpResponse

# Book is the application's own model; import it from wherever it is defined.


@login_required
def document_view(request, document_id):
    book = Book.objects.get(id=document_id)
    response = HttpResponse()
    name = book.myBook.name.split('/')[-1]
    response['Content-Type'] = 'application/octet-stream'  # header names use hyphens
    response["Content-Disposition"] = "attachment; filename={0}".format(name)
    response['Content-Length'] = os.path.getsize(book.myBook.path)
    # The path must fall under an "internal" location in the nginx configuration.
    response['X-Accel-Redirect'] = "/protected/{0}".format(book.myBook.name)
    return response
```
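When the file name comes from the request rather than from a database record, it is worth validating it before it is placed in the redirect path. The following framework-free sketch builds the headers as a plain dict, assuming a flat directory behind a `/protected_files/` internal location; the function name and the validation rule are illustrative assumptions, not part of Django or Nginx.

```python
import posixpath


def accel_redirect_headers(filename, internal_prefix="/protected_files/"):
    """Build the headers that ask nginx to serve `filename` from an internal location."""
    name = posixpath.basename(filename)
    # Reject anything that is not a bare file name, such as "../etc/passwd".
    if not name or name in (".", "..") or name != filename:
        raise ValueError("unsafe file name: %r" % filename)
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": 'attachment; filename="%s"' % name,
        "X-Accel-Redirect": internal_prefix + name,
    }
```

In a view, these entries would be copied onto an empty HttpResponse after the permission check has passed.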