Converting HTML pages to PDF in Python 3.6.3

Reposted from the original article. 2018-02-12 16:23, Python

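At work you often need to convert useful HTML content to PDF so that it is easier to store and read. Today we look at a simple module for converting HTML to PDF: pdfkit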

Installing the module

Install the python-pdfkit module:
```
$ pip install pdfkit
```
Install wkhtmltopdf at the operating-system level:

Debian/Ubuntu:
```
$ sudo apt-get install wkhtmltopdf
```

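Note: the wkhtmltopdf package in the Debian/Ubuntu repositories has reduced functionality (because it is compiled without the wkhtmltopdf Qt patches), so features such as adding an outline, headers, footers, or a TOC are unavailable. To use these options, install the static binary from the wkhtmltopdf website, or you can use this script.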

Usage

A simple example:

import pdfkit

pdfkit.from_url('http://google.com', 'out.pdf')  # fetch an online URL and convert it to the local file out.pdf
pdfkit.from_file('test.html', 'out.pdf')         # convert a local HTML file to the local file out.pdf
pdfkit.from_string('Hello!', 'out.pdf')          # convert the given text to the local file out.pdf

You can pass a list of several URLs or files to convert them together:

pdfkit.from_url(['google.com', 'yandex.ru', 'engadget.com'], 'out.pdf')
pdfkit.from_file(['file1.html', 'file2.html'], 'out.pdf')

You can also pass an opened file:

with open('file.html') as f:
    pdfkit.from_file(f, 'out.pdf')

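If you want to process the generated PDF further, you can read it into a variable:

# Use False instead of an output path to get the PDF back in a variable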
pdf = pdfkit.from_url('http://google.com', False)
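
As a minimal sketch of what such further processing might look like: with False as the output path, pdfkit returns the PDF content as bytes, so it can be checked and written out by hand. The helper name `save_pdf_bytes` and the header check are assumptions made for illustration; they are not part of pdfkit.

```python
from pathlib import Path


def save_pdf_bytes(pdf_bytes: bytes, path: str) -> int:
    """Write in-memory PDF data to disk after a basic sanity check.

    Hypothetical helper, not part of pdfkit.
    Returns the number of bytes written.
    """
    # Every PDF file starts with the marker "%PDF-".
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("data does not look like a PDF")
    return Path(path).write_bytes(pdf_bytes)


# Intended use (requires pdfkit and wkhtmltopdf to be installed):
#   pdf = pdfkit.from_url('http://google.com', False)
#   save_pdf_bytes(pdf, 'out.pdf')
```

Checking the header first means an error page or an empty response is rejected instead of being saved under a .pdf name.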

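You can specify any wkhtmltopdf option. You can drop the leading '--' from the option name. If an option takes no value, use None, False or "" as the dict value. For repeatable options (including allow, cookie, custom-header, post, postfile, run-script, replace) you can use a list or a tuple. For options that need multiple values (for example, a custom header carrying authorization information) use a 2-tuple (see the example below).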

options = {
    'page-size': 'Letter',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'custom-header' : [
        ('Accept-Encoding', 'gzip')
    ],
    'cookie': [
        ('cookie-name1', 'cookie-value1'),
        ('cookie-name2', 'cookie-value2'),
    ],
    'no-outline': None
}

pdfkit.from_url('http://google.com', 'out.pdf', options=options)
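
To make the rules above concrete, here is a rough sketch of how such a dict could be flattened into command-line arguments. The function `options_to_args` is hypothetical and only mimics the behaviour described in the text; it is not pdfkit's own implementation.

```python
def options_to_args(options):
    """Flatten an options dict into a wkhtmltopdf-style argument list.

    Illustrative only: follows the rules described in the text and is
    not pdfkit's actual code.
    """
    args = []
    for name, value in options.items():
        # The leading '--' may be omitted in the dict key.
        flag = name if name.startswith("--") else "--" + name
        # Repeatable options carry a list/tuple; emit the flag once per entry.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            args.append(flag)
            if isinstance(item, (list, tuple)):
                # A 2-tuple such as ('Accept-Encoding', 'gzip') becomes two arguments.
                args.extend(str(part) for part in item)
            elif item is not None and item is not False and item != "":
                args.append(str(item))
            # None, False and "" mean "flag without a value": nothing more to add.
    return args
```

For example, `options_to_args({'no-outline': None})` yields a bare `--no-outline` flag, while each cookie tuple produces its own `--cookie name value` triple.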

By default, PDFKit shows all of wkhtmltopdf's output. If you do not want that, pass the quiet option:

options = {
    'quiet': ''
    }

pdfkit.from_url('google.com', 'out.pdf', options=options)

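Because of the wkhtmltopdf command syntax, the toc and cover options must be specified separately. If you need the cover to come before the TOC, use the cover_first option: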

toc = {
    'xsl-style-sheet': 'toc.xsl'
}

cover = 'cover.html'

pdfkit.from_file('file.html', 'out.pdf', options=options, toc=toc, cover=cover)
pdfkit.from_file('file.html', 'out.pdf', options=options, toc=toc, cover=cover, cover_first=True)
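
The ordering matters because wkhtmltopdf treats the cover, the TOC and each page as separate objects on its command line. The sketch below shows one way that ordering could be assembled, assuming the TOC comes first by default and cover_first swaps the two. The function `object_arguments` is hypothetical and is not pdfkit's own code.

```python
def object_arguments(page, output, toc=None, cover=None, cover_first=False):
    """Order the cover, TOC and page objects of a wkhtmltopdf command line.

    Illustrative sketch of the ordering described in the text; the default
    (TOC before cover) is an assumption.
    """
    cover_args = ["cover", cover] if cover else []
    toc_args = []
    if toc is not None:
        toc_args.append("toc")
        for name, value in toc.items():
            toc_args.extend(["--" + name, str(value)])
    # cover_first moves the cover object ahead of the TOC object.
    front = cover_args + toc_args if cover_first else toc_args + cover_args
    return ["wkhtmltopdf"] + front + [page, output]
```

Printing the result for both values of `cover_first` is a quick way to see which object would come first in the generated document.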

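You can specify external CSS files when converting files or strings by using the css option.

Warning: this is a workaround for a bug in wkhtmltopdf. You should try the --user-style-sheet option first.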

# Single CSS file
css = 'example.css'
pdfkit.from_file('file.html', 'out.pdf', options=options, css=css)

# Multiple CSS files
css = ['example.css', 'example2.css']
pdfkit.from_file('file.html', 'out.pdf', options=options, css=css)

You can also pass any option through meta tags in your HTML:

body = """
    <html>
      <head>
        <meta name="pdfkit-page-size" content="Legal"/>
        <meta name="pdfkit-orientation" content="Landscape"/>
      </head>
      Hello World!
    </html>
    """

pdfkit.from_string(body, 'out.pdf') #with --page-size=Legal and --orientation=Landscape
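
To preview which options a document would contribute this way, the meta tags can be collected with the standard library. This is a sketch only: `MetaOptionParser` and `meta_options` are hypothetical names, and pdfkit may extract the tags differently internally.

```python
from html.parser import HTMLParser


class MetaOptionParser(HTMLParser):
    """Collect <meta> tags whose name starts with a given prefix."""

    def __init__(self, prefix="pdfkit-"):
        super().__init__()
        self.prefix = prefix
        self.options = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith(self.prefix):
            # Strip the prefix so 'pdfkit-page-size' becomes 'page-size'.
            self.options[name[len(self.prefix):]] = attributes.get("content")


def meta_options(html, prefix="pdfkit-"):
    """Return a dict of options found in prefixed meta tags (hypothetical helper)."""
    parser = MetaOptionParser(prefix)
    parser.feed(html)
    return parser.options
```

Calling `meta_options(body)` on the string above returns the page size and orientation entries, while unrelated meta tags are ignored.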

Configuration

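Every API call takes an optional configuration parameter. It should be an instance returned by the pdfkit.configuration() API call, which takes the configuration options as its initial parameters. The available options are:

wkhtmltopdf - the location of the wkhtmltopdf binary. By default, pdfkit looks for it in the system's default location
meta_tag_prefix - the prefix for pdfkit-specific meta tags. The default is pdfkit-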

Example, for the case where wkhtmltopdf is not on the default $PATH:

config = pdfkit.configuration(wkhtmltopdf='/opt/bin/wkhtmltopdf')
pdfkit.from_string(html_string, output_file, configuration=config)
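
If the binary may live in different places on different machines, the path can be resolved before the configuration is built. The following is a minimal sketch under that assumption; `resolve_wkhtmltopdf` and `build_configuration` are hypothetical helper names, and only `pdfkit.configuration(wkhtmltopdf=...)` comes from the text above.

```python
import os
import shutil


def resolve_wkhtmltopdf(explicit_path=None, binary_name="wkhtmltopdf"):
    """Return a usable path to the wkhtmltopdf binary, or None if not found.

    Hypothetical helper: prefers an explicit path, then searches $PATH.
    """
    if explicit_path and os.path.isfile(explicit_path) and os.access(explicit_path, os.X_OK):
        return explicit_path
    # shutil.which returns None when the name is not on $PATH.
    return shutil.which(binary_name)


def build_configuration(explicit_path=None):
    """Create a pdfkit configuration for the resolved binary."""
    import pdfkit  # imported here so the lookup helper works without pdfkit

    path = resolve_wkhtmltopdf(explicit_path)
    if path is None:
        raise FileNotFoundError("wkhtmltopdf not found; install it or pass its path")
    return pdfkit.configuration(wkhtmltopdf=path)
```

Failing early with a clear message is usually easier to debug than an error raised in the middle of a conversion.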
