Reference: https://github.com/jsvine/pdfplumber
Simple PDF-to-text conversion:
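import pdfplumber

path = "example.pdf"  # placeholder: replace with the path to your PDF

with pdfplumber.open(path) as pdf:
    # Extract and print the text of each page in turn
    for page in pdf.pages:
        content = page.extract_text()
        print(content)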
Note: this only works for text-based PDFs. If the PDF content consists of images, extract_text() returns None.
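Because a page without a text layer yields no text, it can help to skip such pages and record which ones were empty. The following is only a minimal sketch: `collect_text` and `pdf_to_text` are hypothetical helper names, not part of pdfplumber, and the check treats both None and an empty string as "no text".

```python
def collect_text(pages):
    """Return (text, empty_page_numbers) for an iterable of page objects.

    Each page only needs an extract_text() method, as pdfplumber pages have.
    """
    chunks, empty = [], []
    for number, page in enumerate(pages, start=1):
        content = page.extract_text()
        if content:  # None or "" means no text was extracted
            chunks.append(content)
        else:
            empty.append(number)
    return "\n".join(chunks), empty


def pdf_to_text(path):
    """Hypothetical wrapper: open a PDF and collect the text of all pages."""
    import pdfplumber  # imported here so collect_text works without it

    with pdfplumber.open(path) as pdf:
        return collect_text(pdf.pages)
```

The page numbers returned in the second value are the pages that probably contain only images, so they are the candidates for OCR or for conversion to images.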
Errors when converting a PDF to an image:
First, the ImageMagick build (32-bit or 64-bit) must match the Python build (32-bit or 64-bit), even on a 64-bit OS. If they do not match, you get an "ImageMagick not installed" error.
Second, Ghostscript is required; otherwise ImageMagick won't work properly.
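A quick way to narrow down these two problems is to check the bitness of the running Python interpreter and whether the ImageMagick and Ghostscript executables can be found on PATH. This is a rough sketch using only the standard library. The helper names are hypothetical, and the executable names tried (`magick`, `convert`, `gs`, `gswin64c`, `gswin32c`) are assumptions about common installs. It does not verify that the ImageMagick build itself is 32-bit or 64-bit.

```python
import shutil
import struct


def python_bits():
    """Return 32 or 64 depending on the running Python build."""
    return struct.calcsize("P") * 8  # pointer size in bytes * 8


def find_first(names):
    """Return the full path of the first executable found on PATH, else None."""
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None


def check_environment():
    """Collect basic facts needed to diagnose PDF-to-image failures."""
    return {
        "python_bits": python_bits(),
        # Caution: on Windows, "convert" may resolve to an unrelated system tool.
        "imagemagick": find_first(["magick", "convert"]),
        "ghostscript": find_first(["gs", "gswin64c", "gswin32c"]),
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(key, "=", value)
```

If `ghostscript` comes back as None, install Ghostscript first. If `python_bits` does not match the ImageMagick build you installed, reinstall the matching one.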