Python 3: Installing and Using the pytesseract Library on CentOS/Windows


Installing on CentOS:

1. Install dependencies
yum install -y autoconf automake libtool libjpeg libpng libtiff zlib libjpeg-devel libpng-devel libtiff-devel zlib-devel

2. Install Leptonica

wget http://www.leptonica.org/source/leptonica-1.76.0.tar.gz
tar -zxvf leptonica-1.76.0.tar.gz
cd leptonica-1.76.0
./configure
make && make install
# Configure environment variables: append the following to the end of /etc/profile
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LIBLEPT_HEADERSDIR=/usr/local/include
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
. /etc/profile
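
Once /etc/profile has been reloaded, a quick check from Python can confirm that the settings took effect. The following is only a minimal sketch: has_dir is a hypothetical helper (not part of any library), and the library name "lept" assumes Leptonica was installed as liblept, which may differ on other builds.

```python
import os
from ctypes.util import find_library


def has_dir(path_list, directory):
    """Return True if `directory` is one of the colon-separated entries."""
    entries = [p.rstrip("/") for p in path_list.split(":") if p]
    return directory.rstrip("/") in entries


# Report whether the directory added in /etc/profile is visible to this process.
ld_path = os.environ.get("LD_LIBRARY_PATH", "")
print("/usr/local/lib in LD_LIBRARY_PATH:", has_dir(ld_path, "/usr/local/lib"))

# find_library returns None when the shared library cannot be located.
print("Leptonica shared library:", find_library("lept"))
```

If the first line prints False, the usual cause is a misspelled variable name in /etc/profile or a shell that has not reloaded it yet.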

3. Install Tesseract-OCR

wget -O tesseract-4.0.0-beta.3.tar.gz https://github.com/tesseract-ocr/tesseract/archive/4.0.0-beta.3.tar.gz
tar -zxvf tesseract-4.0.0-beta.3.tar.gz
cd tesseract-4.0.0-beta.3
./autogen.sh
./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
make && sudo make install
# Environment variable (location of the language data)
export TESSDATA_PREFIX=/usr/local/share/tessdata  # Linux
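
To confirm that the build succeeded, the tesseract binary can be queried for its version. This is a rough sketch using only the standard library; parse_version and tesseract_version are hypothetical helper names, and the regular expression assumes the first line of the output looks like "tesseract 4.0.0-beta.3" (some builds prefix the number with "v").

```python
import re
import shutil
import subprocess


def parse_version(output):
    """Extract the version string from `tesseract --version` output."""
    match = re.search(r"tesseract\s+v?(\S+)", output)
    return match.group(1) if match else None


def tesseract_version(cmd="/usr/local/bin/tesseract"):
    """Return the installed version, or None if the binary is not found."""
    path = shutil.which(cmd)
    if path is None:
        return None
    # Older releases print the version on stderr, so both streams are merged.
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_version(result.stdout)


print("Tesseract version:", tesseract_version())
```

A result of None suggests that make install did not place the binary at the assumed path.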

Installing on Windows:

1. Install Tesseract
https://digi.bib.uni-mannheim.de/tesseract/
2. Environment variable (location of the language data)
TESSDATA_PREFIX=C:\Program Files (x86)\Tesseract-OCR\tessdata # Windows
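
If changing the system-wide environment is inconvenient, the variable can also be set from inside the Python process before any OCR call, since child processes started afterwards inherit it. The sketch below assumes the default paths used in this article; DEFAULTS and apply_tessdata_prefix are hypothetical names introduced for illustration.

```python
import os
import platform

# Paths taken from this article; adjust them if Tesseract lives elsewhere.
DEFAULTS = {
    "Windows": {
        "tesseract_cmd": r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
        "tessdata": r"C:\Program Files (x86)\Tesseract-OCR\tessdata",
    },
    "Linux": {
        "tesseract_cmd": "/usr/local/bin/tesseract",
        "tessdata": "/usr/local/share/tessdata",
    },
}


def apply_tessdata_prefix(system=None, env=None):
    """Set TESSDATA_PREFIX for the given platform unless it is already set.

    Returns the effective value, or None for a platform without defaults.
    """
    system = system or platform.system()
    env = os.environ if env is None else env
    paths = DEFAULTS.get(system)
    if paths is None:
        return None
    # setdefault keeps a value the user has already exported.
    env.setdefault("TESSDATA_PREFIX", paths["tessdata"])
    return env["TESSDATA_PREFIX"]
```

Calling apply_tessdata_prefix() with no arguments updates os.environ for the current platform; passing a plain dict as env makes it possible to try the function without touching the real environment.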

Downloading language data:

https://github.com/tesseract-ocr/tesseract/wiki/Data-Files

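Windows: place the files in the tessdata folder under the installation directory.
Linux: place the files in /usr/local/share/tessdata. The command /usr/local/bin/tesseract --list-langs shows which language packs have been installed.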
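
The same check can be scripted, for example to fail early when chi_sim is missing. This is a minimal sketch: parse_langs and list_langs are hypothetical helpers, and the parsing assumes the output begins with a header line such as "List of available languages (2):" followed by one language code per line.

```python
import subprocess


def parse_langs(output):
    """Return the sorted language codes found in `--list-langs` output."""
    lines = [line.strip() for line in output.splitlines()]
    return sorted(
        line for line in lines
        if line and not line.lower().startswith("list of")
    )


def list_langs(cmd="/usr/local/bin/tesseract"):
    """Run `tesseract --list-langs`; return [] if it cannot be run."""
    try:
        result = subprocess.run(
            [cmd, "--list-langs"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        return []
    if result.returncode != 0:
        return []
    return parse_langs(result.stdout)


langs = list_langs()
print("chi_sim installed:", "chi_sim" in langs)
```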

Installing the Python libraries:

pip3 install pillow  # dependency of pytesseract
pip3 install pytesseract
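
A quick way to confirm that both packages landed in the interpreter you intend to use is to look them up without importing them. In this sketch missing_modules is a hypothetical helper; note that the pillow package is imported under the name PIL.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the names from `names` that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]


# An empty list means both packages are importable.
print("Missing:", missing_modules(["PIL", "pytesseract"]))
```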

Usage:

import pytesseract
from PIL import Image

# pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'  # on Windows, point to tesseract.exe
pytesseract.pytesseract.tesseract_cmd = '/usr/local/bin/tesseract'  # on Linux, point to the tesseract binary
res = pytesseract.image_to_string(Image.open('xx.jpg'), lang='chi_sim')  # chi_sim: Simplified Chinese

print(res)
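
For images that mix Chinese and English, Tesseract accepts several language codes joined with "+". The following sketch wraps the call above in a small function; build_lang and ocr are hypothetical names, and it assumes the corresponding language packs (here chi_sim and eng) are already in the tessdata directory.

```python
def build_lang(langs):
    """Join language codes into the form Tesseract expects, e.g. 'chi_sim+eng'."""
    return "+".join(langs)


def ocr(path, langs=("chi_sim",), tesseract_cmd=None):
    """Recognize text in the image at `path`; return None if tesseract is missing."""
    # Imported here so the helper above stays usable without the OCR packages.
    import pytesseract
    from PIL import Image

    if tesseract_cmd:
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    try:
        # The context manager closes the image file once OCR has finished.
        with Image.open(path) as img:
            return pytesseract.image_to_string(img, lang=build_lang(langs))
    except pytesseract.TesseractNotFoundError:
        return None
```

For example, ocr('xx.jpg', langs=('chi_sim', 'eng'), tesseract_cmd='/usr/local/bin/tesseract') would run the same recognition as above with both languages enabled.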

