Text recognition in Python with pytesseract

Reposted article; original dated 2018-01-19 14:34. Tag: python

pytesseract resources

Link: https://pan.baidu.com/s/1eTsqhsY Password: j0yo

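During installation, keep clicking Next until you reach the component selection step, then tick Math and Chinese to add support for math and Chinese.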

Remember the installation directory.

Mine is F:\Program Files (x86)\Tesseract-OCR

Then add a system environment variable named TESSDATA_PREFIX whose value is that same installation path.
Mine is F:\Program Files (x86)\Tesseract-OCR
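
A quick way to confirm that the variable is visible to Python, and that the Chinese model is where Tesseract will look for it, is sketched below. This is only a sanity check under assumptions: find_traineddata is a made-up helper name, not part of pytesseract. As far as I know, some Tesseract versions treat TESSDATA_PREFIX as the installation directory (with a tessdata subfolder) and others as the tessdata folder itself, so the sketch checks both locations.

```python
import os


def find_traineddata(prefix, lang="chi_sim"):
    """Return the path of <lang>.traineddata under prefix, or None if absent.

    find_traineddata is a hypothetical helper, not part of pytesseract.
    """
    filename = lang + ".traineddata"
    # Candidate 1: prefix is the install directory, models live in tessdata/.
    # Candidate 2: prefix already points at the tessdata folder.
    candidates = [
        os.path.join(prefix, "tessdata", filename),
        os.path.join(prefix, filename),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


prefix = os.environ.get("TESSDATA_PREFIX")
if prefix is None:
    print("TESSDATA_PREFIX is not set")
else:
    print("chi_sim model:", find_traineddata(prefix))
```

If the script reports that the variable is not set, open a new terminal or restart the IDE: a process started before the variable was added will not see it.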

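Open pytesseract.py under the Python installation path, \Python36\Lib\site-packages\pytesseract\pytesseract.py, and change the Tesseract path in it (the tesseract_cmd setting) to your own installation path.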
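
Editing the installed pytesseract.py works, but the change is lost when the package is reinstalled or upgraded. As an alternative sketch, the same path can be set from your own script through the module-level variable pytesseract.pytesseract.tesseract_cmd. The install directory below is the one used in this article. The assumption that the executable is named tesseract.exe and sits directly in that directory should be checked against your own installation. tesseract_exe_path and use_tesseract are illustrative helper names.

```python
import ntpath  # Windows path rules, independent of the OS running the script


def tesseract_exe_path(install_dir):
    """Build the path to tesseract.exe inside install_dir (assumed layout)."""
    return ntpath.join(install_dir, "tesseract.exe")


def use_tesseract(install_dir):
    """Point pytesseract at the executable without editing pytesseract.py."""
    import pytesseract  # imported here so the path helper works without it

    pytesseract.pytesseract.tesseract_cmd = tesseract_exe_path(install_dir)


# Install directory from this article; replace it with your own.
INSTALL_DIR = r"F:\Program Files (x86)\Tesseract-OCR"
print(tesseract_exe_path(INSTALL_DIR))
```

Calling use_tesseract(INSTALL_DIR) once, before the first call to image_to_string, should have the same effect as editing the file.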

Run the following code:

from PIL import Image
import pytesseract

# Open the image to recognize
img = Image.open('aaa.png')
# Run OCR with the Simplified Chinese model (chi_sim)
text = pytesseract.image_to_string(img, lang='chi_sim')
print(text)

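Image: 1.png

Output

The recognized text will differ slightly from the original; the existing model needs to be trained to improve the match rate.

chi_sim.traineddata is the model for Simplified Chinese. I will look at training the model later to improve the match rate.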

Some links about installing pytesseract:

http://blog.csdn.net/cjvs9k/article/details/79044548

http://blog.csdn.net/qiushi_1990/article/details/78041375

http://blog.csdn.net/ztzy520/article/details/53946327

https://www.cnblogs.com/chenbjin/p/4147564.html
