Recognizing Image CAPTCHAs with Python

Reposted article. 2019-06-12 15:07. Tags: Web Scraping / Python

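1. OCR

OCR (Optical Character Recognition) is the process of scanning characters, analyzing their shapes, and translating them into electronic text. tesserocr is a Python OCR library, but it is really a thin wrapper around tesseract, so tesseract must be installed before tesserocr.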

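2. Preparing the tools

Install the tesserocr library. On Windows, download and install tesseract first.

tesseract download page: https://digi.bib.uni-mannheim.de/tesseract/

The page lists many versions. Those with dev in the name are development builds and those without are stable builds; the stable build is recommended.

During installation, tick the Additional language data option to install the language packs the OCR engine supports, which allows it to recognize multiple languages. Then keep clicking Next until the installer finishes.

Next, install tesserocr: pip3 install tesserocr pillow

Alternatively, prebuilt whl packages for Windows can be downloaded from: https://github.com/simonflueckiger/tesserocr-windows_build/releases

Choose the build that matches your Python version and architecture (the file below targets CPython 3.6 on 64-bit Windows) and install it:

pip3 install tesserocr-2.2.2-cp36-cp36m-win_amd64.whl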
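Before moving on, it can help to confirm that the installation worked. The following is a minimal sketch, not part of the original article: the helper name `ocr_backend_status` is hypothetical, and it assumes that `tesserocr.tesseract_version()` and `tesserocr.get_languages()` behave as described in the tesserocr README.

```python
def ocr_backend_status():
    """Report whether tesserocr can be imported and which Tesseract it wraps.

    Hypothetical helper for illustration; returns a plain dict so it is safe
    to call even when tesserocr is not installed.
    """
    try:
        import tesserocr
    except ImportError:
        return {"installed": False, "version": None, "languages": []}

    # get_languages() returns a (tessdata_path, [language codes]) tuple.
    _tessdata_path, languages = tesserocr.get_languages()
    return {
        "installed": True,
        "version": tesserocr.tesseract_version(),
        "languages": languages,
    }
```

Calling `print(ocr_backend_status())` should show `installed` as `True` and, if the Additional language data option was ticked, a list of language codes that includes more than just `eng`.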

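3. Code

import tesserocr
from PIL import Image

# First attempt: run OCR on the original image
image = Image.open('code.png')
res = tesserocr.image_to_text(image)
print(image, res)

# Binarization: convert to grayscale, then map each pixel to black or white
image = image.convert('L')
threshold = 127
table = []
for i in range(256):
    if i < threshold:
        table.append(0)  # darker than the threshold -> black
    else:
        table.append(1)  # otherwise -> white

image = image.point(table, '1')
image.show()

# Second attempt: run OCR on the binarized image
result = tesserocr.image_to_text(image)
print(result)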
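The fixed threshold of 127 will not suit every CAPTCHA, and the raw OCR output usually carries stray whitespace. The sketch below wraps the same steps in reusable functions so different thresholds can be tried. The function names `build_threshold_table`, `clean_text`, and `recognize` are hypothetical and not from the original article, and this version writes 255 for white pixels rather than 1.

```python
def build_threshold_table(threshold=127):
    """Return a 256-entry lookup table: 0 (black) below the threshold, 255 (white) otherwise."""
    return [0 if level < threshold else 255 for level in range(256)]


def clean_text(raw):
    """Remove all whitespace, including the trailing newline OCR output usually carries."""
    return "".join(raw.split())


def recognize(path, threshold=127):
    """Convert the image at `path` to grayscale, binarize it, and return the cleaned OCR text."""
    # Imported here so the two helpers above work without Pillow or tesserocr installed.
    import tesserocr
    from PIL import Image

    with Image.open(path) as image:
        binary = image.convert("L").point(build_threshold_table(threshold), "1")
    return clean_text(tesserocr.image_to_text(binary))
```

For example, looping over a few values such as `for t in (100, 127, 160): print(t, recognize('code.png', t))` shows which threshold gives the cleanest result for a given image. A lower threshold keeps only the darkest strokes, while a higher one keeps more of the lighter pixels, including noise lines.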
