Detailed Steps for OCR Recognition

Reposted article, 2019-10-15 15:35. Category: Image Processing

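I. Overview
1 The examples shipped with Halcon already include character training and recognition.
2 Halcon OCR training is covered here for two classifier types: SVM and MLP.
3 Halcon provides some pretrained recognition models, but the characters in your own project usually differ from the ones they were trained on, so you need to train your own model.
II. Detailed Procedure
4 We start with SVM training and recognition (create the training file, train, recognize).
SVM training and recognition (training your own 0-9 and A-Z)
Step 1: Prepare the images
Each character gets its own folder. To make it easy to iterate over the folders later, each folder is named with the character itself (the original post showed a screenshot of the folder layout here).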
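Before building the training file, it can help to confirm the folder layout outside HALCON. The following is a minimal Python sketch, not part of the original post: the function names `expected_labels` and `check_layout` are hypothetical, and it assumes one subfolder per character, named exactly `0`-`9` and `A`-`Z`, directly under a root directory of your choosing.

```python
import os
import string


def expected_labels():
    """Return the 36 class labels used in this tutorial: '0'-'9' then 'A'-'Z'."""
    return list(string.digits + string.ascii_uppercase)


def check_layout(root):
    """Compare the subfolders of `root` with the expected labels.

    Returns (missing, unexpected): labels that have no folder, and
    folders whose name is not one of the expected labels.
    """
    folders = {
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    }
    labels = expected_labels()
    missing = [label for label in labels if label not in folders]
    unexpected = sorted(folders - set(labels))
    return missing, unexpected
```

If both returned lists are empty, every character has a folder and no stray folder would be skipped silently by the training loop.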

Step 2: Create the training file

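* Declare a character tuple and fill it with 0-9 and A-Z

CharH := []
* Append the digits '0'..'9' (each assignment extends the tuple instead of overwriting it)
for i := 0 to 9 by 1
    CharH := [CharH, chr(i + ord('0'))]
endfor
* Append the letters 'A'..'Z'
for i := 10 to 36 - 1 by 1
    CharH := [CharH, chr(i - 10 + ord('A'))]
endfor
NumChar := |CharH|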

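* Name the training file (.trf) and delete any stale copy left by a previous run
TrainFile := 'ZHANG-Num0-9A-Z.trf'
* Suppress the error that delete_file raises when the file does not exist yet
dev_set_check ('~give_error')
delete_file (TrainFile)
dev_set_check ('give_error')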

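* Iterate over every folder and every character image inside it, associating each folder with one character (all images in a folder belong to the character the folder is named after)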


for Indexfile := 0 to |CharH| - 1 by 1
    * List the files in the folder named after the current character
    list_files ('Z:\\00Trainlate\\TRIAN20150909\\02pictureTrain_2015-10-26_V1.0\\blackwitewordfirstsub\\checkimage\\test\\char\\' + CharH[Indexfile], ['files','follow_links'], ImageFiles)
    * Keep only .bmp and .jpg files (case-insensitive)
    tuple_regexp_select (ImageFiles, ['\\.(bmp|jpg)$','ignore_case'], ImageFiles)
    for Index := 0 to |ImageFiles| - 1 by 1
        read_image (ImageSige, ImageFiles[Index])
        * The image is passed as both region and image, so its full domain is used as the character region
        append_ocr_trainf (ImageSige, ImageSige, CharH[Indexfile], TrainFile)
    endfor
endfor
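The nested loop above pairs every image with the name of its folder. As a cross-check outside HALCON, the same pairing can be sketched in Python. This is an illustrative sketch under stated assumptions, not part of the original post: `collect_samples` and `count_per_label` are hypothetical names, the listing is non-recursive, and the filter mirrors the case-insensitive `\.(bmp|jpg)$` pattern used in `tuple_regexp_select`.

```python
import os
import re
from collections import Counter

# Same filter as the tuple_regexp_select call: .bmp or .jpg, any letter case.
IMAGE_PATTERN = re.compile(r"\.(bmp|jpg)$", re.IGNORECASE)


def collect_samples(root, labels):
    """Return (image_path, label) pairs, visiting labels in the given order."""
    samples = []
    for label in labels:
        folder = os.path.join(root, label)
        if not os.path.isdir(folder):
            continue  # a missing folder simply contributes no samples
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if os.path.isfile(path) and IMAGE_PATTERN.search(name):
                samples.append((path, label))
    return samples


def count_per_label(samples):
    """Count how many images were found for each label."""
    return Counter(label for _, label in samples)
```

The per-label counts can then be compared with the `CharacterCount` values that `read_ocr_trainf_names` reports in step 3; a mismatch suggests that some images were skipped or failed to load.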

Step 3: Train on the training file (SVM or MLP, depending on which training operator you call) to obtain the final model file (.omc)

* ****
* step: read training data
* ****
read_ocr_trainf_names (TrainFile, CharacterNames, CharacterCount)
stop ()
* ****
* step: create and train classifier
* ****
create_ocr_class_svm (8, 10, 'constant', 'default', CharacterNames, 'rbf', 0.02, 0.001, 'one-versus-one', 'normalization', 0, OCRHandle)
* Train the classifier
trainf_ocr_class_svm (OCRHandle, TrainFile, 0.001, 'default')
stop ()
* ****
* step: save classifier
* ****
FontFile := 'ZHANG-Num0-9A-Z_SVM.omc'
write_ocr_class_svm(OCRHandle,FontFile)
* free memory
clear_ocr_class_svm (OCRHandle)

Step 4: Use your own trained .omc file to recognize the target image

* Read the SVM font file created in the previous step
read_ocr_class_svm ('C:/Users/Public/Documents/MVTec/HALCON-11.0/examples/solution_guide/zhang/ZHANG-Num0-9A-Z_SVM.omc', OCRHandle)
* Read the image to be recognized
read_image(ImageSige,'C:/Users/CQU/Desktop/QQ截圖20160327192542.jpg')
* There are two recognition operators (do_ocr_single_class_svm and do_ocr_multi_class_svm); see the help documentation for how they differ
do_ocr_single_class_svm(ImageSige, ImageSige, OCRHandle, 1, Class)
* Clear the classifier from memory
clear_ocr_class_svm (OCRHandle)

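Step 5: Once the results check out, the .omc file is ready to use in your own projects

* MLP works the same way as SVM: replace each operator with its MLP counterpart (create_ocr_class_mlp, trainf_ocr_class_mlp, write_ocr_class_mlp, read_ocr_class_mlp, do_ocr_single_class_mlp, clear_ocr_class_mlp). The creation and training parameters differ from the SVM ones; see the examples shipped with Halcon for a full walkthrough.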
