Using Tesseract 3.01 with VS2008

Reposted article. 2012-06-06 02:16. Category: Image Processing / Tesseract

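Building Tesseract 3 with VS2008 produces a Tesseract link library whose functions can be called from C++ to recognize text in images. Recognition of English characters is acceptable, but Chinese results are poor and recognition takes a long time. For the build procedure, see the article "Tesseract3.01 OCR在VS2008環境下的編譯使用（1）" (Building and using Tesseract 3.01 OCR in VS2008, part 1).

This article runs a few simple tests of using Tesseract 3 from C++.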

Main member functions of Tesseract:

Initialization functions

(1) int Init(const char* datapath, const char* language,  char **configs, int configs_size, bool configs_global_only);
(2) int Init(const char* datapath, const char* language) { return Init(datapath, language, 0, 0, false);  }
(3) int InitLangMod(const char* datapath, const char* language);
(4) int InitWithoutLangModel(const char* datapath, const char* language);

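Main parameters: datapath is the path to the language data; language is an ISO 639-3 string, or NULL to use the default, English. For example, Simplified Chinese is "chi_sim", and English is either the default (NULL) or "eng". The other parameters can be left at their defaults. Note: at least one language pack must be present under the given path, otherwise an error occurs. Language packs can be found with a web search such as Baidu.

Calling any one of the functions above is enough.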
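Since a missing language pack is the most likely reason for Init to fail, it can help to check for the file before initializing. The following is a minimal sketch, not part of the Tesseract API: the helper name TrainedDataPath is hypothetical, and it assumes the directory passed in directly contains the <language>.traineddata files and that a NULL language means "eng".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds the path of the language file that Init is
// expected to need. Assumptions: `tessdata_dir` is the directory that directly
// contains the *.traineddata files, and a NULL language means "eng".
std::string TrainedDataPath(const std::string& tessdata_dir, const char* language) {
    assert(!tessdata_dir.empty());
    std::string path = tessdata_dir;
    // Append a separator only if the directory does not already end with one.
    char last = path[path.size() - 1];
    if (last != '/' && last != '\\') path += '/';
    path += (language != NULL) ? language : "eng";
    path += ".traineddata";
    return path;
}
```

The returned path can be tried with fopen before calling Init, so that a missing language pack is reported with a clear message instead of a generic initialization error.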

Parameter-setting functions:

(1) void SetPageSegMode(PageSegMode mode);  // Sets the page segmentation mode; the default is PSM_SINGLE_BLOCK

The main modes are:

enum PageSegMode {
  PSM_OSD_ONLY,       ///< Orientation and script detection only.
  PSM_AUTO_OSD,       ///< Automatic page segmentation with orientation and
                      ///< script detection. (OSD)
  PSM_AUTO_ONLY,      ///< Automatic page segmentation, but no OSD, or OCR.
  PSM_AUTO,           ///< Fully automatic page segmentation, but no OSD.
  PSM_SINGLE_COLUMN,  ///< Assume a single column of text of variable sizes.
  PSM_SINGLE_BLOCK_VERT_TEXT,  ///< Assume a single uniform block of vertically
                               ///< aligned text.
  PSM_SINGLE_BLOCK,   ///< Assume a single uniform block of text. (Default.)
  PSM_SINGLE_LINE,    ///< Treat the image as a single text line.
  PSM_SINGLE_WORD,    ///< Treat the image as a single word.
  PSM_CIRCLE_WORD,    ///< Treat the image as a single word in a circle.
  PSM_SINGLE_CHAR,    ///< Treat the image as a single character.

  PSM_COUNT           ///< Number of enum entries.
};

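(2) bool SetVariable(const char* name, const char* value);

I have seen it used as follows, but in my tests it did not seem to have much effect:

// Recognize only abcdefghijklmnopqrstuvwxyz
SetVariable("tessedit_char_whitelist", "abcdefghijklmnopqrstuvwxyz");
// Ignore x, y and z
SetVariable("tessedit_char_blacklist", "xyz");
// Recognize digits only (the variable is a boolean switch, so the value is "1")
SetVariable("classify_bln_numeric_mode", "1");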


Image input functions

(1) char* TesseractRect(const unsigned char* imagedata, int bytes_per_pixel, int bytes_per_line, 
                        int left, int top, int width, int height);

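TesseractRect: supplies the image to process, sets the region to recognize, and returns the recognized text. imagedata must be an 8-bit grayscale image or a 24-bit or 32-bit color image; palette-based images have to be converted to 24-bit first.
bytes_per_pixel is the number of bytes per pixel; bytes_per_line is the number of bytes per row, including alignment padding. The remaining parameters are self-explanatory.

This function can also be split into the following functions: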

(2) void SetImage(const unsigned char* imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line);
(3)  void SetRectangle(int left, int top, int width, int height);

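SetImage: supplies the image to process. The parameters have the same meaning as in TesseractRect. Note that this function modifies the input image.

SetRectangle: sets the region to process.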
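The bytes_per_line parameter is easy to get wrong when rows are padded. As a rough sketch of the arithmetic (the helper names are hypothetical, and the 4-byte alignment is an assumption that matches the default row alignment of OpenCV's IplImage; other image sources may pad differently):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes in one image row when every row is padded up to a
// multiple of `alignment` bytes. alignment = 4 is an assumption, not something
// Tesseract requires.
inline int AlignedBytesPerLine(int width, int bytes_per_pixel, int alignment = 4) {
    assert(width > 0 && bytes_per_pixel > 0 && alignment > 0);
    int raw = width * bytes_per_pixel;                     // unpadded row size
    return (raw + alignment - 1) / alignment * alignment;  // round up
}

// Hypothetical helper: byte offset of pixel (x, y) inside a padded buffer.
inline std::size_t PixelOffset(int x, int y, int bytes_per_pixel, int bytes_per_line) {
    return static_cast<std::size_t>(y) * bytes_per_line +
           static_cast<std::size_t>(x) * bytes_per_pixel;
}
```

For a 24-bit image that is 3 pixels wide, the unpadded row is 9 bytes but the padded row is 12, so passing width * bytes_per_pixel as bytes_per_line would shear the image. When the buffer comes from an IplImage, its widthStep field already holds the padded value.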

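Getting the recognition result

(4) char* GetUTF8Text();

Returns the text found in the image, encoded as UTF-8. The API documentation says the returned char* must be released with delete [], but in my tests delete [] caused an error.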
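One possible explanation for the delete [] error, offered as an assumption and not verified against this build, is that the Tesseract library and the application were linked against different C runtime libraries, in which case a buffer allocated in one cannot safely be freed in the other. Where delete [] does work, a simple pattern is to copy the text into a std::string immediately and free the buffer in exactly one place. The sketch below uses a hypothetical stand-in, FakeGetUTF8Text, in place of the real call so that it compiles without Tesseract; TakeText is also a hypothetical name.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for api.GetUTF8Text(): returns a new[]-allocated,
// NUL-terminated buffer that the caller owns.
char* FakeGetUTF8Text() {
    const char kText[] = "hello";
    char* buf = new char[sizeof(kText)];
    std::memcpy(buf, kText, sizeof(kText));
    return buf;
}

// Copies the owned buffer into a std::string and frees it right away, so the
// rest of the program never handles the raw pointer.
std::string TakeText(char* owned) {
    if (owned == NULL) return std::string();
    std::string copy(owned);
    delete [] owned;  // matches the new[] used by the producer
    return copy;
}
```

With the real API the call would look like TakeText(api.GetUTF8Text()), assuming both sides share the same runtime.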

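Evaluating recognition confidence

(5) int MeanTextConf();   // Average confidence of the text recognized in the image, from 0 to 100
(6) int* AllWordConfidences(); // Confidence of each recognized word, in the same order as the words returned by GetUTF8Text; each value is between 0 and 100 and the array is terminated by -1

Personally I think this group of functions is quite important. They give a rough evaluation of the recognition result, so that poorly rated results can be handled separately. In my tests, good recognitions scored above 80 and bad ones scored below 80, so the value does give a rough indication of how well the recognition went.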
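To act on these scores, the per-word array can be scanned for words that fall under a threshold. A minimal sketch follows, assuming the array is terminated by -1 as the Tesseract header documents; the helper name LowConfidenceWords is hypothetical, and the threshold of 80 used in the usage note is only the rough value observed in my tests, not a recommended constant.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the indices of the words whose confidence is
// below `threshold`. `confidences` is expected to be terminated by -1, which
// is how AllWordConfidences is documented to end its array.
std::vector<int> LowConfidenceWords(const int* confidences, int threshold) {
    assert(threshold >= 0 && threshold <= 100);
    std::vector<int> low;
    if (confidences == 0) return low;
    for (int i = 0; confidences[i] != -1; ++i) {
        // A confidence of 0 is a valid value, so 0 must not end the loop.
        if (confidences[i] < threshold) low.push_back(i);
    }
    return low;
}
```

For example, LowConfidenceWords(api.AllWordConfidences(), 80) would list the positions of the words worth re-checking or re-processing.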

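Cleanup functions:

(7) void Clear(); // Frees Tesseract's internal image data and recognition results; the API can be used again afterwards
(8) void End();  // Frees all memory held by Tesseract and shuts down the API

Remember to release resources, especially when recognizing in a loop: call Clear to free the data left over from the previous operation.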

Tesseract also provides functions that expose intermediate results. I have not studied or tested them; the API descriptions are as follows:

Intermediate-result functions
  /* After SetImage or TesseractRect, get a copy of the internal thresholded image. */
  Pix* GetThresholdedImage();

  /* Get the result of page layout analysis. Can be called before or after Recognize. */
  Boxa* GetRegions(Pixa** pixa);

  /**
   * Get the textlines as a leptonica-style Boxa, Pixa pair, in reading order.
   * Can be called before or after Recognize.
   * If blockids is not NULL, the block-id of each line is also returned
   * as an array of one element per line. delete [] after use.
   */
  Boxa* GetTextlines(Pixa** pixa, int** blockids);

  /**
   * Get the words as a leptonica-style Boxa, Pixa pair, in reading order.
   * Can be called before or after Recognize.
   */
  Boxa* GetWords(Pixa** pixa);

  // Gets the individual connected (text) components (created
  // after pages segmentation step, but before recognition)
  // as a leptonica-style Boxa, Pixa pair, in reading order.
  // Can be called before or after Recognize.
  // Note: the caller is responsible for calling boxaDestroy()
  // on the returned Boxa array and pixaDestroy() on cc array.
  Boxa* GetConnectedComponents(Pixa** cc);

  // Get the given level kind of components (block, textline, word etc.) as a
  // leptonica-style Boxa, Pixa pair, in reading order.
  // Can be called before or after Recognize.
  // If blockids is not NULL, the block-id of each component is also returned
  // as an array of one element per component. delete [] after use.
  Boxa* GetComponentImages(PageIteratorLevel level,
                           Pixa** pixa, int** blockids);

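The functions above are sufficient for recognizing the characters in an image, but Tesseract also provides others, for example for reading images, evaluating the confidence of recognized characters, and retrieving the intermediate images produced during recognition.

Image-reading functions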

(1) inT8 IMAGE::read_header(const char* name);
(2) inT32 check_legal_image_size(   // get rest of image
        inT32 x,                    // x size required
        inT32 y,                    // y size required
        inT8 bits_per_pixel         // bpp required
    );
(3) inT8 read(inT32 buflines);

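Other people's examples use these functions to read the image, but in version 3.0 I could not find the read and read_header functions in the IMAGE class. Perhaps the files I used are the problem. In any case I do not want to use this class myself and would rather use OpenCV for reading and preprocessing the image, so I will not go into more detail here. If anyone knows what the problem is, please let me know. Using OpenCV instead of the provided functions is actually very convenient and needs no conversion at all, as the code below shows: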

	IplImage *iplimg = cvLoadImage("1.jpg");	// NULL if the file cannot be loaded
	tesseract::TessBaseAPI  api;
	//api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ");
	//api.SetVariable("classify_bln_numeric_mode", "1");
	api.Init("C:\\BuildFolder\\tesseract-3.01\\tessdata", NULL);
	//api.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);
	// Pass the OpenCV pixel buffer directly; widthStep is the padded row size in bytes
	api.SetImage((unsigned char*)(iplimg->imageData),
						iplimg->width, iplimg->height, iplimg->nChannels, iplimg->widthStep);
	char* text = api.GetUTF8Text();	// recognize the text in the image

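Here is my complete simple test code. For the project settings, see my other post "Tesseract3.01 OCR在VS2008環境下的編譯使用（1）" (Building and using Tesseract 3.01 OCR in VS2008, part 1).

Test code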
#include "stdafx.h"

#include "allheaders.h"
#include "baseapi.h"
#include "resultiterator.h"
#include "strngs.h"
#include "blobs.h"

#include "cv.h"
#include "highgui.h"
#include "cxcore.h"

#include "stdlib.h"
using namespace  tesseract;

int _tmain(int argc, _TCHAR* argv[])
{
	STRING text_out;

	// Load the image with OpenCV; cvLoadImage returns NULL on failure
	IplImage *iplimg = cvLoadImage("1.jpg");
	if (iplimg == NULL)
	{
		printf("cannot load 1.jpg\n");
		return 1;
	}

	tesseract::TessBaseAPI  api;
	//api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ");
	//api.SetVariable("tessedit_char_blacklist", "xyz"); // to ignore x, y and z
	//api.SetVariable("classify_bln_numeric_mode", "1");

	// Init returns 0 on success
	if (api.Init("C:\\BuildFolder\\tesseract-3.01\\tessdata", NULL) != 0)
	{
		printf("Tesseract initialization failed\n");
		cvReleaseImage(&iplimg);
		return 1;
	}
	//api.SetPageSegMode(PSM_SINGLE_BLOCK);
	api.SetImage((unsigned char*)(iplimg->imageData),
						iplimg->width, iplimg->height, iplimg->nChannels, iplimg->widthStep);	// set the image
	char* text = api.GetUTF8Text();	// recognize the text in the image

	printf("%s\n","Recognition result");
	printf("%s\n",text);

	int d = api.MeanTextConf();
	printf("%d\n",d);

	// Write everything to the output file before closing it
	FILE* fout = fopen("txt_file.TXT", "w");
	if (fout != NULL)
	{
		//fwrite(text_out.string(), 1, text_out.length(), fout);	// write the result to the output file
		fprintf(fout,"%s\n","Recognition result");
		fprintf(fout,"%s\n",text);
		fprintf(fout,"%d\n",d);
		fclose(fout);
	}

	// Per-word confidences; the array is terminated by -1, not by 0
	int *gg = api.AllWordConfidences();
	for (int i = 0; gg != NULL && gg[i] != -1; i++)
	{
		printf("%d\n",gg[i]);
	}

	getchar();

	// The API documents delete [] for both text and gg. Neither is freed here,
	// because delete [] on text caused an error in my tests.
	api.Clear();
	api.End();
	cvReleaseImage(&iplimg);
	return 0;
}

Author: 細雨淅淅

Please cite the source address when reposting:
