An Introduction to the Chars74K Dataset and How to Read Its Handwritten Character Subset

Reposted article. 2016-09-12 17:20. Tags: Computer vision topics / Character recognition / Computer vision / Image processing / Chars74K / Handwritten character recognition

Chars74K is a classic character recognition dataset covering English characters and Kannada characters. It contains about 74K images in total, hence the name Chars74K.

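The English portion is divided into three categories by how the images were obtained:

1. character images captured in natural scenes;

2. handwritten character images;

3. character images synthesized from different computer fonts.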

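Only the English handwritten character dataset is covered here. It has 62 classes in total: 52 letter classes (A-Z, a-z) and 10 digit classes (0-9). Its 3410 images (62 classes times 55 images per class) were handwritten by 55 volunteers.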

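The dataset ships in the file EnglishHnd.tgz (English handwriting). The images are in the Img folder, spread across 62 subfolders named Sample001 through Sample062. Each subfolder holds 55 images, all of them PNG files at 1200*900 resolution with three RGB channels.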
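The folder layout described above can be sketched as a short MATLAB loop that enumerates the relative image names. This is only an illustration under stated assumptions: the subfolders are taken to be named Sample001 to Sample062, and the file name pattern imgCCC-NNN (class number, then sample number) is an assumption that does not come from the text.

```matlab
% Sketch: enumerate the relative image names implied by the folder layout.
% ASSUMPTION: subfolders are SampleCCC and files are imgCCC-NNN.png
% (CCC = class number, NNN = sample number); verify against your copy.
num_classes = 62;      % 10 digits + 26 uppercase + 26 lowercase letters
num_per_class = 55;    % 55 images in each class folder
names = cell(num_classes * num_per_class, 1);
k = 0;
for c = 1:num_classes
    for s = 1:num_per_class
        k = k + 1;
        % Relative name without the .png extension
        names{k} = sprintf('Img/Sample%03d/img%03d-%03d', c, c, s);
    end
end
fprintf('%d image names generated\n', numel(names));
```

Under these assumptions each name is 24 characters long, which would be consistent with a fixed-width 3410*24 char array of image names.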

[Figure: sample images from the dataset]

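The dataset authors provide a way to load the data in MATLAB: the English/Hnd folder inside Lists.tgz contains a file named lists_var_size.MAT for this purpose. However, that file only defines a struct holding metadata; the actual image data still has to be read with your own code.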

Once loaded, the struct has the fields listed in the comments at the top of the code below.

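The dataset authors have already divided the data into 30 different training/test splits. These are the TRNind and TSTind fields of the struct, which store image indices, one split per column. Note that some training splits contain fewer than 930 images; the remaining entries at the end of those columns are 0.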
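The zero-padding issue can be illustrated on a small made-up column. The values below are hypothetical stand-ins for one column of TRNind, not real dataset indices; the point is only that zeros are padding and must be removed before the column is used to index anything.

```matlab
% Toy stand-in for one column of list.TRNind (hypothetical values):
% valid 1-based image indices followed by zero padding.
trn_column = uint16([12; 845; 3077; 0; 0]);

% Keep only real indices; 0 is padding and is not a valid MATLAB index.
valid_mask = trn_column > 0;
trn_index = trn_column(valid_mask);

fprintf('%d of %d entries are valid\n', numel(trn_index), numel(trn_column));
```

A logical mask is used here instead of find because it expresses the filter in one step; both approaches give the same result.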

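The MATLAB code below builds on the authors' .mat file and reads in the training data, the test data, and the labels (the true classes) for one split. Images are read into a cell array and labels into a uint16 array. Note that label 1 stands for the digit 0, label 2 for the digit 1, and so on.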

%% Read images from the Chars74K English Hnd dataset.
clc; clear;
% Loading the .mat file creates a struct named list, which contains:
% ALLlabels: [3410*1 uint16]
% ALLnames: [3410*24 char]
% classlabels: [62*1 double]
% classnames: [62*13 char]
% NUMclasses: 62
% TSTind: [1674*30 uint16]
% VALind: []
% TXNind: [930*30 uint16]
% TRNind: [930*30 uint16]
load('lists_var_size.mat');

%% Extract the training and test sets
% The dataset defines 30 training/test splits (one per column);
% select the Nth one.
N = 14;
% Separate the training and test indices for this split.
training_index = list.TRNind(:, N);
test_index = list.TSTind(:, N);

% Some training columns are padded with zeros, which are not valid
% image indices and must be dropped. Filtering the test column as
% well is only a precaution.
training_index(training_index == 0) = [];
test_index(test_index == 0) = [];

% Class labels for the training set
training_labels = list.ALLlabels(training_index);
% Ground-truth labels for the test set
test_true_labels = list.ALLlabels(test_index);

%% Read the image data
% Folder that contains Img/; adjust to where the archive was extracted.
img_root = '../../../English/Hnd/';

training_imgs = cell(1, length(training_index));
for ii = 1:length(training_index)
    training_imgs{ii} = imread([img_root, ...
        list.ALLnames(training_index(ii), :), '.png']);
    % To view each image, uncomment:
    % image(training_imgs{ii});
    % pause();
end

test_imgs = cell(1, length(test_index));
for ii = 1:length(test_index)
    test_imgs{ii} = imread([img_root, ...
        list.ALLnames(test_index(ii), :), '.png']);
    % To view each image, uncomment:
    % image(test_imgs{ii});
    % pause();
end
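
Since the labels are class numbers rather than characters, a small lookup table can turn them back into readable characters. The sketch below assumes that labels 1 to 10 are the digits 0 to 9 (as noted above), followed by A-Z as labels 11 to 36 and a-z as labels 37 to 62. The ordering of the letters is an assumption and should be checked against the classnames field of the struct. The label values used are hypothetical.

```matlab
% ASSUMPTION: labels 1-10 are '0'-'9', 11-36 are 'A'-'Z', 37-62 are 'a'-'z'.
% Only the digit part of this ordering is stated in the text.
class_chars = ['0':'9', 'A':'Z', 'a':'z'];   % 1x62 char lookup table

% Hypothetical label values standing in for entries of training_labels.
example_labels = uint16([1 2 11 36 37 62]);

% Index the lookup table with the labels to get the characters.
example_chars = class_chars(example_labels);
disp(example_chars);
```

The same indexing applies to the label arrays read above, for example class_chars(training_labels).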

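A Python/OpenCV version is still to come; anyone interested in working on it together is welcome to get in touch.

Corrections of any errors or inaccuracies are welcome.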

Links:

http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/

References:

Teófilo Emídio de Campos, Bodla Rakesh Babu, Manik Varma. Character Recognition in Natural Images. In: VISAPP 2009 - Proceedings of the Fourth International Conference on Computer Vision Theory and Applications, Lisboa, Portugal, February 2009, pp. 273-280.

Note: this post was first published on the 七月在線 forum as an assignment for an open course on computer vision.
