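1 Theoretical Foundations
Understanding the Eigenface recognition algorithm requires a few pieces of background theory, summarized below:
1.1 Covariance Matrix
First, the following formulas are needed: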
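As a sketch of the formulas meant here, in standard notation: the symbols $X_i$, $\bar{X}$, $s$ and the $n-1$ normalization are assumptions of this sketch, chosen because the $n-1$ form is the one that reproduces the value cov(X,Y)=7.5 in the numerical example of this section. The mean, standard deviation, and variance of $n$ samples can then be written as:

```latex
\begin{align}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
s &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2} \\
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2
\end{align}
```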
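From these formulas, the mean describes the average value of the sample set, while the standard deviation describes, roughly, the average distance of the sample points from the mean. Take national income as an example: the mean reflects the average income, while the standard deviation (or the variance) reflects the wealth gap. If two countries have the same mean income, the one with the larger standard deviation has the more uneven income distribution and the wider wealth gap. The formulas above all describe one-dimensional data. Extending the variance formula to two dimensions gives the covariance formula: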
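Written in the same style as the variance, and again assuming the $n-1$ normalization, the covariance of two variables sampled $n$ times would be:

```latex
\begin{align}
\mathrm{cov}(X, Y) &= \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right) \\
\mathrm{cov}(X, X) &= s_X^2
\end{align}
```

The second line is the sanity check: the covariance of a variable with itself is its variance.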
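Covariance indicates how two random variables are correlated: a positive value means they are positively correlated, a negative value means they are negatively correlated, and zero means they are uncorrelated. As a simple example, suppose a person is described by four dimensions: height, weight, distance from the ceiling, and daily time spent drawing. Let the height samples be X=[1 2 3 4 5 6 7 8 9], the weight samples Y=[11 12 13 14 15 16 17 18 19], the distance-from-ceiling samples Z=[9 8 7 6 5 4 3 2 1], and the daily drawing time L=[1 1 1 1 1 1 1 1 1]. Then cov(X,Y)=7.5, cov(X,Z)=-7.5, and cov(X,L)=0. The result is clear: the covariance of X and Y is positive, so they are positively correlated; the covariance of X and Z is negative, so they are negatively correlated; the covariance of X and L is 0, so they are uncorrelated. Each random variable in this example represents one dimension, and so far we have computed the covariance only between some pairs of dimensions. Computing the covariance between all pairs of dimensions and arranging the results as a matrix gives the covariance matrix: $C_{n \times n} = [c_{i,j}] = [\mathrm{cov}(Dim_i, Dim_j)]$, $i,j = 1,2,\dots,n$, where $Dim_i$ is the vector of samples for dimension $i$. Taking the MATLAB covariance matrix as an example, with X, Y, Z, L as dimensions 1, 2, 3, 4, we get $c_{1,1}=7.5$, $c_{1,2}=7.5$, $c_{1,3}=-7.5$, $c_{1,4}=0$, and so on, so the covariance matrix is:
$$C = \begin{pmatrix} 7.5 & 7.5 & -7.5 & 0 \\ 7.5 & 7.5 & -7.5 & 0 \\ -7.5 & -7.5 & 7.5 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
In MATLAB, each row of a matrix can be treated as one sample of the four random variables and each column as one dimension, so the cov function directly returns the covariance matrix of the four dimensions: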
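The numbers in this section can be checked with a short sketch. It is plain standard-library C++ rather than MATLAB or OpenCV; the function names `covariance` and `covarianceMatrix` are made up for this illustration, and the $n-1$ normalization is assumed because it reproduces the value 7.5 quoted for cov(X,Y).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sample covariance of two equally long series, normalized by (n - 1).
// (Hypothetical helper; the normalization is an assumption of this sketch.)
double covariance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size() && a.size() > 1);
    const std::size_t n = a.size();
    double meanA = 0.0, meanB = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanA += a[i];
        meanB += b[i];
    }
    meanA /= static_cast<double>(n);
    meanB /= static_cast<double>(n);

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += (a[i] - meanA) * (b[i] - meanB);
    }
    return sum / static_cast<double>(n - 1);
}

// Covariance matrix: entry (i, j) is the covariance of dimensions i and j.
// Each inner vector holds all samples of one dimension.
std::vector<std::vector<double>> covarianceMatrix(
        const std::vector<std::vector<double>>& dims) {
    const std::size_t n = dims.size();
    std::vector<std::vector<double>> c(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            c[i][j] = covariance(dims[i], dims[j]);
        }
    }
    return c;
}
```

Calling `covarianceMatrix({X, Y, Z, L})` with the four sample vectors from the example gives a symmetric 4x4 matrix whose last row and column are zero, because L never varies.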
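1.2 Jacobi Iteration for the Eigenvalues and Eigenvectors of a Symmetric Matrix
The basic idea of Jacobi iteration is to reduce a symmetric matrix A to a diagonal matrix through a sequence of plane rotations (orthogonal similarity transformations), and from that obtain the eigenvalues and eigenvectors of A. Linear algebra tells us that if A is a real symmetric matrix, there always exists an orthogonal matrix U such that $U^T A U = D$, where D is a diagonal matrix whose diagonal entries $\lambda_i$ are the eigenvalues of A, and the i-th column of U is the eigenvector of A for the eigenvalue $\lambda_i$. The eigenvalue problem for a symmetric matrix A therefore becomes the problem of finding an orthogonal matrix U that makes $U^T A U$ diagonal. The difficulty lies in constructing U, so let us first look at a rotation in the plane:
This gives:
where: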
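For reference, one standard way to write the 2x2 case is sketched below. The names $P$, $\theta$, $a_{ij}$, $b_{ij}$ and the sign convention of the rotation are assumptions of this sketch, not necessarily the notation of the original figures.

```latex
\begin{align}
P &= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix},
\qquad
P^{T} A P = \begin{pmatrix} b_{11} & b_{12} \\ b_{12} & b_{22} \end{pmatrix} \\
b_{11} &= a_{11}\cos^2\theta + 2a_{12}\sin\theta\cos\theta + a_{22}\sin^2\theta \\
b_{22} &= a_{11}\sin^2\theta - 2a_{12}\sin\theta\cos\theta + a_{22}\cos^2\theta \\
b_{12} &= a_{12}\cos 2\theta + \tfrac{1}{2}\left(a_{22} - a_{11}\right)\sin 2\theta \\
b_{12} = 0 &\iff \tan 2\theta = \frac{2a_{12}}{a_{11} - a_{22}}
\qquad \left(\theta = \tfrac{\pi}{4} \text{ when } a_{11} = a_{22}\right)
\end{align}
```

With that choice of $\theta$ the off-diagonal entry vanishes, so $b_{11}$ and $b_{22}$ are the two eigenvalues and the columns of $P$ are the eigenvectors.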
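The derivation above describes a way to construct an orthogonal matrix P such that $P^T A P$ is diagonal. The method can be extended to n×n symmetric matrices. First, introduce the n-th order rotation matrix (Givens matrix):
The plane rotation matrix has the following properties:
(1) $U_{pq}$ is an orthogonal matrix, i.e. $U_{pq}^T U_{pq} = E$
(2) $U_{pq}^T A U_{pq} = B$ is still a symmetric matrix, and B has the same eigenvalues as A
Each Jacobi iteration performs one transformation of the form in (2), where p and q are the row and column indices of the off-diagonal element with the largest absolute value in the matrix A produced by the previous iteration. The transformed elements can be computed with the following formulas: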
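A sketch of those update formulas, under an assumed sign convention: take $U_{pq}$ to be the identity matrix except for $u_{pp} = u_{qq} = \cos\theta$, $u_{pq} = -\sin\theta$, $u_{qp} = \sin\theta$, and write $B = U_{pq}^T A U_{pq}$. Other texts flip the sign of $\theta$, which flips the signs of the $\sin\theta$ terms.

```latex
\begin{align}
b_{pp} &= a_{pp}\cos^2\theta + 2a_{pq}\sin\theta\cos\theta + a_{qq}\sin^2\theta \\
b_{qq} &= a_{pp}\sin^2\theta - 2a_{pq}\sin\theta\cos\theta + a_{qq}\cos^2\theta \\
b_{pq} = b_{qp} &= a_{pq}\cos 2\theta + \tfrac{1}{2}\left(a_{qq} - a_{pp}\right)\sin 2\theta \\
b_{ip} = b_{pi} &= a_{ip}\cos\theta + a_{iq}\sin\theta \qquad (i \neq p, q) \\
b_{iq} = b_{qi} &= -a_{ip}\sin\theta + a_{iq}\cos\theta \qquad (i \neq p, q) \\
b_{ij} &= a_{ij} \qquad (i, j \neq p, q)
\end{align}
```

Choosing $\tan 2\theta = 2a_{pq} / (a_{pp} - a_{qq})$ makes $b_{pq} = b_{qp} = 0$.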
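The formulas show that the transformed matrix differs from the original only in rows and columns p and q. The rotation angle is computed exactly as in the 2D case, and it is chosen so that $a_{pq}$ and $a_{qp}$ become zero. Each iteration therefore zeroes the off-diagonal element with the largest absolute value. A later rotation can make a previously zeroed element nonzero again, but the sum of squares of the off-diagonal elements decreases at every step, so the iteration as a whole drives the off-diagonal elements toward zero, at which point the diagonal elements are the eigenvalues $\lambda_i$ of the original symmetric matrix. For the eigenvectors, let $U_i$ be the rotation matrix used in iteration i. Start from the identity matrix I and right-multiply it by each $U_i$ in turn; when the iteration ends this gives $D = I\,U_1 U_2 \cdots U_k$, where k is the number of iterations. (Right-multiplication is what matches the form $U^T A U$ used in property (2).) The columns of D are then the eigenvectors for the eigenvalues $\lambda_i$. The proof is as follows:
Let $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ be the final diagonal matrix. Then $U_k^T \cdots U_1^T A\, U_1 \cdots U_k = D^T A D = \Lambda$, and since D is orthogonal, $A D = D \Lambda$, that is, $A d_i = \lambda_i d_i$.
In the derivation above, $d_i$ is the column vector formed by column i of D. From the final equation and the definition of an eigenvalue, $\lambda_i$ is an eigenvalue of A and $d_i$ is the corresponding eigenvector.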
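The procedure described in this section can be sketched as a small standard-library C++ routine. This is an illustration under stated assumptions, not OpenCV's implementation: the names `Matrix`, `EigenResult`, and `jacobiEigen`, the stopping tolerance, and the rotation sign convention ($u_{pq} = -\sin\theta$, $u_{qp} = \sin\theta$) are all choices made here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

struct EigenResult {
    std::vector<double> values;  // eigenvalues: diagonal of the final matrix
    Matrix vectors;              // column i is the eigenvector for values[i]
};

// Classical Jacobi iteration for a real symmetric matrix (sketch).
// `a` is taken by value because it is overwritten during the iteration.
EigenResult jacobiEigen(Matrix a, int maxRotations = 1000, double eps = 1e-12) {
    const std::size_t n = a.size();
    Matrix v(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) v[i][i] = 1.0;  // start from I

    for (int iter = 0; iter < maxRotations; ++iter) {
        // Locate the off-diagonal element with the largest absolute value.
        std::size_t p = 0, q = 0;
        double largest = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = i + 1; j < n; ++j) {
                if (std::fabs(a[i][j]) > largest) {
                    largest = std::fabs(a[i][j]);
                    p = i;
                    q = j;
                }
            }
        }
        if (largest < eps) break;  // numerically diagonal: done

        // Angle that zeroes a[p][q]: tan(2*theta) = 2*a_pq / (a_pp - a_qq).
        const double theta = 0.5 * std::atan2(2.0 * a[p][q], a[p][p] - a[q][q]);
        const double c = std::cos(theta), s = std::sin(theta);

        // A <- A * U: only columns p and q change.
        for (std::size_t i = 0; i < n; ++i) {
            const double aip = a[i][p], aiq = a[i][q];
            a[i][p] = c * aip + s * aiq;
            a[i][q] = -s * aip + c * aiq;
        }
        // A <- U^T * A: only rows p and q change.
        for (std::size_t j = 0; j < n; ++j) {
            const double apj = a[p][j], aqj = a[q][j];
            a[p][j] = c * apj + s * aqj;
            a[q][j] = -s * apj + c * aqj;
        }
        // V <- V * U: accumulate the rotations by right-multiplication.
        for (std::size_t i = 0; i < n; ++i) {
            const double vip = v[i][p], viq = v[i][q];
            v[i][p] = c * vip + s * viq;
            v[i][q] = -s * vip + c * viq;
        }
    }

    EigenResult result;
    result.values.resize(n);
    for (std::size_t i = 0; i < n; ++i) result.values[i] = a[i][i];
    result.vectors = v;
    return result;
}
```

The eigenvalues come back in whatever order the diagonal ends up in, so a caller that needs them sorted (as PCA does) would have to sort the values and reorder the columns accordingly.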
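2 OpenCV Source Code Analysis
2.1 Key Functions
(1) void reduce(InputArray src, OutputArray dst, int dim, int rtype, int dtype=-1)
Its English comment: transforms 2D matrix to 1D row or column vector by taking sum, minimum, maximum or mean value over all the rows.
That comment is not quite accurate. What the function actually does is reduce a 2D matrix to a 1D row vector or column vector. When reducing to a row vector, each element is the sum, minimum, maximum, or mean of all values in the corresponding column of the source matrix; when reducing to a column vector, each element is the same statistic taken over the corresponding row. The parameters are:
src: source matrix
dst: destination vector
dim: whether the result is a row vector or a column vector; 0 reduces the source matrix to a single row, otherwise it is reduced to a single column
rtype: the reduction operation (op), one of CV_REDUCE_SUM, CV_REDUCE_MAX, CV_REDUCE_MIN, CV_REDUCE_AVG
dtype: type of the destination vector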
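To make the row/column distinction concrete, here is a minimal sketch of what the reduction computes. It deliberately does not call OpenCV: `reduceToRowAvg` and `reduceToColSum` are hypothetical helpers that mimic, as far as the description above goes, `dim=0` with `CV_REDUCE_AVG` and `dim=1` with `CV_REDUCE_SUM`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Collapse a non-empty rows x cols matrix into ONE ROW holding the mean
// of each column (the "dim = 0, average" case).
std::vector<double> reduceToRowAvg(const Matrix& src) {
    assert(!src.empty());
    const std::size_t rows = src.size(), cols = src[0].size();
    std::vector<double> dst(cols, 0.0);
    for (const std::vector<double>& row : src) {
        for (std::size_t j = 0; j < cols; ++j) dst[j] += row[j];
    }
    for (double& value : dst) value /= static_cast<double>(rows);
    return dst;
}

// Collapse the matrix into ONE COLUMN holding the sum of each row
// (the "dim = 1, sum" case).
std::vector<double> reduceToColSum(const Matrix& src) {
    std::vector<double> dst;
    dst.reserve(src.size());
    for (const std::vector<double>& row : src) {
        double sum = 0.0;
        for (double value : row) sum += value;
        dst.push_back(sum);
    }
    return dst;
}
```

In the Eigenfaces setting the first form is the interesting one: with one face per row, the column-wise average is the mean face.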
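(2) void gemm(InputArray src1, InputArray src2, double alpha, InputArray src3, double gamma, OutputArray dst, int flags=0)
Its English comment: implements generalized matrix product algorithm GEMM from BLAS.
Purpose: implements generalized matrix multiplication. Only the last parameter is described here:
flags: one of GEMM_1_T, GEMM_2_T, GEMM_3_T, or a combination of them. For example, GEMM_1_T transposes src1 before the multiplication. The overall behavior of the function can be expressed by the following formula:
dst = alpha*op(src1)*op(src2) + gamma*op(src3), where flags determines whether op(X) is X or $X^T$.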
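The formula can be spelled out in a few lines of plain C++. This is only a sketch of the arithmetic, not the OpenCV routine: `gemmSketch` and `transposed` are hypothetical names, the three `bool` arguments stand in for the GEMM_1_T, GEMM_2_T, GEMM_3_T flags, and an empty `src3` is assumed to mean "no additive term".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns the transpose of m (an empty matrix stays empty).
Matrix transposed(const Matrix& m) {
    if (m.empty()) return m;
    Matrix t(m[0].size(), std::vector<double>(m.size(), 0.0));
    for (std::size_t i = 0; i < m.size(); ++i) {
        for (std::size_t j = 0; j < m[i].size(); ++j) t[j][i] = m[i][j];
    }
    return t;
}

// dst = alpha * op(src1) * op(src2) + gamma * op(src3),
// where op(X) is X or X^T depending on the matching bool.
Matrix gemmSketch(const Matrix& src1, const Matrix& src2, double alpha,
                  const Matrix& src3, double gamma,
                  bool t1 = false, bool t2 = false, bool t3 = false) {
    const Matrix a = t1 ? transposed(src1) : src1;
    const Matrix b = t2 ? transposed(src2) : src2;
    const Matrix c = t3 ? transposed(src3) : src3;
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());

    const std::size_t rows = a.size(), inner = b.size(), cols = b[0].size();
    Matrix dst(rows, std::vector<double>(cols, 0.0));
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < inner; ++k) acc += a[i][k] * b[k][j];
            dst[i][j] = alpha * acc + (c.empty() ? 0.0 : gamma * c[i][j]);
        }
    }
    return dst;
}
```

For example, `gemmSketch(A, B, 1.0, Matrix(), 0.0, true)` computes $A^T B$ with no additive term.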
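(3) void mulTransposed( InputArray src, OutputArray dst, bool aTa, InputArray delta=noArray(), double scale=1, int dtype=-1 )
Its English comment: multiplies matrix by its transposition from the left or from the right.
Purpose: multiplies a matrix by its transpose, from the left or from the right. The parameters are:
src: source matrix
dst: destination matrix
aTa: multiplication order; true computes $A^T A$, false computes $A A^T$
delta: array subtracted from src before the multiplication
scale: factor by which the product is scaled after the multiplication
dtype: type of the destination matrix
When aTa is true, the function can be described by the formula dst = scale*(src-delta)^T*(src-delta). Internally the function can hand the multiplication over to function (2).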
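The formula is worth a small worked sketch, because with `delta` set to the mean and `scale` set to 1/(n-1) it is exactly a covariance matrix. The code below is standard-library C++ only; `mulTransposedSketch` is a hypothetical stand-in, and it simplifies `delta` to a single row that is subtracted from every row of `src`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// aTa == true : dst = scale * (src - delta)^T * (src - delta)   (cols x cols)
// aTa == false: dst = scale * (src - delta) * (src - delta)^T   (rows x rows)
// `delta` is one row subtracted from each row of src; empty means "no delta".
Matrix mulTransposedSketch(const Matrix& src, bool aTa,
                           const std::vector<double>& delta, double scale) {
    assert(!src.empty());
    const std::size_t rows = src.size(), cols = src[0].size();

    Matrix d = src;  // d = src - delta
    if (!delta.empty()) {
        assert(delta.size() == cols);
        for (std::vector<double>& row : d) {
            for (std::size_t j = 0; j < cols; ++j) row[j] -= delta[j];
        }
    }

    const std::size_t n = aTa ? cols : rows;      // size of the square result
    const std::size_t inner = aTa ? rows : cols;  // summation length
    Matrix dst(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (std::size_t k = 0; k < inner; ++k) {
                acc += aTa ? d[k][i] * d[k][j] : d[i][k] * d[j][k];
            }
            dst[i][j] = scale * acc;
        }
    }
    return dst;
}
```

With one sample per row, `aTa = true` gives the dimension-by-dimension matrix and `aTa = false` gives the much smaller sample-by-sample matrix; that size difference is what the Eigenfaces procedure in section 2.2 exploits.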
(4) void calcCovarMatrix( InputArray samples, OutputArray covar, OutputArray mean, int flags, int ctype=CV_64F)
Its English comment: computes covariation matrix of a set of samples
Purpose: computes the covariance matrix of the row vectors or column vectors of a matrix. The function calls function (3) to do the work.
(5) bool eigen(InputArray src, OutputArray eigenvalues, OutputArray eigenvectors, int lowindex=-1, int highindex=-1)
Its English comment: finds eigenvalues and eigenvectors of a symmetric matrix
Purpose: finds the eigenvalues and eigenvectors of a symmetric matrix. The function uses the Jacobi method to do so.
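2.2 Main Procedure
The idea behind EigenFace is to transform faces from pixel space into another space and to compute similarity there. The transformation EigenFace uses is PCA, the well-known principal component analysis. EigenFace uses PCA to obtain the principal components of the face distribution. Concretely, it computes the eigenvalues of the covariance matrix of all face images in the training set; the eigenvectors belonging to those eigenvalues are the so-called "eigenfaces". Each eigenvector describes one kind of variation or feature of a face, so every face can be expressed as a linear combination of the eigenfaces. The procedure is described below using the AT&T face database (40 people, 10 face images with different expressions per person, 400 face images in total, each with a resolution of 92x112). 399 of the faces are used as the sample set and the last one is the face to be recognized:
(1) Flatten each face image in the training set into one row and stack the rows into a large matrix A. A is 399x10304, i.e. 399 rows and 10304 columns.
(2) Add up the 399 faces dimension by dimension and divide by the number of faces to obtain the 1x10304 mean vector Mean. Subtract the mean vector from every row of A to obtain the difference matrix B.
(3) Compute the covariance matrix $C = B B^T$, which is 399x399, then compute the eigenvalues $\lambda_i$ and eigenvectors $e_i$ of C, $0 \le i < 399$.
(4) The matrix in the previous step is not actually the covariance matrix of the face samples. A face sample has 10304 dimensions, and a covariance matrix describes the correlation between dimensions, so (ignoring the constant scale factor) the true covariance matrix of the face samples is the 10304x10304 matrix $C' = B^T B$. It can be shown that every nonzero $\lambda_i$ is also an eigenvalue of $C'$, with eigenvector $v_i = B^T e_i$ ($v_i$ is a column vector with 10304 rows). The proof is as follows:
$C e_i = \lambda_i e_i \Rightarrow B B^T e_i = \lambda_i e_i \Rightarrow B^T B\,(B^T e_i) = \lambda_i\,(B^T e_i) \Rightarrow C' v_i = \lambda_i v_i$
The eigenvectors $v_i$ are the "eigenfaces". All of them together form the 10304x399 eigenvector matrix V. For any face row vector α, multiplying it by V gives the projections of α onto the eigenvectors, that is, each element of α*V is the projection of α onto the corresponding eigenface. To recognize a face, first compute the projection vector of the face to be recognized onto the eigenfaces, then compare it with the projection vector of each sample face. The sample with the highest similarity (the smallest distance) is the best match.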
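The identity in step (4) can be checked numerically on a toy matrix. The sketch below is an illustration only: `liftEigenvector` and `applyLargeCovariance` are hypothetical helper names, and the 2x3 matrix in the usage note stands in for the 399x10304 matrix B.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// v = B^T * e: lifts an eigenvector e of the small matrix B*B^T (n x n)
// to an eigenvector v of the large matrix B^T*B (d x d).
std::vector<double> liftEigenvector(const Matrix& b, const std::vector<double>& e) {
    assert(!b.empty() && e.size() == b.size());
    const std::size_t n = b.size(), d = b[0].size();
    std::vector<double> v(d, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < d; ++j) v[j] += b[i][j] * e[i];
    }
    return v;
}

// y = (B^T * B) * v, evaluated as B^T * (B * v) so that the d x d matrix
// is never formed explicitly.
std::vector<double> applyLargeCovariance(const Matrix& b, const std::vector<double>& v) {
    assert(!b.empty() && v.size() == b[0].size());
    const std::size_t n = b.size(), d = b[0].size();
    std::vector<double> bv(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < d; ++j) bv[i] += b[i][j] * v[j];
    }
    return liftEigenvector(b, bv);  // B^T * (B * v)
}
```

Take B = [[1, 0, 1], [0, 1, 1]]. Then $B B^T$ = [[2, 1], [1, 2]] has the eigenpair λ = 3, e = (1, 1). Lifting gives v = (1, 1, 2), and applying $B^T B$ to v returns (3, 3, 6) = 3·v, as the proof predicts. The lifted vectors are not unit length, so they still need to be normalized before use.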
2.3 Core Source Code
The code is taken from OpenCV 2.4.9.

 1  void Eigenfaces::train(InputArrayOfArrays _src, InputArray _local_labels) {
 2      if(_src.total() == 0) {
 3          string error_message = format("Empty training data was given. You'll need more than one sample to learn a model.");
 4          CV_Error(CV_StsBadArg, error_message);
 5      } else if(_local_labels.getMat().type() != CV_32SC1) {
 6          string error_message = format("Labels must be given as integer (CV_32SC1). Expected %d, but was %d.", CV_32SC1, _local_labels.type());
 7          CV_Error(CV_StsBadArg, error_message);
 8      }
 9      // make sure data has correct size
10      if(_src.total() > 1) {
11          for(int i = 1; i < static_cast<int>(_src.total()); i++) {
12              if(_src.getMat(i-1).total() != _src.getMat(i).total()) {
13                  string error_message = format("In the Eigenfaces method all input samples (training images) must be of equal size! Expected %d pixels, but was %d pixels.", _src.getMat(i-1).total(), _src.getMat(i).total());
14                  CV_Error(CV_StsUnsupportedFormat, error_message);
15              }
16          }
17      }
18      // get labels
19      Mat labels = _local_labels.getMat();
20      // observations in row
21      Mat data = asRowMatrix(_src, CV_64FC1);
22
23      // number of samples
24      int n = data.rows;
25      // assert there are as much samples as labels
26      if(static_cast<int>(labels.total()) != n) {
27          string error_message = format("The number of samples (src) must equal the number of labels (labels)! len(src)=%d, len(labels)=%d.", n, labels.total());
28          CV_Error(CV_StsBadArg, error_message);
29      }
30      // clear existing model data
31      _labels.release();
32      _projections.clear();
33      // clip number of components to be valid
34      if((_num_components <= 0) || (_num_components > n))
35          _num_components = n;
36
37      // perform the PCA
38      PCA pca(data, Mat(), CV_PCA_DATA_AS_ROW, _num_components);
39      // copy the PCA results
40      _mean = pca.mean.reshape(1,1); // store the mean vector
41      _eigenvalues = pca.eigenvalues.clone(); // eigenvalues by row
42      transpose(pca.eigenvectors, _eigenvectors); // eigenvectors by column
43      // store labels for prediction
44      _labels = labels.clone();
45      // save projections
46      for(int sampleIdx = 0; sampleIdx < data.rows; sampleIdx++) {
47          Mat p = subspaceProject(_eigenvectors, _mean, data.row(sampleIdx));
48          _projections.push_back(p);
49      }
50  }
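The PCA class constructed on line 38 implements the core functionality: computing the covariance matrix of the sample matrix, finding the eigenvectors of that covariance matrix, and so on. On line 47, _mean is the mean face vector; the call to subspaceProject on that line subtracts the mean vector from each face vector and projects the result onto the set of eigenfaces.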

 1  PCA& PCA::operator()(InputArray _data, InputArray __mean, int flags, int maxComponents)
 2  {
 3      Mat data = _data.getMat(), _mean = __mean.getMat();
 4      int covar_flags = CV_COVAR_SCALE;
 5      int i, len, in_count;
 6      Size mean_sz;
 7
 8      CV_Assert( data.channels() == 1 );
 9      if( flags & CV_PCA_DATA_AS_COL )
10      {
11          len = data.rows;
12          in_count = data.cols;
13          covar_flags |= CV_COVAR_COLS;
14          mean_sz = Size(1, len);
15      }
16      else
17      {
18          len = data.cols;
19          in_count = data.rows;
20          covar_flags |= CV_COVAR_ROWS;
21          mean_sz = Size(len, 1);
22      }
23
24      int count = std::min(len, in_count), out_count = count;
25      if( maxComponents > 0 )
26          out_count = std::min(count, maxComponents);
27
28      // "scrambled" way to compute PCA (when cols(A)>rows(A)):
29      // B = A'A; B*x=b*x; C = AA'; C*y=c*y -> AA'*y=c*y -> A'A*(A'*y)=c*(A'*y) -> c = b, x=A'*y
30      if( len <= in_count )
31          covar_flags |= CV_COVAR_NORMAL;
32
33      int ctype = std::max(CV_32F, data.depth());
34      mean.create( mean_sz, ctype );
35
36      Mat covar( count, count, ctype );
37
38      if( _mean.data )
39      {
40          CV_Assert( _mean.size() == mean_sz );
41          _mean.convertTo(mean, ctype);
42          covar_flags |= CV_COVAR_USE_AVG;
43      }
44
45      calcCovarMatrix( data, covar, mean, covar_flags, ctype );
46      eigen( covar, eigenvalues, eigenvectors );
47
48      if( !(covar_flags & CV_COVAR_NORMAL) )
49      {
50          // CV_PCA_DATA_AS_ROW: cols(A)>rows(A). x=A'*y -> x'=y'*A
51          // CV_PCA_DATA_AS_COL: rows(A)>cols(A). x=A''*y -> x'=y'*A'
52          Mat tmp_data, tmp_mean = repeat(mean, data.rows/mean.rows, data.cols/mean.cols);
53          if( data.type() != ctype || tmp_mean.data == mean.data )
54          {
55              data.convertTo( tmp_data, ctype );
56              subtract( tmp_data, tmp_mean, tmp_data );
57          }
58          else
59          {
60              subtract( data, tmp_mean, tmp_mean );
61              tmp_data = tmp_mean;
62          }
63
64          Mat evects1(count, len, ctype);
65          gemm( eigenvectors, tmp_data, 1, Mat(), 0, evects1,
66              (flags & CV_PCA_DATA_AS_COL) ? CV_GEMM_B_T : 0);
67          eigenvectors = evects1;
68
69          // normalize eigenvectors
70          for( i = 0; i < out_count; i++ )
71          {
72              Mat vec = eigenvectors.row(i);
73              normalize(vec, vec);
74          }
75      }
76
77      if( count > out_count )
78      {
79          // use clone() to physically copy the data and thus deallocate the original matrices
80          eigenvalues = eigenvalues.rowRange(0,out_count).clone();
81          eigenvectors = eigenvectors.rowRange(0,out_count).clone();
82      }
83      return *this;
84  }
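Line 45 computes the covariance matrix of the sample matrix, and line 46 computes the eigenvalues and eigenvectors of that covariance matrix. In the face example len (10304) is larger than in_count (399), so CV_COVAR_NORMAL is not set on lines 30-31 and the small 399x399 matrix is used; lines 48-75 then map its eigenvectors back to the 10304-dimensional space and normalize them, as described in step (4) of section 2.2.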

 1  void Eigenfaces::predict(InputArray _src, int &minClass, double &minDist) const {
 2      // get data
 3      Mat src = _src.getMat();
 4      // make sure the user is passing correct data
 5      if(_projections.empty()) {
 6          // throw error if no data (or simply return -1?)
 7          string error_message = "This Eigenfaces model is not computed yet. Did you call Eigenfaces::train?";
 8          CV_Error(CV_StsError, error_message);
 9      } else if(_eigenvectors.rows != static_cast<int>(src.total())) {
10          // check data alignment just for clearer exception messages
11          string error_message = format("Wrong input image size. Reason: Training and Test images must be of equal size! Expected an image with %d elements, but got %d.", _eigenvectors.rows, src.total());
12          CV_Error(CV_StsBadArg, error_message);
13      }
14      // project into PCA subspace
15      Mat q = subspaceProject(_eigenvectors, _mean, src.reshape(1,1));
16      minDist = DBL_MAX;
17      minClass = -1;
18      for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
19          double dist = norm(_projections[sampleIdx], q, NORM_L2);
20          if((dist < minDist) && (dist < _threshold)) {
21              minDist = dist;
22              minClass = _labels.at<int>((int)sampleIdx);
23          }
24      }
25  }
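Line 15 subtracts the mean face vector from the face vector to be recognized and computes the projection vector X of the result onto the set of eigenfaces. Line 19 computes the Euclidean distance between X and the projection vector of each sample face; this distance serves as the face similarity measure, with a smaller distance meaning a closer match. Lines 20-23 keep the smallest distance below the threshold as the recognition result.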
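Stripped of the OpenCV types, the prediction step is a projection followed by a nearest-neighbour search. The sketch below restates it in standard-library C++ under stated assumptions: `project`, `nearest`, and `Match` are names invented here, the eigenfaces are stored as the columns of `w`, and the default threshold is taken to be "no limit".

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

struct Match {
    int label;        // -1 when no sample is closer than the threshold
    double distance;  // Euclidean distance to the best sample
};

// y = (x - mean) * W, where W is d x k with one eigenface per column.
std::vector<double> project(const Matrix& w, const std::vector<double>& mean,
                            const std::vector<double>& x) {
    assert(!w.empty() && w.size() == x.size() && mean.size() == x.size());
    const std::size_t d = w.size(), k = w[0].size();
    std::vector<double> y(k, 0.0);
    for (std::size_t i = 0; i < d; ++i) {
        const double centered = x[i] - mean[i];
        for (std::size_t j = 0; j < k; ++j) y[j] += centered * w[i][j];
    }
    return y;
}

// Returns the label of the stored projection closest to q, provided the
// distance is also below `threshold`.
Match nearest(const Matrix& projections, const std::vector<int>& labels,
              const std::vector<double>& q,
              double threshold = std::numeric_limits<double>::max()) {
    assert(projections.size() == labels.size());
    Match best{-1, std::numeric_limits<double>::max()};
    for (std::size_t idx = 0; idx < projections.size(); ++idx) {
        double sumSq = 0.0;
        for (std::size_t j = 0; j < q.size(); ++j) {
            const double diff = projections[idx][j] - q[j];
            sumSq += diff * diff;
        }
        const double dist = std::sqrt(sumSq);
        if (dist < best.distance && dist < threshold) {
            best = Match{labels[idx], dist};
        }
    }
    return best;
}
```

A tight threshold turns the recognizer into a rejector: if no training projection lies within it, the label stays at -1, which is the behaviour the sample code in section 3 demonstrates by setting the threshold to 0.0.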
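3 Sample Code
Finally, here is sample code for Eigenface recognition. It again uses the AT&T face database; for the download address, see the previous post.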
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/contrib/contrib.hpp"

#define CV_VERSION_ID CVAUX_STR(CV_MAJOR_VERSION) CVAUX_STR(CV_MINOR_VERSION) CVAUX_STR(CV_SUBMINOR_VERSION)

#ifdef _DEBUG
#define cvLIB(name) "opencv_" name CV_VERSION_ID "d"
#else
#define cvLIB(name) "opencv_" name CV_VERSION_ID
#endif

#pragma comment( lib, cvLIB("core") )
#pragma comment( lib, cvLIB("imgproc") )
#pragma comment( lib, cvLIB("highgui") )
#pragma comment( lib, cvLIB("flann") )
#pragma comment( lib, cvLIB("features2d") )
#pragma comment( lib, cvLIB("calib3d") )
#pragma comment( lib, cvLIB("gpu") )
#pragma comment( lib, cvLIB("legacy") )
#pragma comment( lib, cvLIB("ml") )
#pragma comment( lib, cvLIB("objdetect") )
#pragma comment( lib, cvLIB("ts") )
#pragma comment( lib, cvLIB("video") )
#pragma comment( lib, cvLIB("contrib") )
#pragma comment( lib, cvLIB("nonfree") )

#include <cstdlib>   // atoi, exit
#include <iostream>
#include <fstream>
#include <sstream>

using namespace cv;
using namespace std;

static Mat toGrayscale(InputArray _src) {
    Mat src = _src.getMat();
    // only allow one channel
    if(src.channels() != 1) {
        CV_Error(CV_StsBadArg, "Only Matrices with one channel are supported");
    }
    // create and return normalized image
    Mat dst;
    cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
    return dst;
}

static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(CV_StsBadArg, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty() && !classlabel.empty()) {
            images.push_back(imread(path, 0));
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}

int main(int argc, const char *argv[]) {
    // Check for valid command line arguments, print usage
    // if no arguments were given.
    if (argc != 2) {
        cout << "usage: " << argv[0] << " <csv.ext>" << endl;
        exit(1);
    }

    // Get the path to your CSV.
    string fn_csv = string(argv[1]);
    // These vectors hold the images and corresponding labels.
    vector<Mat> images;
    vector<int> labels;
    // Read in the data. This can fail if no valid
    // input filename is given.
    try {
        read_csv(fn_csv, images, labels);
    } catch (cv::Exception& e) {
        cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
        // nothing more we can do
        exit(1);
    }
    // Quit if there are not enough images for this demo.
    if(images.size() <= 1) {
        string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
        CV_Error(CV_StsError, error_message);
    }
    // Get the height from the first image. We'll need this
    // later in code to reshape the images to their original
    // size:
    int height = images[0].rows;
    // The following lines simply get the last images from
    // your dataset and remove it from the vector. This is
    // done, so that the training data (which we learn the
    // cv::FaceRecognizer on) and the test data we test
    // the model with, do not overlap.
    Mat testSample = images[images.size() - 1];
    int testLabel = labels[labels.size() - 1];
    images.pop_back();
    labels.pop_back();
    // The following lines create an Eigenfaces model for
    // face recognition and train it with the images and
    // labels read from the given CSV file.
    // This here is a full PCA, if you just want to keep
    // 10 principal components (read Eigenfaces), then call
    // the factory method like this:
    //
    //      cv::createEigenFaceRecognizer(10);
    //
    // If you want to create a FaceRecognizer with a
    // confidence threshold, call it with:
    //
    //      cv::createEigenFaceRecognizer(10, 123.0);
    //
    Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
    model->train(images, labels);
    // The following line predicts the label of a given
    // test image:
    int predictedLabel = model->predict(testSample);
    //
    // To get the confidence of a prediction call the model with:
    //
    //      int predictedLabel = -1;
    //      double confidence = 0.0;
    //      model->predict(testSample, predictedLabel, confidence);
    //
    string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
    cout << result_message << endl;
    // Sometimes you'll need to get/set internal model data,
    // which isn't exposed by the public cv::FaceRecognizer.
    // Since each cv::FaceRecognizer is derived from a
    // cv::Algorithm, you can query the data.
    //
    // First we'll use it to set the threshold of the FaceRecognizer
    // to 0.0 without retraining the model. This can be useful if
    // you are evaluating the model:
    //
    model->set("threshold", 0.0);
    // Now the threshold of this model is set to 0.0. A prediction
    // now returns -1, as it's impossible to have a distance below
    // it
    predictedLabel = model->predict(testSample);
    cout << "Predicted class = " << predictedLabel << endl;
    // Here is how to get the eigenvalues of this Eigenfaces model:
    Mat eigenvalues = model->getMat("eigenvalues");
    // And we can do the same to display the Eigenvectors (read Eigenfaces):
    Mat W = model->getMat("eigenvectors");
    // From this we will display the (at most) first 10 Eigenfaces:
    for (int i = 0; i < min(10, W.cols); i++) {
        string msg = format("Eigenvalue #%d = %.5f", i, eigenvalues.at<double>(i));
        cout << msg << endl;
        // get eigenvector #i
        Mat ev = W.col(i).clone();
        // Reshape to original size & normalize to [0...255] for imshow.
        Mat grayscale = toGrayscale(ev.reshape(1, height));
        // Show the image & apply a Jet colormap for better sensing.
        Mat cgrayscale;
        applyColorMap(grayscale, cgrayscale, COLORMAP_JET);
        imshow(format("%d", i), cgrayscale);
    }
    waitKey(0);

    return 0;
}
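The program output and the first 10 eigenfaces, rendered as pseudo-color images, are shown in the figure:
This post drew on the following materials, with thanks to their authors: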
http://www.cnblogs.com/guoming0000/archive/2012/09/27/2706019.html
http://blog.csdn.net/zouxy09/article/details/45276053
http://blog.csdn.net/zhouxuguang236/article/details/40212143
http://wenku.baidu.com/view/6023207e168884868762d644.html
《數值分析簡明教程》 (A Concise Course in Numerical Analysis), edited by Wang Bingtuan, Zhang Zuoquan, and Zhao Pingfu