Machine Learning Algorithms in OpenCV 3


OpenCV 3.0 provides a file, ml.cpp, that contains its machine learning algorithms. The following are offered:

1. Normal Bayes: normal Bayes classifier. I have covered this in another post: Machine Learning in OpenCV 3: Classification with the Normal Bayes Classifier

2. K-nearest neighbors: k nearest neighbors classifier

3. Support vector machine: support vector machine. See my other post: Machine Learning in OpenCV 3: Classification with SVM (Support Vector Machine)

4. Decision tree: decision tree

5. AdaBoost: adaboost

6. Gradient boosted trees: gradient boosted trees

7. Random forest: random forest

8. Artificial neural networks: artificial neural networks

9. EM algorithm: expectation-maximization

All of these algorithms are covered in any machine learning textbook, and their overall classification workflow is much the same, breaking down into three steps:

I. Collect the sample data (sampleData)

II. Train the classifier (model)

III. Predict on the test data (testData)

What differs between them is the parameter setup in OpenCV. In what follows, assume the training data is trainingDataMat, already labelled in labelsMat, and the data to classify is testMat.

1. Normal Bayes

    // create the normal Bayes classifier
    Ptr<NormalBayesClassifier> model = NormalBayesClassifier::create();

    // set up the training data
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);

    // train the classifier
    model->train(tData);

    // predict on the test data
    float response = model->predict(testMat);

2. K-nearest neighbors

    Ptr<KNearest> knn = KNearest::create();   // create the KNN classifier
    knn->setDefaultK(K);                      // K: number of neighbours, chosen by the caller
    knn->setIsClassifier(true);
    // set up the training data
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);
    knn->train(tData);
    float response = knn->predict(testMat);

3. Support vector machine

    Ptr<SVM> svm = SVM::create();      // create the classifier
    svm->setType(SVM::C_SVC);          // set the SVM type
    svm->setKernel(SVM::POLY);         // set the kernel function
    svm->setDegree(0.5);
    svm->setGamma(1);
    svm->setCoef0(1);
    svm->setNu(0.5);                   // only used by the NU_SVC/NU_SVR types
    svm->setP(0);                      // only used by the EPS_SVR type
    svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 1000, 0.01));
    svm->setC(C);                      // C: penalty parameter, chosen by the caller
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);
    svm->train(tData);
    float response = svm->predict(testMat);

4. Decision tree

    Ptr<DTrees> dtree = DTrees::create();   // create the classifier
    dtree->setMaxDepth(8);                  // maximum tree depth
    dtree->setMinSampleCount(2);
    dtree->setUseSurrogates(false);
    dtree->setCVFolds(0);                   // cross-validation folds (0 = no pruning)
    dtree->setUse1SERule(false);
    dtree->setTruncatePrunedTree(false);
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);
    dtree->train(tData);
    float response = dtree->predict(testMat);

5. AdaBoost

    Ptr<Boost> boost = Boost::create();
    boost->setBoostType(Boost::DISCRETE);
    boost->setWeakCount(100);
    boost->setWeightTrimRate(0.95);
    boost->setMaxDepth(2);
    boost->setUseSurrogates(false);
    boost->setPriors(Mat());
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);
    boost->train(tData);
    float response = boost->predict(testMat);

6. Gradient boosted trees

This algorithm was commented out in OpenCV 3.0 for reasons unknown, so the code below uses the older API.

    GBTrees::Params params( GBTrees::DEVIANCE_LOSS, // loss_function_type
                            100,   // weak_count
                            0.1f,  // shrinkage
                            1.0f,  // subsample_portion
                            2,     // max_depth
                            false  // use_surrogates
                            );
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);
    Ptr<GBTrees> gbtrees = StatModel::train<GBTrees>(tData, params);
    float response = gbtrees->predict(testMat);

7. Random forest

    Ptr<RTrees> rtrees = RTrees::create();
    rtrees->setMaxDepth(4);
    rtrees->setMinSampleCount(2);
    rtrees->setRegressionAccuracy(0.f);
    rtrees->setUseSurrogates(false);
    rtrees->setMaxCategories(16);
    rtrees->setPriors(Mat());
    rtrees->setCalculateVarImportance(false);
    rtrees->setActiveVarCount(1);
    rtrees->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER, 5, 0));
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);
    rtrees->train(tData);
    float response = rtrees->predict(testMat);

8. Artificial neural networks

    Ptr<ANN_MLP> ann = ANN_MLP::create();
    ann->setLayerSizes(layer_sizes);   // layer_sizes: a row Mat giving the neuron count of each layer
    ann->setActivationFunction(ANN_MLP::SIGMOID_SYM, 1, 1);
    ann->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 300, FLT_EPSILON));
    ann->setTrainMethod(ANN_MLP::BACKPROP, 0.001);
    // note: ANN_MLP expects floating-point responses (e.g. a one-hot CV_32F matrix),
    // not the integer class labels used by the other classifiers
    Ptr<TrainData> tData = TrainData::create(trainingDataMat, ROW_SAMPLE, labelsMat);
    ann->train(tData);
    float response = ann->predict(testMat);

9. EM algorithm: expectation-maximization

The EM algorithm differs slightly from the previous ones: it needs one model per class. trainingDataMat is split into several per-class modelSamples sets, and each set trains its own model.

The core training code is:

    // one EM model per class; assume the labels are 0 .. nmodels-1
    double maxLabel;
    minMaxLoc(labelsMat, 0, &maxLabel);
    int nmodels = (int)maxLabel + 1;
    vector<Ptr<EM> > em_models(nmodels);
    Mat modelSamples;

    for( int i = 0; i < nmodels; i++ )
    {
        const int componentCount = 3;

        modelSamples.release();
        for( int j = 0; j < labelsMat.rows; j++ )
        {
            if( labelsMat.at<int>(j,0) == i )
                modelSamples.push_back(trainingDataMat.row(j));
        }

        // learn the model for class i
        if( !modelSamples.empty() )
        {
            Ptr<EM> em = EM::create();
            em->setClustersNumber(componentCount);
            em->setCovarianceMatrixType(EM::COV_MAT_DIAGONAL);
            em->trainEM(modelSamples, noArray(), noArray(), noArray());
            em_models[i] = em;
        }
    }

Prediction:

    Mat logLikelihoods(1, nmodels, CV_64FC1, Scalar(-DBL_MAX));
    for( int i = 0; i < nmodels; i++ )
    {
        if( !em_models[i].empty() )
            logLikelihoods.at<double>(i) = em_models[i]->predict2(testMat, noArray())[0];
    }

 

Among all these machine learning algorithms, my own view is that for practical purposes only SVM really needs to be mastered.

The ANN algorithm is also called a multi-layer perceptron in OpenCV, so the network must be specified layer by layer before training.

The EM algorithm needs a separate model for each class.

Hands-on code for some of these algorithms: Machine Learning Algorithm Practice in OpenCV 3: Classifying OCR Data

