Spark Machine Learning (2): The Logistic Regression Algorithm

Reposted article (2017-06-15 17:55). Tags: MLlib, LogisticRegressionWithSGD, Spark, Machine Learning, Computer Vision, LogisticRegression, LogisticRegressionWithLBFGS

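Logistic regression is, at its core, also a linear model. The difference is that ordinary linear regression maps features directly to a continuous output, whereas logistic regression passes that continuous value through an additional function g(z), the sigmoid g(z) = 1 / (1 + e^(-z)), which maps it into the interval (0, 1). The result is read as a probability and thresholded to a class label of 0 or 1.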
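As a rough illustration of the mapping described above, the following plain-Scala sketch applies g(z) to a linear score and thresholds the result. The object name `SigmoidSketch`, the helper names, and the 0.5 threshold are assumptions made for this example; none of them is taken from MLlib.

```scala
object SigmoidSketch {
  // Logistic (sigmoid) function: maps any real z into the interval (0, 1).
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Linear score w . x + b, the same quantity ordinary linear regression outputs.
  def score(weights: Array[Double], intercept: Double, x: Array[Double]): Double = {
    require(weights.length == x.length, "weights and features must have the same length")
    weights.zip(x).map { case (w, xi) => w * xi }.sum + intercept
  }

  // Class label: 1.0 if the estimated probability exceeds the threshold, else 0.0.
  // The default threshold of 0.5 is an assumption of this sketch.
  def predict(weights: Array[Double], intercept: Double, x: Array[Double],
              threshold: Double = 0.5): Double =
    if (sigmoid(score(weights, intercept, x)) > threshold) 1.0 else 0.0
}
```

For instance, `SigmoidSketch.sigmoid(0.0)` evaluates to 0.5, which is exactly the boundary between the two classes under the assumed threshold.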

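MLlib provides two logistic regression classes: LogisticRegressionWithSGD and LogisticRegressionWithLBFGS. The former is based on stochastic gradient descent and supports only binary classification; the latter optimizes the loss function with L-BFGS and supports multiclass classification.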
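To make the idea of stochastic gradient descent on the logistic loss more concrete, here is a toy, single-machine sketch in plain Scala. It is not MLlib's implementation: the object name `SgdLogisticSketch`, the step size, and the number of epochs are all assumptions chosen for illustration, and the weights are updated after every single sample in a fixed order.

```scala
object SgdLogisticSketch {
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Linear score w . x + b.
  private def score(w: Array[Double], b: Double, x: Array[Double]): Double =
    w.indices.map(j => w(j) * x(j)).sum + b

  // Trains binary logistic regression; data holds (label in {0.0, 1.0}, features).
  // stepSize and epochs are illustrative defaults, not tuned values.
  def train(data: Seq[(Double, Array[Double])],
            stepSize: Double = 0.1,
            epochs: Int = 200): (Array[Double], Double) = {
    require(data.nonEmpty, "training data must not be empty")
    val w = Array.fill(data.head._2.length)(0.0)
    var b = 0.0
    for (_ <- 0 until epochs) {
      data.foreach { case (y, x) =>
        // Derivative of the log loss with respect to the score: g(z) - y.
        val err = sigmoid(score(w, b, x)) - y
        for (j <- w.indices) w(j) -= stepSize * err * x(j)
        b -= stepSize * err
      }
    }
    (w, b)
  }

  // Predicts 1.0 when the estimated probability exceeds 0.5, else 0.0.
  def predict(w: Array[Double], b: Double, x: Array[Double]): Double =
    if (sigmoid(score(w, b, x)) > 0.5) 1.0 else 0.0
}
```

On a small, linearly separable data set (negative features labeled 0.0, positive features labeled 1.0) this sketch learns a positive weight and separates the two groups. A quasi-Newton method such as L-BFGS minimizes the same loss but uses curvature information rather than a fixed step size.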

Here is the code:

import org.apache.log4j.{Level, Logger}
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.mllib.regression.LabeledPoint

object LogisticRegression {
  def main(args: Array[String]): Unit = {
    // Set up the runtime environment
    val conf = new SparkConf().setAppName("Logistic Regression Test")
      .setMaster("spark://master:7077").setJars(Seq("E:\\Intellij\\Projects\\MachineLearning\\MachineLearning.jar"))
    val sc = new SparkContext(conf)
    Logger.getRootLogger.setLevel(Level.WARN)

    // Load the sample data (LIBSVM format) as an RDD[LabeledPoint]
    val dataRDD = MLUtils.loadLibSVMFile(sc, "hdfs://master:9000/ml/data/sample_libsvm_data.txt")

    // Split the data: 70% for training, 30% for testing
    val dataParts = dataRDD.randomSplit(Array(0.7, 0.3), seed = 25L)
    val trainRDD = dataParts(0).cache()
    val testRDD = dataParts(1)

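    // Build and train the logistic regression model.
    // numClasses must cover every label in the data; 10 is kept from the original,
    // although 2 is enough when the labels are only 0 and 1.
    val LRModel = new LogisticRegressionWithLBFGS().setNumClasses(10).run(trainRDD)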

    // Predict on the test samples and pair each prediction with its true label
    val prediction = testRDD.map {
      case LabeledPoint(label, features) =>
        val predicted = LRModel.predict(features)
        (predicted, label)
    }
    val showPrediction = prediction.take(10)
    // Print the first 10 (prediction, label) pairs
    println("Prediction" + "\t" + "Label")
    showPrediction.foreach { case (predicted, label) =>
      println(s"$predicted\t$label")
    }

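    // Compute and print the overall accuracy.
    // The no-argument `precision` is deprecated since Spark 2.0 and removed in 3.0;
    // `accuracy` (available from Spark 2.0) returns the same value.
    val metrics = new MulticlassMetrics(prediction)
    val accuracy = metrics.accuracy
    println("Accuracy = " + accuracy)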
  }

}
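The single number printed at the end of the listing is, as far as I understand MulticlassMetrics, the fraction of test samples whose predicted label equals the true label. The plain-Scala sketch below computes that same quantity from a local collection of (prediction, label) pairs. The object name `AccuracySketch` and the sample pairs in the usage note are hypothetical and are not output from the program above.

```scala
object AccuracySketch {
  // Fraction of (prediction, label) pairs in which the two entries agree.
  def accuracy(predictionAndLabels: Seq[(Double, Double)]): Double = {
    require(predictionAndLabels.nonEmpty, "need at least one (prediction, label) pair")
    val correct = predictionAndLabels.count { case (predicted, label) => predicted == label }
    correct.toDouble / predictionAndLabels.size
  }
}
```

With the made-up pairs `Seq((1.0, 1.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0))`, three of four predictions match, so `AccuracySketch.accuracy` returns 0.75.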

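Results of the run: [output screenshot not preserved in this copy]

The output showed that the model's predictions were very accurate.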
