[Spark Machine Learning Crash Course] Models 08: Isotonic Regression (Python version)


Contents

  Isotonic regression: principles

  Isotonic regression code (Spark Python)


 

Isotonic regression: principles

   To be continued...
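
As a stopgap for the principles, here is a minimal sketch, not Spark's implementation. Isotonic regression looks for a non-decreasing sequence of fitted values that minimizes the weighted squared error against the labels once the points are sorted by feature. One standard way to compute it is the pool adjacent violators algorithm (PAVA), which merges neighbouring blocks whenever their means are out of order. The function name `pava` and the sample points are hypothetical; the sketch assumes positive weights and does not treat tied feature values specially.

```python
def pava(points):
    """Fit a non-decreasing sequence to (label, feature, weight) tuples.

    Returns (feature, fitted_value) pairs sorted by feature.
    Assumes every weight is positive (an assumption of this sketch).
    """
    ordered = sorted(points, key=lambda p: p[1])
    # Each block is [weighted mean, total weight, number of points].
    blocks = []
    for label, _feature, weight in ordered:
        blocks.append([label, weight, 1])
        # Merge backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, n2 = blocks.pop()
            mean1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(mean1 * w1 + mean2 * w2) / total, total, n1 + n2])
    # Expand each block back to one fitted value per input point.
    fitted = []
    for mean, _weight, count in blocks:
        fitted.extend([mean] * count)
    return [(p[1], f) for p, f in zip(ordered, fitted)]


# Hypothetical data: the labels 3.0 and 2.0 violate the ordering and get pooled.
example = [(1.0, 0.0, 1.0), (3.0, 1.0, 1.0), (2.0, 2.0, 1.0), (4.0, 3.0, 1.0)]
print(pava(example))  # [(0.0, 1.0), (1.0, 2.5), (2.0, 2.5), (3.0, 4.0)]
```

The tuples use the same (label, feature, weight) order as the Spark code in the next section, so the weight of 1.0 set there corresponds to an unweighted fit.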


 

Isotonic regression code (Spark Python)

  

  Data used in the code: https://pan.baidu.com/s/1jHWKG4I password: acq1

 

# -*- coding: utf-8 -*-
from pyspark import SparkConf, SparkContext
sc = SparkContext('local')

import math
from pyspark.mllib.regression import LabeledPoint, IsotonicRegression, IsotonicRegressionModel
from pyspark.mllib.util import MLUtils

# Load and parse the data: turn each LabeledPoint into a (label, feature, weight) tuple
def parsePoint(labeledData):
    return (labeledData.label, labeledData.features[0], 1.0)

data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_isotonic_regression_libsvm_data.txt")

# Create (label, feature, weight) tuples from the input data, with weight set to the default value 1.0.
parsedData = data.map(parsePoint)

# Split data into training (60%) and test (40%) sets, using 11 as the random seed.
training, test = parsedData.randomSplit([0.6, 0.4], 11)

# Create isotonic regression model from training data.
# isotonic defaults to True (non-decreasing fit); it is passed explicitly here only for demonstration.
model = IsotonicRegression.train(training, isotonic=True)

# Create tuples of (predicted label, real label).
predictionAndLabel = test.map(lambda p: (model.predict(p[1]), p[0]))

# Calculate mean squared error between predicted and real labels.
meanSquaredError = predictionAndLabel.map(lambda pl: math.pow((pl[0] - pl[1]), 2)).mean()
# The original run printed: Mean Squared Error = 0.00863040529956
print("Mean Squared Error = " + str(meanSquaredError))

# Save and load the model
model.save(sc, "myIsotonicRegressionModel")
sameModel = IsotonicRegressionModel.load(sc, "myIsotonicRegressionModel")
# Predict for the single feature of the first record (the original run printed 0.14987251)
print(sameModel.predict(data.first().features[0]))
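
To see what `predict` does with a single feature value, the following Spark-free sketch may help. A fitted PySpark `IsotonicRegressionModel` exposes `boundaries` and `predictions` arrays, and a prediction is a piecewise-linear interpolation between them, clamped to the first or last prediction outside the training range. The function name `interpolate_prediction` and the sample arrays are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left


def interpolate_prediction(x, boundaries, predictions):
    """Piecewise-linear prediction; boundaries must be sorted ascending."""
    # Below or above the training range: clamp to the end values.
    if x <= boundaries[0]:
        return predictions[0]
    if x >= boundaries[-1]:
        return predictions[-1]
    i = bisect_left(boundaries, x)
    # Exact match with a boundary: return its prediction directly.
    if boundaries[i] == x:
        return predictions[i]
    # Otherwise interpolate linearly between the two neighbouring boundaries.
    x0, x1 = boundaries[i - 1], boundaries[i]
    y0, y1 = predictions[i - 1], predictions[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical model with three boundaries.
boundaries = [0.0, 1.0, 3.0]
predictions = [0.0, 0.5, 1.0]
print(interpolate_prediction(2.0, boundaries, predictions))  # 0.75
```

Because the predictions are non-decreasing and the interpolation is linear, the resulting prediction function is itself non-decreasing in the feature.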

 


 

