spark-sklearn (scikit-learn integration for Spark)

Reposted article. 2017-08-07 09:23. Category: 5.Spark-Learning

(1) Official installation requirements. According to the official documentation, this package has the following requirements:

- A recent version of scikit-learn. Version 0.17 has been tested; older versions may work too.
- Spark >= 2.0. Spark may be downloaded from the [Spark official website](http://spark.apache.org/). In order to use spark-sklearn, you need to use the pyspark interpreter or another Spark-compliant Python interpreter. See the [Spark guide](https://spark.apache.org/docs/latest/programming-guide.html#overview) for more details.
- [nose](https://nose.readthedocs.org) (testing dependency only)
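Before installing, it can help to check which versions are already present. The following is a minimal sketch, not part of spark-sklearn: the helper names `version_tuple`, `meets_minimum`, and `report` are hypothetical, and treating 0.17 as a hard minimum for scikit-learn is an assumption (the requirements only say that 0.17 has been tested).

```python
import re


def version_tuple(version):
    """Turn '2.1.0' or '0.18.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    have, need = version_tuple(installed), version_tuple(minimum)
    # Pad with zeros so that '2.0' and '2.0.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def report():
    """Return (module, installed version or None, ok) for each requirement."""
    results = []
    # Assumed thresholds: scikit-learn 0.17 (the tested version), Spark 2.0.
    for module_name, minimum in (("sklearn", "0.17"), ("pyspark", "2.0")):
        try:
            module = __import__(module_name)
        except ImportError:
            results.append((module_name, None, False))
            continue
        installed = getattr(module, "__version__", "0")
        results.append((module_name, installed, meets_minimum(installed, minimum)))
    return results


if __name__ == "__main__":
    for name, installed, ok in report():
        print(name, installed, "OK" if ok else "missing or too old")
```

A module that cannot be imported is reported as missing rather than raising an error, so the script can run before anything is installed.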

(2) First, install pyspark:

For reference, see this blog post: http://www.cnblogs.com/jackchen-Net/p/6667205.html#_label5

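(3) Visit the package page: https://pypi.python.org/pypi/spark-sklearn

spark-sklearn integrates scikit-learn with Spark, which greatly simplifies the work of Python data scientists: the package can automatically distribute the computation for model parameter tuning across a Spark cluster.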
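To see why parameter tuning distributes well, note that a parameter grid expands into combinations that can each be fitted independently of the others. The sketch below only illustrates that expansion in plain Python; it is not spark-sklearn's implementation, and the function name `expand_grid` and the sample grid are assumptions made for illustration.

```python
from itertools import product


def expand_grid(param_grid):
    """List every parameter combination in a grid as a separate dict."""
    names = sorted(param_grid)
    combos = product(*(param_grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


# Hypothetical grid, chosen only for illustration.
grid = {"kernel": ("linear", "rbf"), "C": [1, 10]}
tasks = expand_grid(grid)

# Each dict is one independent unit of work that could run on a different worker.
for task in tasks:
    print(task)
```

With two kernels and two values of `C`, the grid yields four combinations. In a cross-validated search each combination is additionally fitted once per fold, so the number of independent fits grows quickly with the size of the grid.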

(4) Testing the example from the official documentation

## Example

Here is a simple example that runs a grid search with Spark. See the [Installation](#Installation) section on how to install spark-sklearn.

```python
from sklearn import svm, datasets
from spark_sklearn import GridSearchCV

# `sc` is the SparkContext; the pyspark shell creates it automatically.
iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
svr = svm.SVC()
# The SparkContext is passed as the first argument, before the estimator.
clf = GridSearchCV(sc, svr, parameters)
clf.fit(iris.data, iris.target)
```

This classifier can be used as a drop-in replacement for any scikit-learn classifier, with the same API.

