Machine Learning Packages in R


From: http://www.zhizhihu.com/html/y2009/410.html

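Machine learning lies at the intersection of computer science and statistics. R's machine learning packages fall mainly into the following areas: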
1) Neural Networks:
The nnet package fits single-hidden-layer feed-forward neural networks. nnet is part of the VR bundle (http://cran.r-project.org/web/packages/VR/index.html).
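As a minimal sketch of what fitting such a network might look like, the snippet below trains a single-hidden-layer network on R's built-in iris data. The data set, the train/test split, and the hyperparameter values (size, decay, maxit) are illustrative assumptions, not taken from the original article.

```r
library(nnet)

set.seed(1)

# Hold out 50 of the 150 iris rows for testing (an arbitrary split)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

# One hidden layer with 4 units; decay adds weight-decay regularization
fit <- nnet(Species ~ ., data = train, size = 4, decay = 1e-3,
            maxit = 200, trace = FALSE)

# type = "class" returns the predicted class labels as character strings
pred <- predict(fit, newdata = test, type = "class")
acc  <- mean(pred == test$Species)

print(table(predicted = pred, actual = test$Species))
```

The size argument sets the number of hidden units and is the main knob to tune; the tune() function mentioned in section 10 is one way to choose it.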
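2) Recursive Partitioning:
Recursive partitioning uses tree-structured models for regression, classification, and survival analysis. It is implemented mainly in the rpart package (http://cran.r-project.org/web/packages/rpart/index.html) and the tree package (http://cran.r-project.org/web/packages/tree/index.html); rpart is especially recommended. Weka also offers partitioning algorithms of this kind, such as J4.8 (a variant of C4.5) and M5, and the RWeka package provides an interface between R and Weka's functions (http://cran.r-project.org/web/packages/RWeka/index.html).
The party package provides two classes of recursive partitioning algorithms that offer unbiased variable selection and a statistical stopping criterion: ctree() uses nonparametric conditional inference to test the relationship between the predictors and the response, while mob() can be used to build parametric models (http://cran.r-project.org/web/packages/party/index.html). party also provides visualizations of binary trees and of the distribution of the response in each node.
The mvpart package extends rpart to handle multivariate responses (http://cran.r-project.org/web/packages/mvpart/index.html). The rpart.permutation package uses permutation tests to assess the validity of a tree (http://cran.r-project.org/web/packages/rpart.permutation/index.html). The knnTree package builds a classification tree in which each leaf node is a k-nearest-neighbour classifier (http://cran.r-project.org/web/packages/knnTree/index.html). The LogicReg package performs logic regression, intended for cases where most predictors are binary (http://cran.r-project.org/web/packages/LogicReg/index.html). The maptree package (http://cran.r-project.org/web/packages/maptree/index.html) and the pinktoe package (http://cran.r-project.org/web/packages/pinktoe/index.html) provide functions for visualizing tree structures.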
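To illustrate the recommended rpart package, here is a small sketch that grows a classification tree on the built-in iris data. The data set and the use of training accuracy as a quick check are assumptions made for illustration only.

```r
library(rpart)

# method = "class" requests a classification tree
fit <- rpart(Species ~ ., data = iris, method = "class")

# Complexity-parameter table: useful for deciding how far to prune
printcp(fit)

# Predicted class for each training row
pred      <- predict(fit, newdata = iris, type = "class")
train_acc <- mean(pred == iris$Species)
print(train_acc)
```

Training accuracy is optimistic; for an honest estimate, use the cross-validated error reported by printcp() or a held-out test set.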
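3) Random Forests:
The randomForest package provides functions for regression and classification with random forests (http://cran.r-project.org/web/packages/randomForest/index.html). The ipred package applies bagging to regression, classification, and survival analysis, combining multiple models (http://cran.r-project.org/web/packages/ipred/index.html). The party package also provides a random forest variant based on conditional inference trees (http://cran.r-project.org/web/packages/party/index.html). The varSelRF package uses random forests for variable selection (http://cran.r-project.org/web/packages/varSelRF/index.html).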
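A random forest fit with the randomForest package might look like the sketch below. It assumes the package is installed; the iris data and the choice of 200 trees are illustrative and not from the original article.

```r
library(randomForest)

set.seed(42)

# importance = TRUE also records permutation-based variable importance
fit <- randomForest(Species ~ ., data = iris, ntree = 200, importance = TRUE)

# Printing the fit shows the out-of-bag (OOB) error estimate and confusion matrix
print(fit)

# One row per predictor, with per-class and overall importance measures
imp <- importance(fit)
print(imp)
```

The out-of-bag error serves as a built-in estimate of generalization error, so a separate test set is often unnecessary for a first look.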
4) Regularized and Shrinkage Methods:
The lasso2 package (http://cran.r-project.org/web/packages/lasso2/index.html) and the lars package (http://cran.r-project.org/web/packages/lars/index.html) fit regression models whose parameter estimates are subject to constraints. The elasticnet package computes the entire regularization path, that is, the solutions for all values of the shrinkage parameter (http://cran.r-project.org/web/packages/elasticnet/index.html). The glmpath package computes the L1 regularization path for generalized linear models and Cox models (http://cran.r-project.org/web/packages/glmpath/index.html). The penalized package fits lasso (L1) and ridge (L2) penalized regression models (http://cran.r-project.org/web/packages/penalized/index.html). The pamr package implements the shrunken centroids classifier (http://cran.r-project.org/web/packages/pamr/index.html). The earth package fits multivariate adaptive regression splines (http://cran.r-project.org/web/packages/earth/index.html).
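As a rough sketch of a constrained regression fit, the snippet below computes a lasso path with the lars package on simulated data. The simulated design, the true coefficients, and the noise level are all invented for illustration; the package is assumed to be installed.

```r
library(lars)

set.seed(1)

# Simulated data: 8 predictors, only 3 of which truly affect the response
n    <- 100
p    <- 8
x    <- matrix(rnorm(n * p), n, p)
beta <- c(3, -2, 0, 0, 1.5, 0, 0, 0)
y    <- drop(x %*% beta + rnorm(n, sd = 0.5))

# Compute the lasso solution path
fit <- lars(x, y, type = "lasso")

# Coefficient matrix along the path: one row per step, one column per predictor
path <- coef(fit)
print(round(path, 2))
```

Reading the rows from top to bottom shows the order in which predictors enter the model as the constraint is relaxed.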
5) Boosting:
The gbm package (http://cran.r-project.org/web/packages/gbm/index.html) and the boost package (http://cran.r-project.org/web/packages/boost/index.html) implement a variety of gradient boosting algorithms: gbm performs tree-based gradient descent boosting, while boost includes LogitBoost and L2Boost. The GAMBoost package fits generalized additive models by boosting (http://cran.r-project.org/web/packages/GAMBoost/index.html). The mboost package performs model-based boosting (http://cran.r-project.org/web/packages/mboost/index.html).
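The following sketch shows one way tree-based gradient boosting with gbm could be run on a simulated regression problem. The data-generating function and all tuning values (n.trees, interaction.depth, shrinkage) are assumptions chosen for illustration, and the package is assumed to be installed.

```r
library(gbm)

set.seed(1)

# Simulated regression data with a nonlinear effect of x1
n   <- 500
d   <- data.frame(x1 = runif(n), x2 = runif(n))
d$y <- sin(2 * pi * d$x1) + d$x2 + rnorm(n, sd = 0.2)

# Squared-error loss ("gaussian") with small trees and a modest learning rate
fit <- gbm(y ~ x1 + x2, data = d, distribution = "gaussian",
           n.trees = 300, interaction.depth = 2,
           shrinkage = 0.05, bag.fraction = 0.5)

# Predictions using all 300 trees, and the resulting training RMSE
pred <- predict(fit, newdata = d, n.trees = 300)
rmse <- sqrt(mean((pred - d$y)^2))
print(rmse)
```

In practice the number of trees is usually chosen by cross-validation or a held-out set rather than fixed in advance.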
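6) Support Vector Machines:
The svm() function in the e1071 package provides an interface between R and LIBSVM (http://cran.r-project.org/web/packages/e1071/index.html). The kernlab package provides a flexible framework for kernel-based learning methods, including SVMs, RVMs, and more (http://cran.r-project.org/web/packages/kernlab/index.html). The klaR package provides an interface between R and SVMlight (http://cran.r-project.org/web/packages/klaR/index.html).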
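A minimal sketch of svm() from e1071 follows. The iris data, the radial kernel, and cost = 1 are illustrative choices rather than recommendations from the original article; the package is assumed to be installed.

```r
library(e1071)

# Radial-basis-function kernel SVM for three-class classification
fit <- svm(Species ~ ., data = iris, kernel = "radial", cost = 1)

# Predicted classes on the training data
pred      <- predict(fit, newdata = iris)
train_acc <- mean(pred == iris$Species)

print(table(predicted = pred, actual = iris$Species))
```

The cost and gamma parameters strongly affect the fit, which is why section 10 mentions tune() for selecting them.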
7) Bayesian Methods:
The BayesTree package implements the Bayesian Additive Regression Trees (BART) algorithm (http://cran.r-project.org/web/packages/BayesTree/index.html; http://www-stat.wharton.upenn.edu/~edgeorge/Research_papers/BART%206--06.pdf). The tgp package performs Bayesian nonstationary, semiparametric nonlinear regression (http://cran.r-project.org/web/packages/tgp/index.html).
8) Optimization using Genetic Algorithms:
The gafit package (http://cran.r-project.org/web/packages/gafit/index.html) and the rgenoud package (http://cran.r-project.org/web/packages/rgenoud/index.html) provide optimization routines based on genetic algorithms.
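9) Association Rules:
The arules package provides data structures for handling sparse binary data efficiently, together with functions that run the Apriori and Eclat algorithms to mine frequent itemsets, maximal frequent itemsets, closed frequent itemsets, and association rules (http://cran.r-project.org/web/packages/arules/index.html).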
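To make the workflow concrete, here is a small sketch that mines rules from a toy set of shopping baskets with apriori(). The baskets and the support and confidence thresholds are invented for illustration; the package is assumed to be installed.

```r
library(arules)

# Five toy market baskets (hypothetical data)
baskets <- list(
  c("bread", "milk"),
  c("bread", "diapers", "beer", "eggs"),
  c("milk", "diapers", "beer", "cola"),
  c("bread", "milk", "diapers", "beer"),
  c("bread", "milk", "diapers", "cola")
)

# Convert to the sparse "transactions" representation used by arules
trans <- as(baskets, "transactions")

# Mine rules with at least 40% support and 80% confidence;
# minlen = 2 excludes rules with an empty left-hand side
rules <- apriori(trans,
                 parameter = list(support = 0.4, confidence = 0.8, minlen = 2),
                 control   = list(verbose = FALSE))

# Show the rules, strongest lift first
inspect(sort(rules, by = "lift"))
```

With these thresholds the rule {beer} => {diapers} qualifies, since every basket containing beer also contains diapers.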
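10) Model Selection and Validation:
The tune() function in the e1071 package selects suitable parameter values from a specified range (http://cran.r-project.org/web/packages/e1071/index.html). The errorest() function in the ipred package estimates the classification error rate by resampling (cross-validation, bootstrap) (http://cran.r-project.org/web/packages/ipred/index.html). The functions in the svmpath package can be used to choose the cost parameter C of a support vector machine (http://cran.r-project.org/web/packages/svmpath/index.html). The ROCR package provides functions for visualizing classifier performance, such as plotting ROC curves (http://cran.r-project.org/web/packages/ROCR/index.html). The caret package provides a range of functions for building predictive models, including parameter tuning and variable importance measures (http://cran.r-project.org/web/packages/caret/index.html). The caretLSF package (http://cran.r-project.org/web/packages/caretLSF/index.html) and the caretNWS package (http://cran.r-project.org/web/packages/caretNWS/index.html) provide functionality similar to caret.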
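A grid search with tune() could be sketched as follows. The parameter grid, the 5-fold cross-validation setting, and the iris data are assumptions for illustration; the e1071 package is assumed to be installed.

```r
library(e1071)

set.seed(1)

# Search a 3 x 3 grid of cost and gamma values using 5-fold cross-validation
tuned <- tune(svm, Species ~ ., data = iris,
              ranges      = list(cost = c(0.1, 1, 10), gamma = c(0.01, 0.1, 1)),
              tunecontrol = tune.control(sampling = "cross", cross = 5))

# Best parameter combination and the error for every grid point
print(tuned$best.parameters)
print(tuned$performances)
```

The best.model component of the result holds a model refitted on the full data with the selected parameters, so it can be passed straight to predict().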
11) Elements of Statistical Learning:
The data sets, functions, and examples from the book "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" (http://www-stat.stanford.edu/~tibs/ElemStatLearn/) are bundled in the ElemStatLearn package (http://cran.r-project.org/web/packages/ElemStatLearn/index.html).

URL: http://cran.r-project.org/web/views/MachineLearning.html
Maintainer: Torsten Hothorn

