These are my notes from recently studying "An Introduction to Statistical Learning with Applications in R".
http://www-bcf.usc.edu/~gareth/ISL/
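The book's authors are Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, and it was published on February 11, 2013.
The book gives a systematic introduction to statistics for beginners, especially the various supervised learning methods. Even better, the lab section at the end of each chapter applies R to real problems, and the end-of-chapter exercises also include dedicated R exercises.
Unofficial solutions to the exercises can be found at http://blog.princehonest.com/stat-learning/
Let's get started~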
Contents
- Introduction
- Statistical Learning: basic terminology, the K-nearest neighbor classifier
- Linear Regression
- Classification: logistic regression and linear discriminant analysis (LDA)
- Resampling Methods: cross-validation and the bootstrap
- Linear Model Selection and Regularization: stepwise selection, ridge regression, principal components regression, partial least squares, and the lasso
- Moving Beyond Linearity: non-linear additive models
- Tree-Based Methods: bagging, boosting, and random forests
- Support Vector Machines
- Unsupervised Learning: principal components analysis (PCA), K-means clustering, and hierarchical clustering
A Brief History of Statistical Learning
- 1800's, method of least squares, linear regression
- 1936, Fisher's linear discriminant analysis (LDA)
- 1940's, logistic regression
- 1970's, generalized linear models
- 1980's, classification and regression trees
- 1986, generalized additive models
- today, machine learning