Hands-on Machine Learning with Scikit-Learn and TensorFlow


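Preface

         The Machine Learning Tsunami

         Machine Learning in Your Projects

         Objective and Approach

         Prerequisites

         Roadmap

         Other Resources

         Conventions Used in This Book

         Using Code Examples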

         O’Reilly Safari

         How to Contact Us

         Acknowledgments

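Part I. The Fundamentals of Machine Learning

         Chapter 1. The Machine Learning Landscape

                   What Is Machine Learning?

                   Why Use Machine Learning?

                   Types of Machine Learning Systems

                            Supervised and Unsupervised Learning

                                     Supervised Learning

                                     Unsupervised Learning

                                     Semisupervised Learning

                                     Reinforcement Learning

                            Batch and Online Learning

                                     Batch Learning

                                     Online Learning

                            Instance-Based vs. Model-Based Learning

                                     Instance-Based Learning

                                     Model-Based Learning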

                                               Example 1-1. Training and running a linear model using Scikit-Learn

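                   Main Challenges of Machine Learning

                            Insufficient Training Data

                            Nonrepresentative Training Data

                                     A Famous Example of Sampling Bias

                            Poor-Quality Data

                            Irrelevant Features

                            Overfitting the Training Data

                            Underfitting the Training Data

                            Stepping Back

                            Testing and Validating

                                     No Free Lunch Theorem

                            Exercises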

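         Chapter 2. End-to-End Machine Learning Project

                   Working with Real Data

                   Look at the Big Picture

                            Frame the Problem

                                     Pipelines

                            Performance Measure

                                     Notation

                            Check the Assumptions

                   Get the Data

                            Create the Workspace

                                     Creating an Isolated Environment

                            Download the Data

                            A Quick Look at the Data

                            Create a Test Set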

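                   Analyze and Visualize the Data

                            Visualizing Geographical Data

                            Looking for Correlations

                            Experimenting with Feature Combinations

                   Prepare the Data for Machine Learning Algorithms

                            Data Cleaning

                            Handling Text and Categorical Features

                            Custom Transformers

                            Feature Scaling

                            Transformation Pipelines

                   Select and Train a Model

                            Training and Evaluating on the Training Set

                            Better Evaluation Using Cross-Validation

                   Fine-Tune Your Model

                            Grid Search

                            Randomized Search

                            Ensemble Methods

                            Analyze the Errors of the Best Model

                            Evaluate Your System on the Test Set

                   Launch, Monitor, and Maintain Your Machine Learning System

                   Try It Out!

                   Exercises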

         Chapter 3. Classification

                   The MNIST Dataset

                            from sklearn.datasets import fetch_mldata

                   Training a Binary Classifier

                            from sklearn.linear_model import SGDClassifier

                   Performance Measures

                            Measuring Accuracy Using Cross-Validation

                                     Implementing Cross-Validation

                                               from sklearn.model_selection import StratifiedKFold

                                               from sklearn.base import clone

                                               from sklearn.model_selection import cross_val_score

                                               from sklearn.base import BaseEstimator

                            Confusion Matrix

                                     from sklearn.model_selection import cross_val_predict

                                     from sklearn.metrics import confusion_matrix

                            Precision and Recall

                                     from sklearn.metrics import precision_score, recall_score

                                     from sklearn.metrics import f1_score

                            The Precision/Recall Tradeoff

                                     from sklearn.metrics import precision_recall_curve

                            The ROC Curve

                                     from sklearn.metrics import roc_curve

                                     from sklearn.metrics import roc_auc_score

                                     from sklearn.ensemble import RandomForestClassifier

                   Multiclass Classification

                            from sklearn.multiclass import OneVsOneClassifier

                            from sklearn.preprocessing import StandardScaler

                   Error Analysis

                   Multilabel Classification

                            from sklearn.neighbors import KNeighborsClassifier

                   Multioutput Classification

                   Exercises

         Chapter 4. Training Models

                   Linear Regression

                            The Normal Equation

                                     from sklearn.linear_model import LinearRegression

                            Computational Complexity

                   Gradient Descent

                            Batch Gradient Descent

                                     Convergence Rate

                            Stochastic Gradient Descent

                                     from sklearn.linear_model import SGDRegressor

                            Mini-Batch Gradient Descent

                   Polynomial Regression

                            from sklearn.preprocessing import PolynomialFeatures

                   Learning Curves

                            from sklearn.metrics import mean_squared_error

                            from sklearn.model_selection import train_test_split

                            from sklearn.pipeline import Pipeline

                            The Bias/Variance Tradeoff

                   Regularized Linear Models

                            Ridge Regression

                                     from sklearn.linear_model import Ridge

                            Lasso Regression

                            Elastic Net

                                     from sklearn.linear_model import ElasticNet

                            Early Stopping

                                     from sklearn.base import clone

                   Logistic Regression

                            Estimating Probabilities

                            Model Training and Cost Function

                            Decision Boundaries

                                     from sklearn import datasets

                                     from sklearn.linear_model import LogisticRegression

                            Softmax Regression

                                     Cross-Entropy Loss

                   Exercises

         Chapter 5. Support Vector Machines

                   Linear SVM Classification

                            Soft Margin Classification

                                     from sklearn.svm import LinearSVC

                   Nonlinear SVM Classification

                            Polynomial Kernel

                                     from sklearn.svm import SVC

                            Adding Similarity Features

                            Gaussian RBF Kernel

                            Computational Complexity

                   SVM Regression

                            from sklearn.svm import LinearSVR

                   Under the Hood

                            Decision Function and Predictions

                            Training Objective

                            Quadratic Programming

                            The Dual Problem

                            Kernelized SVM

                            Online SVMs

                                     Hinge Loss

                   Exercises

         Chapter 6. Decision Trees

                   Training and Visualizing a Decision Tree

                            from sklearn.tree import DecisionTreeClassifier

                            from sklearn.tree import export_graphviz

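                   Making Predictions

                            Model Interpretability: White Box vs. Black Box

                   Estimating Class Probabilities

                   The CART Training Algorithm

                   Computational Complexity

                   Gini Impurity or Entropy?

                   Regularization Hyperparameters

                   Decision Tree Regression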

                            from sklearn.tree import DecisionTreeRegressor

                   Instability

                   Exercises

         Chapter 7. Ensemble Learning and Random Forests

                   Voting Classifiers

                            from sklearn.ensemble import RandomForestClassifier

                            from sklearn.ensemble import VotingClassifier

                            from sklearn.linear_model import LogisticRegression

                            from sklearn.svm import SVC

                            from sklearn.metrics import accuracy_score

                   Bagging and Pasting

                            Bagging and Pasting in Scikit-Learn

                                     from sklearn.ensemble import BaggingClassifier

                            Out-of-Bag Evaluation

                                     from sklearn.metrics import accuracy_score

                   Random Patches and Random Subspaces

                   Random Forests

                            Extra-Trees

                            Feature Importance

                                     from sklearn.ensemble import RandomForestClassifier

                   Boosting

                            AdaBoost

                                     from sklearn.ensemble import AdaBoostClassifier

                            Gradient Boosting

                                     from sklearn.tree import DecisionTreeRegressor

                                     from sklearn.ensemble import GradientBoostingRegressor

                   Stacking

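                   Exercises

         Chapter 8. Dimensionality Reduction

                   The Curse of Dimensionality

                   Main Approaches for Dimensionality Reduction

                            Projection

                            Manifold Learning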

                   PCA

                            Preserving the Variance

                            Principal Components

                            Projecting Down to d Dimensions

                            Using Scikit-Learn

                                     from sklearn.decomposition import PCA

                            Explained Variance Ratio

                            Choosing the Right Number of Dimensions

                            PCA for Compression

                            Incremental PCA

                                     from sklearn.decomposition import IncrementalPCA

                            Randomized PCA

                            Kernel PCA

                                     from sklearn.decomposition import KernelPCA

                                     Selecting a Kernel and Tuning Hyperparameters

                            Locally Linear Embedding

                                     from sklearn.manifold import LocallyLinearEmbedding

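                            Other Dimensionality Reduction Techniques

                            Exercises

Part II. Neural Networks and Deep Learning

         Chapter 9. Up and Running with TensorFlow

                   Installation

                   Creating Your First Graph and Running It in a Session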

                            import tensorflow as tf

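                   Managing Graphs

                   Lifecycle of a Node Value

                   Linear Regression with TensorFlow

                   Implementing Gradient Descent with TensorFlow

                            Manually Computing the Gradients

                            Using Autodiff

                            Using an Optimizer

                   Feeding Data to the Training Algorithm

                   Saving and Restoring Models

                   Visualizing the Graph and Training Curves Using TensorBoard

                   Name Scopes

                   Modularity

                   Sharing Variables

                   Exercises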

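         Chapter 10. Artificial Neural Networks

                   From Biological to Artificial Neurons

                            Biological Neurons

                            Logical Computations with Neurons

                            The Perceptron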

                                     from sklearn.linear_model import Perceptron

                            Multi-Layer Perceptron and Backpropagation

                   Training an MLP with TensorFlow's High-Level API

                   Training a DNN Using Plain TensorFlow

                            Construction Phase

                                     from tensorflow.contrib.layers import fully_connected

                            Execution Phase

                                     from tensorflow.examples.tutorials.mnist import input_data

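                            Using the Neural Network

                   Fine-Tuning Neural Network Hyperparameters

                            Number of Hidden Layers

                            Number of Neurons per Hidden Layer

                            Activation Functions

                   Exercises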

         Chapter 11. Training Deep Neural Networks

                   Vanishing and Exploding Gradients

                            Xavier and He Initialization

                            Nonsaturating Activation Functions

                            Batch Normalization

                                     Implementing Batch Normalization with TensorFlow

                                               from tensorflow.contrib.layers import batch_norm

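                            Gradient Clipping

                   Reusing Pretrained Layers

                            Reusing a TensorFlow Model

                            Reusing Models from Other Frameworks

                            Freezing the Lower Layers

                            Caching the Frozen Layers

                            Tweaking, Dropping, or Replacing the Upper Layers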

                            Model Zoos

                            Unsupervised Pretraining

                            Pretraining on an Auxiliary Task

                   Faster Optimizers

                            Momentum Optimization

                            Nesterov Accelerated Gradient

                            AdaGrad

                            RMSProp

                            Adam Optimization

                                     Training Sparse Models

                            Learning Rate Scheduling

                   Avoiding Overfitting Through Regularization

                            Early Stopping

                            ℓ1 and ℓ2 Regularization

                            Dropout

                                     from tensorflow.contrib.layers import dropout

                            Max-Norm Regularization

                            Data Augmentation

                   Practical Guidelines

                   Exercises

 

