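A Comparative Study of Machine Learning Methods on a Handwritten Digit Recognition Dataset
Abstract
Significance: Both statistical machine learning and deep learning are widely used.
Mainstream approach: Run comparative experiments on the same dataset.
Gap in prior work: Within the scope of our literature search, we found no work that compares statistical learning methods with deep learning methods.
Our approach: On the handwritten digit recognition dataset (MNIST), this paper compares the performance of mainstream statistical machine learning methods and deep learning methods.
Results: The experiments show that deep learning performs better on MNIST, with an accuracy of 97.50%; the statistical machine learning method (SVM, kernel approximation + LinearSVC) reaches an accuracy of 93.71%.
Keywords: handwritten digit recognition, MNIST, DNN, SVM, statistical machine learning, deep learning
Experiments
Experimental Setup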
Epoch : 10
Train Data Sample : 60000
Test Data Sample : 10000
Image Shape : (28, 28, 1)
Results
Prediction Performance
| Method | Acc on Train | Acc on Test | Parameters |
|---|---|---|---|
| DNN | 0.9950 | 0.9808 | 1,238,730 |
| CNN+MaxPooling | 0.9906 | 0.9742 | 1,332,810 |
| kernel approximation + LinearSVC | 0.9378 | 0.9371 | N/A |
| SVC | 0.9899 | 0.9792 | N/A |
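The parameter counts in the table can be cross-checked by hand. The sketch below assumes the layer sizes used in the DNN and CNN + MaxPooling scripts in the code section (784-784-784-10 dense layers; 784 convolution filters of size 3x3 with no padding, then 2x2 max pooling and a 10-way dense layer). The helper names `dense_params` and `conv2d_params` are illustrative and not part of the original code.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolution layer."""
    return kernel_h * kernel_w * in_channels * filters + filters


# DNN: 784 -> 784 -> 784 -> 10
dnn_total = (
    dense_params(784, 784)
    + dense_params(784, 784)
    + dense_params(784, 10)
)

# CNN: Conv2D(784 filters, 3x3, no padding) -> MaxPooling2D(2x2) -> Flatten -> Dense(10)
conv_out = 28 - 3 + 1   # spatial size after the convolution: 26
pooled = conv_out // 2  # spatial size after 2x2 pooling: 13
cnn_total = conv2d_params(3, 3, 1, 784) + dense_params(pooled * pooled * 784, 10)

print(f"DNN: {dnn_total:,}")  # DNN: 1,238,730
print(f"CNN: {cnn_total:,}")  # CNN: 1,332,810
```

Almost all of the CNN's parameters sit in the final dense layer, because the flattened feature map has 13 x 13 x 784 = 132,496 entries.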
Execution Efficiency
Hardware: CPU with 80 threads, 128 GB RAM, SSD
| Method | Training and Inference Time |
|---|---|
| DNN | 0m 38.849s |
| CNN+MaxPooling | 11m 19.786s |
| kernel approximation + LinearSVC | 0m 20.889s |
| SVC | 10m 54.445s |
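To compare the timings on a single scale, the durations above can be converted to seconds. This is a minimal sketch that only re-expresses the numbers in the table; the helper `to_seconds` is a hypothetical name introduced for illustration.

```python
import re


def to_seconds(text):
    """Convert a duration such as '11m 19.786s' to seconds."""
    match = re.fullmatch(r"(\d+)m\s*([\d.]+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Values copied from the execution efficiency table.
timings = {
    "DNN": "0m 38.849s",
    "CNN+MaxPooling": "11m 19.786s",
    "kernel approximation + LinearSVC": "0m 20.889s",
    "SVC": "10m 54.445s",
}

seconds = {name: to_seconds(t) for name, t in timings.items()}
fastest = min(seconds, key=seconds.get)

# Print methods from fastest to slowest, with the slowdown relative to the fastest.
for name, s in sorted(seconds.items(), key=lambda kv: kv[1]):
    print(f"{name}: {s:.3f}s ({s / seconds[fastest]:.1f}x the fastest)")
```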
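Conclusions
1. Given sufficient data, deep learning methods can achieve higher accuracy than statistical learning methods (DNN: 0.9808 vs. SVC: 0.9792 on the test set);
2. Under the current experimental setup, the CNN+MaxPooling method overfits (train accuracy 0.9906 vs. test accuracy 0.9742);
3. Under the current experimental setup, DNN consistently outperforms CNN+MaxPooling;
4. The SVM with a built-in kernel (SVC) predicts better than the combination of kernel approximation and a linear SVM;
5. The training and inference time of the SVM with a built-in kernel is far higher than that of the kernel approximation + linear SVM combination, higher than that of DNN, and slightly lower than that of CNN.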
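The conclusions about overfitting and accuracy can be read off the prediction performance table. The sketch below computes the gap between train and test accuracy for each method; using this gap as an overfitting indicator is an assumption made here for illustration, not a criterion stated in the original experiments.

```python
# (train accuracy, test accuracy), copied from the prediction performance table.
results = {
    "DNN": (0.9950, 0.9808),
    "CNN+MaxPooling": (0.9906, 0.9742),
    "kernel approximation + LinearSVC": (0.9378, 0.9371),
    "SVC": (0.9899, 0.9792),
}

# A larger train-minus-test gap suggests more overfitting.
gaps = {name: train - test for name, (train, test) in results.items()}
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: train - test = {gap:.4f}")

largest_gap = max(gaps, key=gaps.get)
best_test = max(results, key=lambda name: results[name][1])
print("largest gap:", largest_gap)
print("best test accuracy:", best_test)
```

With these numbers, CNN+MaxPooling has the largest gap and DNN has the highest test accuracy, while the kernel approximation + LinearSVC combination shows almost no gap.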
Code
DNN
# -*- coding: utf-8 -*-
from tensorflow import keras
from tensorflow.keras import Model, layers
from tensorflow.keras.utils import to_categorical
import numpy as np
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# Reshape the data
x_train = np.reshape(x_train, (len(x_train), 28 * 28)) / 255.0
x_test = np.reshape(x_test, (len(x_test), 28 * 28)) / 255.0
print(x_train.shape)
# categorical labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
print(y_train.shape)
# Define and build the model
# shape must be a tuple: each sample is a flat vector of 784 pixels
input_img = layers.Input(shape=(28 * 28,))
x = layers.Dense(28*28, activation='relu')(input_img)
x = layers.Dense(28*28, activation='sigmoid')(x)
x = layers.Dense(10, activation='softmax')(x)
model = Model(input_img, x)
model.summary()
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['acc'],
)
model.fit(
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=10,
)
loss, metric = model.evaluate(x=x_test, y=y_test, batch_size=128)
print("cross entropy is %.4f, accuracy is %.4f" % (loss, metric))
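The preprocessing in the script above (flattening each 28x28 image into a 784-dimensional vector, scaling pixels to [0, 1], and one-hot encoding the labels) can be tried without downloading MNIST. This is a minimal NumPy sketch on a small synthetic batch; the random images and the helper names `flatten_and_scale` and `one_hot` are assumptions for illustration and are not part of the original experiment.

```python
import numpy as np


def flatten_and_scale(images):
    """Flatten (n, 28, 28) uint8 images to (n, 784) floats in [0, 1]."""
    return np.reshape(images, (len(images), 28 * 28)) / 255.0


def one_hot(labels, num_classes=10):
    """One-hot encode integer class labels."""
    encoded = np.zeros((len(labels), num_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


# Synthetic stand-in for a batch of MNIST images and labels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 3])

x = flatten_and_scale(fake_images)
y = one_hot(fake_labels)
print(x.shape, y.shape)  # (4, 784) (4, 10)
```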
CNN + MaxPooling
# -*- coding: utf-8 -*-
from tensorflow import keras
from tensorflow.keras import Model, layers
from tensorflow.keras.utils import to_categorical
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# Add an explicit channel axis, (28, 28) -> (28, 28, 1), and normalize to [0, 1]
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0
# categorical labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
print(y_train.shape)
# Define and build the model
input_img = layers.Input(shape=(28, 28, 1))
x = layers.Conv2D(28*28, (3, 3))(input_img)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Flatten()(x)
x = layers.Dense(10, activation='softmax')(x)
model = Model(input_img, x)
model.summary()
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['acc'],
)
model.fit(
    x=x_train,
    y=y_train,
    batch_size=128,
    epochs=10,
)
loss, metric = model.evaluate(x=x_test, y=y_test, batch_size=128)
print("cross entropy is %.4f, accuracy is %.4f" % (loss, metric))
Kernel approximation + LinearSVC
# -*- coding: utf-8 -*-
from tensorflow import keras
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# Reshape the data
x_train = np.reshape(x_train, (len(x_train), 28 * 28)) / 255.0
x_test = np.reshape(x_test, (len(x_test), 28 * 28)) / 255.0
print(x_train.shape)
print(y_train.shape)
# Define and build the kernel mapping
x = np.concatenate((x_train, x_test))
print(x.shape)
# A kernelized SVC is slow to train on 60000 samples, so we split it into
# an approximate kernel feature map (sklearn.kernel_approximation.Nystroem)
# followed by a linear SVM (sklearn.svm.LinearSVC)
feature_map_nystroem = Nystroem(n_components=28*28)
feature_map_nystroem.fit(x)
x = feature_map_nystroem.transform(x)
x_train = x[:60000]
x_test = x[60000:]
print(x_train.shape)
print(x_test.shape)
cls = LinearSVC()
cls.fit(x_train, y_train)
y_pred = cls.predict(x_train)
ret = accuracy_score(y_train, y_pred)
print(ret)
y_pred = cls.predict(x_test)
ret = accuracy_score(y_test, y_pred)
print(ret)
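The script above fits the Nystroem feature map on the concatenation of the training and test images. A common alternative is to fit the feature map on the training split only, which a scikit-learn pipeline does automatically. The sketch below is an illustrative variant and not the configuration that produced the reported numbers; it assumes scikit-learn's small built-in digits dataset (8x8 images) as a stand-in for MNIST so that it runs quickly and offline, and it uses `n_components=100` rather than the 784 used above.

```python
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Assumption: the built-in 8x8 digits dataset stands in for MNIST.
digits = load_digits()
x = digits.data / 16.0  # pixel values range from 0 to 16
x_train, x_test, y_train, y_test = train_test_split(
    x, digits.target, test_size=0.25, random_state=0
)

# The pipeline fits the Nystroem map on the training data only,
# then trains the linear SVM on the mapped training features.
model = make_pipeline(
    Nystroem(n_components=100, random_state=0),
    LinearSVC(),
)
model.fit(x_train, y_train)

train_acc = accuracy_score(y_train, model.predict(x_train))
test_acc = accuracy_score(y_test, model.predict(x_test))
print(f"train acc: {train_acc:.4f}, test acc: {test_acc:.4f}")
```

Because the test images never reach `fit`, the test accuracy reflects data the feature map has not seen.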
SVC
# -*- coding: utf-8 -*-
from tensorflow import keras
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
# Load Dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print(x_train.shape)
# Reshape the data
x_train = np.reshape(x_train, (len(x_train), 28 * 28)) / 255.0
x_test = np.reshape(x_test, (len(x_test), 28 * 28)) / 255.0
print(x_train.shape)
print(y_train.shape)
cls = SVC()
cls.fit(x_train, y_train)
y_pred = cls.predict(x_train)
ret = accuracy_score(y_train, y_pred)
print(ret)
y_pred = cls.predict(x_test)
ret = accuracy_score(y_test, y_pred)
print(ret)
