The Theano MNIST dataset format

Reposted article (2014-08-02 10:26). Tag: DL

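First, a link to an expert's translation of the Theano documentation: http://www.cnblogs.com/xueliangliu/archive/2013/04/03/2997437.html

It includes a manual download link for mnist.pkl.gz (the tutorial code can also download the file automatically).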

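I don't work in image processing, so I have little feel for how images are stored. What should I do if I want to feed data from some other source into a Theano program?

The answer is to work out the storage format. The comment in the code (logistic_sgd.py, line 195) already states it clearly: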

# train_set, valid_set, test_set format: tuple(input, target)
# input is a numpy.ndarray of 2 dimensions (a matrix)
# in which each row corresponds to an example. target is a
# numpy.ndarray of 1 dimension (a vector) that has the same length as
# the number of rows in the input. It should give the target
# of the example with the same index in the input.

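In other words, train_X is a 2-dimensional matrix with one row per example (rows x features, not rows x 2), train_Y is a vector of length rows, and train_set is the tuple (train_X, train_Y).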

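So all we need to do is read our own file, build that matrix and that vector, and then wrap them as shared variables of the type the Theano program expects.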
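As a minimal sketch of that idea, the snippet below builds an (input, target) tuple from text data. The `label,f1,f2,...` line layout and the helper name `build_dataset` are assumptions made for illustration; they do not come from the Theano tutorial. The final wrapping step is only noted in a comment so that the example runs with NumPy alone.

```python
import io

import numpy as np


def build_dataset(text_file, n_features):
    """Parse lines of 'label,f1,f2,...' into an (input, target) tuple.

    Hypothetical helper: the file layout is an assumption for this sketch.
    """
    inputs, targets = [], []
    for line in text_file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(",")
        if len(fields) != n_features + 1:
            raise ValueError(
                "expected %d fields, got %d" % (n_features + 1, len(fields))
            )
        targets.append(int(fields[0]))
        inputs.append([float(v) for v in fields[1:]])
    data_x = np.asarray(inputs, dtype="float32")  # 2-D: (rows, n_features)
    data_y = np.asarray(targets, dtype="int64")   # 1-D: (rows,)
    return data_x, data_y


# A tiny in-memory "file" with two examples of three features each.
sample = io.StringIO("1,0.0,0.5,1.0\n0,0.2,0.1,0.9\n")
train_x, train_y = build_dataset(sample, n_features=3)
print(train_x.shape, train_y.shape)

# In a Theano program these arrays would then be wrapped with theano.shared(...)
# before being used in the model; that step is omitted here.
```

The row count of `train_x` and the length of `train_y` must match, since each label belongs to the example with the same index.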

===================Divider=========================

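I did not expect to pick up DL again later on, and things are very different now.

Here is a further note on the format of the MNIST dataset.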

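import gzip
import pickle  # Python 3; the original Python 2 code used cPickle

# Load the dataset. The file was pickled under Python 2,
# so Python 3 needs encoding='latin1' to read it.
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')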

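As it turns out, this returns a tuple of three sets: the training, validation and test sets.

Each set has two parts. Taking the training set as an example, their shapes are (50000, 784) and (50000,), that is, 50,000 samples and 50,000 labels.

Each sample has 784 dimensions = 28*28.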
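The layout described above can be checked without downloading anything. The sketch below writes a small synthetic stand-in with the same structure (three (input, target) tuples, 784 features per row) to a temporary file, loads it back, and reshapes one row into a 28x28 image. The file name `fake_mnist.pkl.gz`, the helper `make_fake_split` and the split sizes are assumptions for illustration only, and the data is random rather than real MNIST.

```python
import gzip
import os
import pickle
import tempfile

import numpy as np


def make_fake_split(n_rows, rng):
    """Return an (input, target) tuple laid out like one MNIST split."""
    x = rng.rand(n_rows, 784).astype("float32")  # fake pixel values in [0, 1)
    y = rng.randint(0, 10, size=n_rows)          # fake digit labels 0..9
    return x, y


rng = np.random.RandomState(0)
splits = tuple(make_fake_split(n, rng) for n in (5, 2, 2))

# Write the stand-in file, then read it back the same way as mnist.pkl.gz.
path = os.path.join(tempfile.mkdtemp(), "fake_mnist.pkl.gz")
with gzip.open(path, "wb") as f:
    pickle.dump(splits, f)

with gzip.open(path, "rb") as f:
    train_set, valid_set, test_set = pickle.load(f)

train_x, train_y = train_set
print(train_x.shape, train_y.shape)  # 2-D inputs, 1-D labels

# Each 784-dimensional row is a flattened 28x28 image.
first_image = train_x[0].reshape(28, 28)
print(first_image.shape)
```

With the real file, the same unpacking and reshape apply; only the row counts differ.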
