CS231n——圖像分類(KNN實現)


  圖像分類

  目標:已有固定的分類標簽集合,然后對於輸入的圖像,從分類標簽集合中找出一個分類標簽,最后把分類標簽分配給該輸入圖像。

  圖像分類流程

  • 輸入:輸入是包含N個圖像的集合,每個圖像的標簽是K種分類標簽中的一種。這個集合稱為訓練集。
  • 學習:這一步的任務是使用訓練集來學習每個類到底長什么樣。一般該步驟叫做訓練分類器或者學習一個模型。
  • 評價:讓分類器來預測它未曾見過的圖像的分類標簽,把分類器預測的標簽和圖像真正的分類標簽對比,並以此來評價分類器的質量。

  Nearest Neighbor分類器

  數據集:CIFAR-10這是一個非常流行的圖像分類數據集,包含了60000張32X32的小圖像。每張圖像都有10種分類標簽中的一種。這60000張圖像被分為包含50000張圖像的訓練集和包含10000張圖像的測試集。

  Nearest Neighbor圖像分類思想:拿測試圖片和訓練集中每一張圖片去比較,然后將它認為最相似的那個訓練集圖片的標簽賦給這張測試圖片。
  如何比較來那個張圖片?
  在本例中,就是比較32x32x3的像素塊。最簡單的方法就是逐個像素比較,最后將差異值全部加起來。換句話說,就是將兩張圖片先轉化為兩個向量I_1和I_2,然后計算他們的L1距離:
L1距離公式
這里的求和是針對所有的像素。下面是整個比較流程的圖例:


  計算向量間的距離有很多種方法,另一個常用的方法是L2距離,從幾何學的角度,可以理解為它在計算兩個向量間的歐式距離。L2距離的公式如下:
L2距離公式
  L1和L2比較:比較這兩個度量方式是挺有意思的。在面對兩個向量之間的差異時,L2比L1更加不能容忍這些差異。也就是說,相對於1個巨大的差異,L2距離更傾向於接受多個中等程度的差異。L1和L2都是在p-norm常用的特殊形式。

  k-Nearest Neighbor分類器(KNN)

  KNN圖像分類思想:與其只找最相近的那1個圖片的標簽,我們找最相似的k個圖片的標簽,然后讓他們針對測試圖片進行投票,最后把票數最高的標簽作為對測試圖片的預測。
  如何選擇k值?
  交叉驗證:假如有1000張圖片,我們將訓練集平均分成5份,其中4份用來訓練,1份用來驗證。然后我們循環着取其中4份來訓練,其中1份來驗證,最后取所有5次驗證結果的平均值作為算法驗證結果。


  這就是5份交叉驗證對k值調優的例子。針對每個k值,得到5個准確率結果,取其平均值,然后對不同k值的平均表現畫線連接。本例中,當k=10的時算法表現最好(對應圖中的准確率峰值)。如果我們將訓練集分成更多份數,直線一般會更加平滑(噪音更少)。

  k-Nearest Neighbor分類器的優劣

  優點:

  • 思路清晰,易於理解,實現簡單;
  • 算法的訓練不需要花時間,因為其訓練過程只是將訓練集數據存儲起來。

  缺點:測試要花費大量時間計算,因為每個測試圖像需要和所有存儲的訓練圖像進行比較。

  實際應用k-NN

  如果你希望將k-NN分類器用到實處(最好別用到圖像上,若是僅僅作為練手還可以接受),那么可以按照以下流程:

  1. 預處理你的數據:對你數據中的特征進行歸一化(normalize),讓其具有零平均值(zero mean)和單位方差(unit variance)。在后面的小節我們會討論這些細節。本小節不討論,是因為圖像中的像素都是同質的,不會表現出較大的差異分布,也就不需要標准化處理了。
  2. 如果數據是高維數據,考慮使用降維方法,比如PCA(wiki ref, CS229ref, blog ref)或隨機投影。
  3. 將數據隨機分入訓練集和驗證集。按照一般規律,70%-90% 數據作為訓練集。這個比例根據算法中有多少超參數,以及這些超參數對於算法的預期影響來決定。如果需要預測的超參數很多,那么就應該使用更大的驗證集來有效地估計它們。如果擔心驗證集數量不夠,那么就嘗試交叉驗證方法。如果計算資源足夠,使用交叉驗證總是更加安全的(份數越多,效果越好,也更耗費計算資源)。
  4. 在驗證集上調優,嘗試足夠多的k值,嘗試L1和L2兩種范數計算方式。
  5. 如果分類器跑得太慢,嘗試使用Approximate Nearest Neighbor庫(比如FLANN)來加速這個過程,其代價是降低一些准確率。
  6. 對最優的超參數做記錄。記錄最優參數后,是否應該讓使用最優參數的算法在完整的訓練集上運行並再次訓練呢?因為如果把驗證集重新放回到訓練集中(自然訓練集的數據量就又變大了),有可能最優參數又會有所變化。在實踐中,不要這樣做。千萬不要在最終的分類器中使用驗證集數據,這樣做會破壞對於最優參數的估計。直接使用測試集來測試用最優參數設置好的最優模型,得到測試集數據的分類准確率,並以此作為你的kNN分類器在該數據上的性能表現。

  課程作業

  KNN實現代碼:

 

  1 import numpy as np
  2 
  3 #http://blog.csdn.net/geekmanong/article/details/51524402
  4 #http://www.cnblogs.com/daihengchen/p/5754383.html
  5 #http://blog.csdn.net/han784851198/article/details/53331104
  6 class KNearestNeighbor(object):
  7   """ a kNN classifier with L2 distance """
  8 
  9   def __init__(self):
 10     pass
 11 
 12   def train(self, X, y):
 13     """
 14     Train the classifier. For k-nearest neighbors this is just 
 15     memorizing the training data.
 16     訓練分類器。對於KNN算法,此處只需要存儲訓練數據即可。
 17     Inputs:
 18     - X: A numpy array of shape (num_train, D) containing the training data
 19       consisting of num_train samples each of dimension D.
 20     - y: A numpy array of shape (N,) containing the training labels, where
 21          y[i] is the label for X[i].
 22     """
 23     self.X_train = X
 24     self.y_train = y
 25     
 26   def predict(self, X, k=1, num_loops=0):
 27     """
 28     Predict labels for test data using this classifier.
 29     基於該分類器,預測測試數據的標簽分類。
 30     Inputs:
 31     - X: A numpy array of shape (num_test, D) containing test data consisting
 32          of num_test samples each of dimension D.測試數據集
 33     - k: The number of nearest neighbors that vote for the predicted labels.
 34     - num_loops: Determines which implementation to use to compute distances
 35       between training points and testing points.選擇距離算法的實現方法
 36 
 37     Returns:
 38     - y: A numpy array of shape (num_test,) containing predicted labels for the
 39       test data, where y[i] is the predicted label for the test point X[i].  
 40     """
 41     if num_loops == 0:
 42       dists = self.compute_distances_no_loops(X)
 43     elif num_loops == 1:
 44       dists = self.compute_distances_one_loop(X)
 45     elif num_loops == 2:
 46       dists = self.compute_distances_two_loops(X)
 47     else:
 48       raise ValueError('Invalid value %d for num_loops' % num_loops)
 49 
 50     return self.predict_labels(dists, k=k)
 51 
 52   def compute_distances_two_loops(self, X):
 53     """
 54     Compute the distance between each test point in X and each training point
 55     in self.X_train using a nested loop over both the training data and the 
 56     test data.    兩層循環計算L2距離
 57 
 58     Inputs:
 59     - X: A numpy array of shape (num_test, D) containing test data.
 60 
 61     Returns:
 62     - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
 63       is the Euclidean distance between the ith test point and the jth training
 64       point.
 65     """
 66     num_test = X.shape[0]
 67     num_train = self.X_train.shape[0]
 68     dists = np.zeros((num_test, num_train))
 69     for i in range(num_test):
 70       for j in range(num_train):
 71         #####################################################################
 72         # TODO:                                                             #
 73         # Compute the l2 distance between the ith test point and the jth    #
 74         # training point, and store the result in dists[i, j]. You should   #
 75         # not use a loop over dimension.                                    #
 76         #####################################################################
 77         test_row = X[i, :]
 78         train_row = self.X_train[j, :]
 79         dists[i, j] = np.sqrt(np.sum((test_row - train_row)**2))
 80     
 81     return dists
 82         #####################################################################
 83         #                       END OF YOUR CODE                            #
 84         #####################################################################
 85     return dists
 86 
 87   def compute_distances_one_loop(self, X):
 88     """
 89     Compute the distance between each test point in X and each training point
 90     in self.X_train using a single loop over the test data.   一層循環計算L2距離
 91 
 92     Input / Output: Same as compute_distances_two_loops
 93     """
 94     num_test = X.shape[0]
 95     num_train = self.X_train.shape[0]
 96     dists = np.zeros((num_test, num_train))
 97     for i in range(num_test):
 98       #######################################################################
 99       # TODO:                                                               #
100       # Compute the l2 distance between the ith test point and all training #
101       # points, and store the result in dists[i, :].                        #
102       #######################################################################
103       test_row = X[i, :]
104       dists[i,:] = np.sqrt(np.sum(test_row - self.X_train)**2) #numpy廣播機制
105       #######################################################################
106       #                         END OF YOUR CODE                            #
107       #######################################################################
108     return dists
109 
110   def compute_distances_no_loops(self, X):
111     """
112     Compute the distance between each test point in X and each training point
113     in self.X_train using no explicit loops.    無循環計算L2距離
114 
115     Input / Output: Same as compute_distances_two_loops
116     """
117     num_test = X.shape[0]
118     num_train = self.X_train.shape[0]
119     dists = np.zeros((num_test, num_train)) 
120     #########################################################################
121     # TODO:                                                                 #
122     # Compute the l2 distance between all test points and all training      #
123     # points without using any explicit loops, and store the result in      #
124     # dists.                                                                #
125     #                                                                       #
126     # You should implement this function using only basic array operations; #
127     # in particular you should not use functions from scipy.                #
128     #                                                                       #
129     # HINT: Try to formulate the l2 distance using matrix multiplication    #
130     #       and two broadcast sums.                                         #
131     #########################################################################
132     X_sq = np.square(X).sum(axis=1)
133     X_train_sq = np.square(self.X_train).sum(axis=1)
134     dists = np.sqrt(-2*np.dot(X, self.X_train.T) + X_train_sq + np.matrix(X_sq).T)
135     dists = np.array(dists)
136     #########################################################################
137     #                         END OF YOUR CODE                              #
138     #########################################################################
139     return dists
140 
141   def predict_labels(self, dists, k=1):
142     """
143     Given a matrix of distances between test points and training points,
144     predict a label for each test point.
145 
146     Inputs:
147     - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
148       gives the distance betwen the ith test point and the jth training point.
149 
150     Returns:
151     - y: A numpy array of shape (num_test,) containing predicted labels for the
152       test data, where y[i] is the predicted label for the test point X[i].  
153     """
154     num_test = dists.shape[0]
155     y_pred = np.zeros(num_test)
156     for i in range(num_test):
157       # A list of length k storing the labels of the k nearest neighbors to
158       # the ith test point.
159       closest_y = []
160       #########################################################################
161       # TODO:                                                                 #
162       # Use the distance matrix to find the k nearest neighbors of the ith    #
163       # testing point, and use self.y_train to find the labels of these       #
164       # neighbors. Store these labels in closest_y.                           #
165       # Hint: Look up the function numpy.argsort.                             #
166       # numpy.argsort.函數返回的是數組值從小到大的索引值                                           #
167       #########################################################################
168       closest_y = self.y_train[np.argsort(dists[i,:])[:k]] 
169       #########################################################################
170       # TODO:                                                                 #
171       # Now that you have found the labels of the k nearest neighbors, you    #
172       # need to find the most common label in the list closest_y of labels.   #
173       # Store this label in y_pred[i]. Break ties by choosing the smaller     #
174       # label.                                                                #
175       #########################################################################
176       #np.bincount:統計每一個元素出現的次數
177       y_pred[i] = np.argmax(np.bincount(closest_y))
178       #########################################################################
179       #                           END OF YOUR CODE                            # 
180       #########################################################################
181 
182     return y_pred
183 
184 k_nearest_neighbor.py
k_nearest_neighbor.py

 

  測試和交叉驗證代碼:

  1 #coding:utf-8
  2 '''
  3 Created on 2017年3月21日
  4 
  5 @author: 206
  6 '''
  7 import random
  8 import numpy as np
  9 from assignment1.data_utils import load_CIFAR10
 10 from  assignment1.classifiers.k_nearest_neighbor import KNearestNeighbor
 11 import matplotlib.pyplot as plt
 12 
 13 # This is a bit of magic to make matplotlib figures appear inline in the notebook
 14 # rather than in a new window.
 15 plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
 16 plt.rcParams['image.interpolation'] = 'nearest'
 17 plt.rcParams['image.cmap'] = 'gray'
 18 
 19 X_train, y_train, X_test, y_test = load_CIFAR10('../datasets')
 20 
 21 # As a sanity check, we print out the size of the training and test data.
 22 print('Training data shape: ', X_train.shape)
 23 print('Training labels shape: ', y_train.shape)
 24 print('Test data shape: ', X_test.shape)
 25 print('Test labels shape: ', y_test.shape)
 26 
 27 # 從數據集中展示一部分數據
 28 # 每個類別展示若干張對應圖片
 29 classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
 30 num_classes = len(classes)
 31 samples_per_class = 7
 32 for y, cls in enumerate(classes):
 33     idxs = np.flatnonzero(y_train == y)
 34     idxs = np.random.choice(idxs, samples_per_class, replace=False)
 35     for i, idx in enumerate(idxs):
 36         plt_idx = i * num_classes + y + 1
 37         plt.subplot(samples_per_class, num_classes, plt_idx)
 38         plt.imshow(X_train[idx].astype('uint8'))
 39         plt.axis('off')
 40         if i == 0:
 41             plt.title(cls)
 42 plt.show()
 43 
 44 # 截取部分樣本數據,以提高本作業的執行效率
 45 num_training = 5000
 46 mask = range(num_training)
 47 X_train = X_train[mask]
 48 y_train = y_train[mask]
 49 
 50 num_test = 500
 51 mask = range(num_test)
 52 X_test = X_test[mask]
 53 y_test = y_test[mask]
 54 
 55 # reshape訓練和測試數據,轉換為行的形式
 56 X_train = np.reshape(X_train, (X_train.shape[0], -1))
 57 X_test = np.reshape(X_test, (X_test.shape[0], -1))
 58 
 59 print(X_train.shape)
 60 print(X_test.shape)
 61 
 62 classifier = KNearestNeighbor()
 63 classifier.train(X_train, y_train)
 64 
 65 dists = classifier.compute_distances_two_loops(X_test)
 66 print(dists.shape)
 67 
 68 plt.imshow(dists, interpolation='none')
 69 plt.show()
 70 
 71 # Now implement the function predict_labels and run the code below:
 72 # k=1時
 73 y_test_pred = classifier.predict_labels(dists, k=1)
 74 
 75 # Compute and print the fraction of correctly predicted examples
 76 num_correct = np.sum(y_test_pred == y_test)
 77 accuracy = float(num_correct) / num_test
 78 print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
 79 
 80 # k=5時
 81 y_test_pred = classifier.predict_labels(dists, k=5)
 82 num_correct = np.sum(y_test_pred == y_test)
 83 accuracy = float(num_correct) / num_test
 84 print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
 85 
 86 ####測試三種距離計算法的效率
 87 
 88 dists_one = classifier.compute_distances_one_loop(X_test)
 89 
 90 difference = np.linalg.norm(dists - dists_one, ord='fro')
 91 print('Difference was: %f' % (difference, ))
 92 if difference < 0.001:
 93   print('Good! The distance matrices are the same')
 94 else:
 95   print('Uh-oh! The distance matrices are different')
 96 
 97 dists_two = classifier.compute_distances_no_loops(X_test)
 98 difference = np.linalg.norm(dists - dists_two, ord='fro')
 99 print('Difference was: %f' % (difference, ))
100 if difference < 0.001:
101   print('Good! The distance matrices are the same')
102 else:
103   print('Uh-oh! The distance matrices are different')
104 
105 
106 def time_function(f, *args):
107   """
108   Call a function f with args and return the time (in seconds) that it took to execute.
109   """
110   import time
111   tic = time.time()
112   f(*args)
113   toc = time.time()
114   return toc - tic
115 
116 two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
117 print('Two loop version took %f seconds' % two_loop_time)
118 
119 one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)
120 print('One loop version took %f seconds' % one_loop_time)
121 
122 no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)
123 print('No loop version took %f seconds' % no_loop_time)
124 
125 # 交叉驗證
126 num_folds = 5
127 k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]
128 
129 X_train_folds = []
130 y_train_folds = []
131 ################################################################################
132 # TODO:                                                                        #
133 # Split up the training data into folds. After splitting, X_train_folds and    #
134 # y_train_folds should each be lists of length num_folds, where                #
135 # y_train_folds[i] is the label vector for the points in X_train_folds[i].     #
136 # Hint: Look up the numpy array_split function.                                #
137 ################################################################################
138 #數據划分
139 X_train_folds = np.array_split(X_train, num_folds);
140 y_train_folds = np.array_split(y_train, num_folds)
141 ################################################################################
142 #                                 END OF YOUR CODE                             #
143 ################################################################################
144 
145 # A dictionary holding the accuracies for different values of k that we find
146 # when running cross-validation. After running cross-validation,
147 # k_to_accuracies[k] should be a list of length num_folds giving the different
148 # accuracy values that we found when using that value of k.
149 
150 k_to_accuracies = {}
151 
152 ################################################################################
153 # TODO:                                                                        #
154 # Perform k-fold cross validation to find the best value of k. For each        #
155 # possible value of k, run the k-nearest-neighbor algorithm num_folds times,   #
156 # where in each case you use all but one of the folds as training data and the #
157 # last fold as a validation set. Store the accuracies for all fold and all     #
158 # values of k in the k_to_accuracies dictionary.                               #
159 ################################################################################
160 for k in k_choices:
161     k_to_accuracies[k] = []
162 
163 for k in k_choices:#find the best k-value
164     for i in range(num_folds):
165         X_train_cv = np.vstack(X_train_folds[:i]+X_train_folds[i+1:])
166         X_test_cv = X_train_folds[i]
167 
168         y_train_cv = np.hstack(y_train_folds[:i]+y_train_folds[i+1:])  #size:4000
169         y_test_cv = y_train_folds[i]
170 
171         classifier.train(X_train_cv, y_train_cv)
172         dists_cv = classifier.compute_distances_no_loops(X_test_cv)
173 
174         y_test_pred = classifier.predict_labels(dists_cv, k)
175         num_correct = np.sum(y_test_pred == y_test_cv)
176         accuracy = float(num_correct) / y_test_cv.shape[0]
177 
178         k_to_accuracies[k].append(accuracy)
179 ################################################################################
180 #                                 END OF YOUR CODE                             #
181 ################################################################################
182 
183 # Print out the computed accuracies
184 for k in sorted(k_to_accuracies):
185     for accuracy in k_to_accuracies[k]:
186         print('k = %d, accuracy = %f' % (k, accuracy))
187 
188 
189 
190 # plot the raw observations
191 for k in k_choices:
192   accuracies = k_to_accuracies[k]
193   plt.scatter([k] * len(accuracies), accuracies)
194 
195 # plot the trend line with error bars that correspond to standard deviation
196 accuracies_mean = np.array([np.mean(v) for k,v in sorted(k_to_accuracies.items())])
197 accuracies_std = np.array([np.std(v) for k,v in sorted(k_to_accuracies.items())])
198 plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)
199 plt.title('Cross-validation on k')
200 plt.xlabel('k')
201 plt.ylabel('Cross-validation accuracy')
202 plt.show()
testKNN.py

 

運行結果:

 1 Training data shape:  (50000, 32, 32, 3)
 2 Training labels shape:  (50000,)
 3 Test data shape:  (10000, 32, 32, 3)
 4 Test labels shape:  (10000,)
 5 (5000, 3072)
 6 (500, 3072)
 7 (500, 5000)
 8 Got 137 / 500 correct => accuracy: 0.274000
 9 Got 139 / 500 correct => accuracy: 0.278000
10 Difference was: 794038655446.820190
11 Uh-oh! The distance matrices are different
12 Difference was: 0.000000
13 Good! The distance matrices are the same
14 Two loop version took 52.208034 seconds
15 One loop version took 42.104724 seconds
16 No loop version took 0.371247 seconds
17 k = 1, accuracy = 0.263000
18 k = 1, accuracy = 0.257000
19 k = 1, accuracy = 0.264000
20 k = 1, accuracy = 0.278000
21 k = 1, accuracy = 0.266000
22 k = 3, accuracy = 0.239000
23 k = 3, accuracy = 0.249000
24 k = 3, accuracy = 0.240000
25 k = 3, accuracy = 0.266000
26 k = 3, accuracy = 0.254000
27 k = 5, accuracy = 0.248000
28 k = 5, accuracy = 0.266000
29 k = 5, accuracy = 0.280000
30 k = 5, accuracy = 0.292000
31 k = 5, accuracy = 0.280000
32 k = 8, accuracy = 0.262000
33 k = 8, accuracy = 0.282000
34 k = 8, accuracy = 0.273000
35 k = 8, accuracy = 0.290000
36 k = 8, accuracy = 0.273000
37 k = 10, accuracy = 0.265000
38 k = 10, accuracy = 0.296000
39 k = 10, accuracy = 0.276000
40 k = 10, accuracy = 0.284000
41 k = 10, accuracy = 0.280000
42 k = 12, accuracy = 0.260000
43 k = 12, accuracy = 0.295000
44 k = 12, accuracy = 0.279000
45 k = 12, accuracy = 0.283000
46 k = 12, accuracy = 0.280000
47 k = 15, accuracy = 0.252000
48 k = 15, accuracy = 0.289000
49 k = 15, accuracy = 0.278000
50 k = 15, accuracy = 0.282000
51 k = 15, accuracy = 0.274000
52 k = 20, accuracy = 0.270000
53 k = 20, accuracy = 0.279000
54 k = 20, accuracy = 0.279000
55 k = 20, accuracy = 0.282000
56 k = 20, accuracy = 0.285000
57 k = 50, accuracy = 0.271000
58 k = 50, accuracy = 0.288000
59 k = 50, accuracy = 0.278000
60 k = 50, accuracy = 0.269000
61 k = 50, accuracy = 0.266000
62 k = 100, accuracy = 0.256000
63 k = 100, accuracy = 0.270000
64 k = 100, accuracy = 0.263000
65 k = 100, accuracy = 0.256000
66 k = 100, accuracy = 0.263000
result

  交叉驗證結果:

 

  完整代碼見這里

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM