py4CV Example 1: Dogs vs. Cats and the kNN Algorithm

This article is a repost. 2018-03-18 18:42

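1. What is Dogs vs. Cats?

The data set comes from Kaggle (a platform where developers and data scientists can host machine learning competitions, host data sets, and write and share code). The original data set contains 12,500 cats and 12,500 dogs and is split into a training part and a test part.

2. What is the kNN algorithm?

The basic idea of k-Nearest Neighbor (kNN): if most of the k samples most similar to a given sample (that is, its k nearest neighbors in feature space) belong to one class, the sample is assigned to that class as well.

In plainer words: "We already have a labeled data set. When a new, unlabeled sample arrives, each of its features is compared with the corresponding features of the samples in the data set, and the algorithm takes the class labels of the most similar (nearest) samples."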

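In the figure above the objects fall into two groups, blue squares and red triangles. Each group can also be called a class. Think of all these objects as houses in a town, where every house belongs either to the blue family or to the red family; the town itself is the so-called feature space. (You can think of a feature space as the space onto which all the points are projected. In a 2D coordinate space, for example, every data point has two features, an x coordinate and a y coordinate, so the data can be drawn in 2D. If every data point has 3 features, a 3D space is needed. N features need an N-dimensional space, and that N-dimensional space is the feature space. The figure above can be regarded as a 2D space with two features.)

Now a newcomer arrives in town, and his new house is drawn as a green disc. We have to assign him to the blue family or the red family based on where his house stands. This process is called classification. How should we go about it? Since we are studying kNN, let us apply that algorithm.

One approach is to look at which family his nearest neighbor belongs to. From the figure, the nearest neighbor is a red triangle, so he is assigned to the red family. This method is called simple nearest neighbor, because the classification depends only on the single closest neighbor. There is a problem, though. A red triangle may be the closest, but what if there are also many blue squares around him? In that case the blue squares should have more local influence than the red triangle, so checking only the single nearest neighbor is not enough. Instead we check the k nearest neighbors, and the newcomer joins whichever class holds the majority among them. With k = 3, the three nearest neighbors in the figure are two red and one blue, so he still belongs to the red family. But with k = 7 he has 5 blue and 2 red neighbors, and he is now assigned to the blue family. The choice of k strongly affects the result. More interesting still, what about k = 4? Two red and two blue: a tie. For a two-class problem like this one it is therefore best to choose an odd k. Classifying by the k nearest neighbors in this way is called kNN.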
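The voting rule can be sketched in a few lines of NumPy. This is only an illustration: the coordinates, the labels and the helper name `knn_vote` are made up for the example and are not read off the figure.

```python
import numpy as np

def knn_vote(train_points, train_labels, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Squared Euclidean distance from the query to every training point
    dists = np.sum((train_points - query) ** 2, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Count the labels of those neighbours and return the most frequent one
    # (on a tie, argmax picks the smaller label)
    labels, counts = np.unique(train_labels[nearest], return_counts=True)
    return int(labels[np.argmax(counts)])

# Hypothetical town: label 0 = red family, label 1 = blue family
points = np.array([[5.0, 5.0], [6.0, 5.0],               # two red houses close by
                   [7.0, 7.0], [3.0, 7.0], [7.0, 3.0],   # blue houses farther away
                   [3.0, 3.0], [8.0, 5.0]])
labels = np.array([0, 0, 1, 1, 1, 1, 1])
newcomer = np.array([5.4, 5.0])

for k in (1, 3, 7):
    print(k, knn_vote(points, labels, newcomer, k))
```

With this made-up layout, k = 1 and k = 3 give the red family (0), while k = 7 gives the blue family (1), mirroring the behaviour described for the figure.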

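In kNN we consider the k nearest neighbors, but we give all of them equal weight. Is that fair? Take k = 4, which we called a tie. The two red triangles are closer to the newcomer than the two blue squares, so he should rather be assigned to the red family. How can this be expressed mathematically? We give each house a weight that depends on its distance from the new house: nearby houses get a higher weight, distant ones a lower weight. We then add up the weights of each family and assign the new house to the family with the larger total. This is called modified kNN.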
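A minimal sketch of such a weighted vote follows. The text only says that closer houses should weigh more; the concrete choice of weight = 1 / distance, the helper name `weighted_knn_vote` and the sample coordinates are assumptions made for illustration.

```python
import numpy as np

def weighted_knn_vote(train_points, train_labels, query, k, eps=1e-9):
    """Vote among the k nearest neighbours, weighting each by 1 / distance."""
    dists = np.sqrt(np.sum((train_points - query) ** 2, axis=1))
    nearest = np.argsort(dists)[:k]
    # Assumed weighting scheme: closer neighbours weigh more (eps avoids division by zero)
    weights = 1.0 / (dists[nearest] + eps)
    totals = {}
    for label, weight in zip(train_labels[nearest], weights):
        totals[int(label)] = totals.get(int(label), 0.0) + weight
    # The class with the largest summed weight wins
    return max(totals, key=totals.get), totals

# Hypothetical k = 4 tie: two red (0) houses at distance 1, two blue (1) houses at distance 3
points = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 3.0], [9.0, 9.0]])
labels = np.array([0, 0, 1, 1, 1])
winner, totals = weighted_knn_vote(points, labels, np.array([0.0, 0.0]), k=4)
print(winner, totals)
```

Here the red family collects a total weight of about 2.0 against about 0.67 for the blue family, so the 2-to-2 tie is resolved in favour of red.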

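So what are the important points here?

1. We need the information about every house in the town, because we have to measure the distance from the newcomer to all existing houses and find the nearest among them. With many houses this takes a lot of memory and more computation time.

2. Training and preparation take almost no time.

So the computational cost of kNN at classification time is clearly high, but the idea behind its implementation is simple and direct.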

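Algorithm flow

1. Prepare the data and preprocess it.

2. Choose suitable data structures to store the training data and the test tuples.

3. Set the parameters, such as k.

4. Maintain a priority queue of size k, ordered by distance from largest to smallest, to hold the nearest training tuples. Randomly pick k tuples from the training set as the initial nearest neighbors, compute the distance from the test tuple to each of them, and store the training tuple indices and distances in the priority queue.

5. Traverse the training set. For each training tuple, compute its distance L to the test tuple and compare it with the largest distance Lmax in the priority queue. If L >= Lmax, discard the tuple and move on to the next one. If L < Lmax, remove the tuple with the largest distance from the priority queue and insert the current training tuple.

6. When the traversal is finished, take the majority class of the k tuples in the priority queue as the class of the test tuple.

7. After the whole set of test tuples has been classified, compute the error rate. Repeat with different values of k and keep the k with the smallest error rate. ^[1]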
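The priority-queue part of this flow can be sketched with the standard-library `heapq` module. The sketch deviates from the description in one labeled way: instead of seeding the queue with k random tuples, it simply fills the queue with the first k tuples it meets. The function name `knn_classify` and the sample data are hypothetical.

```python
import heapq
from collections import Counter

def knn_classify(train, query, k):
    """train: list of (features, label) pairs; query: a feature tuple."""
    heap = []  # max-heap on distance, emulated by storing negated distances
    for index, (features, label) in enumerate(train):
        dist = sum((a - b) ** 2 for a, b in zip(features, query))
        if len(heap) < k:
            # Queue not full yet: keep the tuple unconditionally
            heapq.heappush(heap, (-dist, index, label))
        elif dist < -heap[0][0]:
            # L < Lmax: drop the farthest neighbour and keep the current tuple
            heapq.heapreplace(heap, (-dist, index, label))
        # Otherwise L >= Lmax and the tuple is discarded
    # Majority class among the k tuples left in the queue
    votes = Counter(label for _, _, label in heap)
    return votes.most_common(1)[0][0]

# Hypothetical training tuples: two well-separated clusters
train = [((1, 1), 'A'), ((1, 2), 'A'), ((2, 1), 'A'),
         ((8, 8), 'B'), ((9, 8), 'B'), ((8, 9), 'B')]
print(knn_classify(train, (2, 2), 3))
print(knn_classify(train, (7, 7), 3))
```

Keeping only k entries in the heap means memory stays at O(k) per query, although every training tuple still has to be visited once.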

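Advantages

1. Simple, easy to understand and easy to implement; there are no parameters to estimate and no training is needed.

2. Suitable for classifying rare events.

3. Particularly suitable for multi-modal problems (objects that carry several class labels), where kNN performs better than SVM. ^[1]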

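Disadvantages

A main weakness of the algorithm in classification shows up when the samples are imbalanced, for instance when one class is very large and the others are very small: the K neighbors of a new sample may then be dominated by samples of the large class. The algorithm only looks at the "nearest" neighbors. If one class has a great many samples, those samples are either not close to the target sample or very close to it; either way, sample count alone should not be what decides the result.

Another weakness is the heavy computation: for every sample to be classified, the distance to all known samples has to be computed before its K nearest neighbors can be found.

The result is hard to interpret; unlike a decision tree, kNN cannot produce rules.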
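The imbalance problem is easy to reproduce on toy data. In the sketch below, all coordinates are invented: class 0 has only 3 samples, all right next to the query, while class 1 has 20 samples that are all much farther away.

```python
import numpy as np

def majority_vote(points, labels, query, k):
    """Plain (unweighted) kNN vote over integer labels."""
    dists = np.sum((points - query) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]
    return int(np.argmax(np.bincount(labels[nearest])))

minority = np.array([[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0]])      # class 0, 3 samples
majority = np.array([[2.0 + 0.1 * i, 2.0] for i in range(20)])  # class 1, 20 samples
points = np.vstack((minority, majority))
labels = np.array([0] * 3 + [1] * 20)
query = np.array([0.0, 0.0])

print(majority_vote(points, labels, query, k=3))
print(majority_vote(points, labels, query, k=9))
```

With k = 3 the query is assigned to class 0, but with k = 9 the six extra neighbours can only come from the large class, so the vote flips to class 1 even though every class 1 sample is far from the query.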

3. Python environment:

OpenCV: image processing

NumPy: numerical computing

SciPy: a collection of toolboxes for common problems in scientific computing

Matplotlib: plotting

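4. A simple example (e1.py)

Here is a simple example with two classes, as above. We label the red family Class-0 and the blue family Class-1, and create 25 training samples, each labeled Class-0 or Class-1.

NumPy's random number generator helps with this.

Matplotlib is then used to plot the points: the red family as red triangles and the blue family as blue squares.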

 
    # -*- coding: utf-8 -*-
    """
    Created on Tue Jan 28 18:00:18 2014
    @author: duan
    """
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

    # Generate the training data: 25 random (x, y) points with coordinates in [0, 100)
    trainData = np.random.randint(0, 100, (25, 2)).astype(np.float32)
    # Generate the training labels: 0 (red) or 1 (blue)
    responses = np.random.randint(0, 2, (25, 1)).astype(np.float32)

    # Plot the red samples
    red = trainData[responses.ravel() == 0]
    plt.scatter(red[:, 0], red[:, 1], s=80, c='r', marker='^')
    # Plot the blue samples
    blue = trainData[responses.ravel() == 1]
    plt.scatter(blue[:, 0], blue[:, 1], s=80, c='b', marker='s')

    # Generate the sample to classify
    newcomer = np.random.randint(0, 100, (1, 2)).astype(np.float32)
    plt.scatter(newcomer[:, 0], newcomer[:, 1], s=80, c='g', marker='o')

    # Train on the samples and classify the newcomer with k = 5
    knn = cv2.ml.KNearest_create()
    knn.train(trainData, cv2.ml.ROW_SAMPLE, responses)
    ret, results, neighbours, dist = knn.findNearest(newcomer, 5)

    # Print the results
    print("result: ", results)
    print("neighbours: ", neighbours)
    print("distance: ", dist)
    plt.show()

Output

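This is of course a random result, and every run will differ somewhat. In this run:

result: [[0.]]

neighbours: [[1. 0. 0. 0. 0.]]

distance: [[ 73. 193. 218. 226. 488.]]

Red triangles make up the majority of the green circle's neighbors, so it is classified as a red triangle.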
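The printed distances are all whole numbers that decompose into sums of two squares (for example 73 = 8² + 3² and 193 = 12² + 7²), which is consistent with squared Euclidean distances between integer coordinates. The NumPy sketch below reproduces that kind of output without OpenCV; the points are made up so that the first three squared distances match the run above, and `nearest_squared` is a hypothetical helper, not an OpenCV function.

```python
import numpy as np

def nearest_squared(train_data, newcomer, k):
    """Return (indices, squared distances) of the k rows of train_data nearest to newcomer."""
    sq = np.sum((train_data - newcomer) ** 2, axis=1)
    order = np.argsort(sq)[:k]
    return order, sq[order]

# Made-up points; the newcomer sits at the origin
train_data = np.array([[8, 3], [12, 7], [13, 7], [40, 40]], dtype=np.float32)
newcomer = np.array([[0, 0]], dtype=np.float32)

idx, sq = nearest_squared(train_data, newcomer, 3)
print(idx, sq)
```

The three smallest squared distances come out as 73, 193 and 218. Taking the square root would give the ordinary Euclidean distance; the ranking of the neighbours is the same either way.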

5. Handwritten digit OCR with kNN (e2.py)

The OpenCV distribution contains an image (/samples/python2/data/digits.png) holding 5000 handwritten digits (each digit repeated 500 times).
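Before touching the real image, the cell splitting can be tried on a synthetic array. The sketch assumes the layout used by the example code (cells of 20x20 pixels arranged as 50 rows of 100 cells) and additionally assumes 5 consecutive rows per digit; the `gray` array is a stand-in whose pixels simply store the digit value, not the real image.

```python
import numpy as np

# Stand-in for the grayscale digits image: 50 rows x 100 columns of 20x20 cells
gray = np.zeros((50 * 20, 100 * 20), dtype=np.uint8)
for r in range(50):
    # Assumed layout: every band of 5 cell rows holds one digit
    gray[r * 20:(r + 1) * 20, :] = r // 5

# Cut into 50 horizontal strips, then cut every strip into 100 cells
cells = [np.hsplit(row, 100) for row in np.vsplit(gray, 50)]
x = np.array(cells)                      # shape (50, 100, 20, 20)
samples = x.reshape(-1, 400)             # one flattened 20x20 cell per row
labels = np.repeat(np.arange(10), 500)   # 500 consecutive cells per digit

print(x.shape, samples.shape)
print(bool((samples[:, 0] == labels).all()))
```

The last line checks that row i of `samples` really belongs to digit `labels[i]`, which is the property the label arrays in the example rely on.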

The code:

 
    import numpy as np
    import cv2
    from matplotlib import pyplot as plt

    img = cv2.imread('E:/template/digits.png')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Each digit is a 20x20 tile; the first step is to split the image into 5000 separate digits
    cells = [np.hsplit(row, 100) for row in np.vsplit(gray, 50)]
    # Make it into a NumPy array. Its shape will be (50, 100, 20, 20)
    x = np.array(cells)

    # Now we prepare train_data and test_data.
    train = x[:, :50].reshape(-1, 400).astype(np.float32)    # Size = (2500, 400)
    test = x[:, 50:100].reshape(-1, 400).astype(np.float32)  # Size = (2500, 400)

    # Create labels for train and test data
    k = np.arange(10)
    train_labels = np.repeat(k, 250)[:, np.newaxis]
    # Reuse the training labels for the test set (both halves share the same layout)
    test_labels = train_labels.copy()

    # Initiate kNN, train the data, then test it with test data for k=5
    knn = cv2.ml.KNearest_create()
    knn.train(train, cv2.ml.ROW_SAMPLE, train_labels)
    ret, result, neighbours, dist = knn.findNearest(test, k=5)

    # Now we check the accuracy of classification
    # For that, compare the result with test_labels and check which are wrong
    matches = result == test_labels
    correct = np.count_nonzero(matches)
    accuracy = correct * 100.0 / result.size
    print(accuracy)

    # Save the training data to a compressed file
    np.savez_compressed('knn_data.npz', train=train, train_labels=train_labels)

    # Load the saved data again
    with np.load('knn_data.npz') as data:
        print(data.files)
        train = data['train']
        train_labels = data['train_labels']

The final output is

91.76

['train', 'train_labels']
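One detail of the accuracy computation is worth isolating: the predictions come back as an (N, 1) array, so the labels must have the same shape, which is why the example adds `np.newaxis`. The sketch below shows what goes wrong otherwise; the helper name `accuracy_percent` and the sample values are made up.

```python
import numpy as np

def accuracy_percent(result, labels):
    """Percentage of predictions equal to labels; both are flattened first."""
    result = np.asarray(result).ravel()
    labels = np.asarray(labels).ravel()
    if result.shape != labels.shape:
        raise ValueError("result and labels must have the same number of entries")
    return 100.0 * np.count_nonzero(result == labels) / result.size

# Made-up predictions in an (N, 1) float32 layout
result = np.array([[0.], [1.], [2.], [2.]], dtype=np.float32)
labels = np.array([0, 1, 2, 3])           # flat labels, shape (N,)

print(accuracy_percent(result, labels))   # 3 of 4 correct
print((result == labels).shape)           # broadcasting yields an N x N comparison
```

Comparing the (4, 1) array with the flat (4,) array directly produces a (4, 4) boolean matrix, and counting its nonzero entries would give a meaningless score. Flattening both sides, or giving the labels an explicit second axis as the example does, avoids this.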
           

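6. kNN handwritten digit OCR with your own training set (e3.py)

Note that this example opens camera index 1, so it needs two cameras as written. With a single camera, change

    cap = cv2.VideoCapture(1)

to

    cap = cv2.VideoCapture(0)

Full code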

 
    import cv2
    import numpy as np

    # digits.png holds 5000 samples, 500 for each of the digits 0-9. After reading the image
    # we arrange the data so that train and trainLabel line up row by row (image data and labels).
    def initKnn():
        knn = cv2.ml.KNearest_create()
        img = cv2.imread('E:/template/digits.png')
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        cells = [np.hsplit(row, 100) for row in np.vsplit(gray, 50)]
        train = np.array(cells).reshape(-1, 400).astype(np.float32)
        trainLabel = np.repeat(np.arange(10), 500)
        return knn, train, trainLabel

    # updateKnn retrains the model, optionally after appending your own training data.
    def updateKnn(knn, train, trainLabel, newData=None, newDataLabel=None):
        # "is not None" avoids an ambiguous element-wise comparison of a NumPy array with None
        if newData is not None and newDataLabel is not None:
            print(train.shape, newData.shape)
            newData = newData.reshape(-1, 400).astype(np.float32)
            train = np.vstack((train, newData))
            trainLabel = np.hstack((trainLabel, newDataLabel))
        knn.train(train, cv2.ml.ROW_SAMPLE, trainLabel)
        return knn, train, trainLabel

    # findRoi locates each digit, described by the top-left corner of its minimal bounding
    # rectangle plus the rectangle's width and height (x, y, w, h).
    # The Sobel operator is also used here. edges is the grayscale image after a morphological
    # transform of the original frame, which suppresses some background influence such as
    # notebook edges, grid lines on the paper, hands, pens and shadows. Extracting the digit
    # images from edges works better than using the Sobel boundaries.
    def findRoi(frame, thresValue):
        rois = []
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray2 = cv2.dilate(gray, None, iterations=2)
        gray2 = cv2.erode(gray2, None, iterations=2)
        edges = cv2.absdiff(gray, gray2)
        x = cv2.Sobel(edges, cv2.CV_16S, 1, 0)
        y = cv2.Sobel(edges, cv2.CV_16S, 0, 1)
        absX = cv2.convertScaleAbs(x)
        absY = cv2.convertScaleAbs(y)
        dst = cv2.addWeighted(absX, 0.5, absY, 0.5, 0)
        ret, ddst = cv2.threshold(dst, thresValue, 255, cv2.THRESH_BINARY)
        # findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4; take the last two
        contours, hierarchy = cv2.findContours(ddst, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
        for c in contours:
            x, y, w, h = cv2.boundingRect(c)
            if w > 10 and h > 20:
                rois.append((x, y, w, h))
        return rois, edges

    # findDigit classifies a region with kNN and returns the result.
    # th is the image shown when training data is entered by hand. 20x20 pixels is the tile
    # size of the digits.png that ships with OpenCV; since the data is extended on top of
    # that set, the same size is kept.
    def findDigit(knn, roi, thresValue):
        ret, th = cv2.threshold(roi, thresValue, 255, cv2.THRESH_BINARY)
        th = cv2.resize(th, (20, 20))
        out = th.reshape(-1, 400).astype(np.float32)
        ret, result, neighbours, dist = knn.findNearest(out, k=5)
        return int(result[0][0]), th

    # concatenate stacks the digit images for display; it is used when entering training data.
    def concatenate(images):
        n = len(images)
        output = np.zeros(20 * 20 * n).reshape(-1, 20)
        for i in range(n):
            output[20 * i:20 * (i + 1), :] = images[i]
        return output

    knn, train, trainLabel = initKnn()
    knn, train, trainLabel = updateKnn(knn, train, trainLabel)
    cap = cv2.VideoCapture(1)
    #width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
    #height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
    width = 426 * 2
    height = 480
    videoFrame = cv2.VideoWriter('frame.avi', cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'),
                                 25, (int(width), int(height)), True)
    count = 0

    # Main loop. Press "x" to pause the display and show the captured digit images,
    # then press "e" to be prompted for the digits you see; type them in the terminal
    # separated by spaces and press Enter. "update KNN, Done!" means one update has
    # completed. Press the space bar to quit.
    while True:
        ret, frame = cap.read()
        if not ret:
            # No frame could be read from the camera
            break
        frame = frame[:, :426]
        rois, edges = findRoi(frame, 50)
        digits = []
        for r in rois:
            x, y, w, h = r
            digit, th = findDigit(knn, edges[y:y+h, x:x+w], 50)
            digits.append(cv2.resize(th, (20, 20)))
            cv2.rectangle(frame, (x, y), (x+w, y+h), (153, 153, 0), 2)
            cv2.putText(frame, str(digit), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 1, (127, 0, 255), 2)
        newEdges = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
        newFrame = np.hstack((frame, newEdges))
        cv2.imshow('frame', newFrame)
        videoFrame.write(newFrame)
        key = cv2.waitKey(1) & 0xff
        if key == ord(' '):
            break
        elif key == ord('x'):
            Nd = len(digits)
            output = concatenate(digits)
            showDigits = cv2.resize(output, (60, 60 * Nd))
            cv2.imshow('digits', showDigits)
            cv2.imwrite(str(count) + '.png', showDigits)
            count += 1
            # Only prompt for labels when "e" is pressed; any other key resumes the video
            if cv2.waitKey(0) & 0xff != ord('e'):
                continue
            print('input the digits(separate by space):')
            numbers = input().split(' ')
            Nn = len(numbers)
            if Nd != Nn:
                print('update KNN fail!')
                continue
            try:
                for i in range(Nn):
                    numbers[i] = int(numbers[i])
            except ValueError:
                continue
            knn, train, trainLabel = updateKnn(knn, train, trainLabel, output, numbers)
            print('update KNN, Done!')
            print('Numbers of trained images:', len(train))
            print('Numbers of trained image labels', len(trainLabel))

    cap.release()
    cv2.destroyAllWindows()

7. English letter recognition (e4.py)

/samples/cpp/letter-recognition.data is a ready-made data set that ships with OpenCV.
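Each row of that file starts with a capital letter followed by comma-separated numeric features, and the letter has to become a number before kNN can use it as a label. The sketch below shows that conversion in plain Python; the row is made up, the count of 16 features is an assumption about the file layout, and `parse_letter_row` is a hypothetical helper.

```python
def parse_letter_row(line):
    """Split one comma-separated row into (label, features); 'A' -> 0 ... 'Z' -> 25."""
    fields = line.strip().split(',')
    label = ord(fields[0]) - ord('A')
    features = [float(value) for value in fields[1:]]
    return label, features

# Made-up row in the assumed layout: one letter followed by 16 numeric features
row = "C,4,7,5,5,2,6,8,1,6,7,9,8,0,8,1,7"
label, features = parse_letter_row(row)
print(label, len(features))
```

For the letter C the label is 2, the same mapping that a `converters` argument of `np.loadtxt` can apply to column 0.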

 
    # -*- coding: utf-8 -*-
    """
    Created on Tue Jan 28 20:21:32 2014
    @author: duan
    """
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

    # Load the letter data; the converter maps the letter in column 0 to 0..25
    data = np.loadtxt('e:/template/letter-recognition.data', dtype='float32', delimiter=',',
                      converters={0: lambda ch: ord(ch) - ord('A')})

    # Split the data into train and test halves of 10000 samples each
    train, test = np.vsplit(data, 2)

    # Split each half into responses (column 0) and features (remaining columns)
    responses, trainData = np.hsplit(train, [1])
    labels, testData = np.hsplit(test, [1])

    # Build and train the kNN model
    knn = cv2.ml.KNearest_create()
    knn.train(trainData, cv2.ml.ROW_SAMPLE, responses)

    # Evaluate on the test data
    ret, result, neighbours, dist = knn.findNearest(testData, k=5)
    correct = np.count_nonzero(result == labels)
    accuracy = correct * 100.0 / 10000
    print(accuracy)

Result

93.06

8. The final cat and dog recognition (catVSdog_knn.py)

 
    # Run kNN on a reduced subset of the dogs-vs-cats data set
    import numpy as np
    import cv2
    import os
    import math
    from matplotlib import pyplot as plt

    # Global settings
    RATIO = 0.2
    train_dir = 'e:/template/dogvscat1K/'

    # Collect image paths and labels, then split them into training and test sets according to ratio
    def get_files(file_dir, ratio):
        '''
        Args:
            file_dir: file directory
            ratio: fraction of the samples held out for validation
        Returns:
            list of images and labels
        '''
        cats = []
        label_cats = []
        dogs = []
        label_dogs = []
        for file in os.listdir(file_dir):
            name = file.split(sep='.')
            if name[0] == 'cat':
                cats.append(file_dir + file)
                label_cats.append(0)
            else:
                dogs.append(file_dir + file)
                label_dogs.append(1)
        print('The data set contains %d cats\nand %d dogs' % (len(cats), len(dogs)))

        # Image list and label list
        # hstack joins the arrays horizontally (column-wise)
        image_list = np.hstack((cats, dogs))
        label_list = np.hstack((label_cats, label_dogs))
        temp = np.array([image_list, label_list])
        temp = temp.transpose()
        np.random.shuffle(temp)
        all_image_list = temp[:, 0]
        all_label_list = temp[:, 1]
        n_sample = len(all_label_list)

        # Work out the training and validation counts from the ratio
        n_val = math.ceil(n_sample * ratio)  # number of validation samples
        n_train = n_sample - n_val           # number of training samples
        tra_images = []
        val_images = []

        # Indices 0..n_train-1 become tra_images; the remaining ones become val_images
        for index in range(n_train):
            image = cv2.imread(all_image_list[index])
            # Convert to grayscale (imread returns BGR channel order), then resize
            image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            image = cv2.resize(image, (32, 32))
            tra_images.append(image)
        tra_labels = all_label_list[:n_train]
        tra_labels = [int(float(i)) for i in tra_labels]

        for index in range(n_val):
            image = cv2.imread(all_image_list[n_train + index])
            # Convert to grayscale (imread returns BGR channel order), then resize
            image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            image = cv2.resize(image, (32, 32))
            val_images.append(image)
        val_labels = all_label_list[n_train:]
        val_labels = [int(float(i)) for i in val_labels]

        return tra_images, tra_labels, val_images, val_labels

    # Load the data set
    _train, train_labels, _val, val_labels = get_files(train_dir, RATIO)
    x = np.array(_train)
    train = x.reshape(-1, 32 * 32).astype(np.float32)  # one row of 32*32 = 1024 pixels per image
    y = np.array(_val)
    test = y.reshape(-1, 32 * 32).astype(np.float32)

    # Initiate kNN, train the data, then test it with test data for k=5
    knn = cv2.ml.KNearest_create()
    knn.train(np.array(train), cv2.ml.ROW_SAMPLE, np.array(train_labels))
    ret, result, neighbours, dist = knn.findNearest(test, k=5)

    # Now we check the accuracy of classification
    # For that, compare the result with the validation labels and check which are wrong
    np_val_labels = np.array(val_labels)[:, np.newaxis]
    matches = result == np_val_labels
    correct = np.count_nonzero(matches)
    accuracy = correct * 100.0 / result.size
    print(accuracy)

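Result: on a data set of 1000 dogs and 1000 cats the accuracy is 55.55%, and on the full data set it is 56.2%. This shows two things:

kNN is of some use; but without a careful analysis of the features, its accuracy is hard to improve substantially.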
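To make "analysis of the features" concrete: instead of feeding raw pixels to kNN, each image can first be summarized by a hand-crafted descriptor. The sketch below computes a normalized intensity histogram. It is only one possible descriptor, chosen here for illustration; it is not part of the original experiment, and no claim is made about how much it would change the accuracy. `histogram_feature` and the synthetic test image are made up.

```python
import numpy as np

def histogram_feature(gray_image, bins=16):
    """Describe a grayscale image by its normalized intensity histogram."""
    hist, _ = np.histogram(gray_image, bins=bins, range=(0, 256))
    # Normalize so that images of different sizes become comparable
    return (hist / hist.sum()).astype(np.float32)

# Synthetic 32x32 image in which every intensity 0..255 occurs exactly 4 times
image = (np.arange(32 * 32) % 256).astype(np.uint8).reshape(32, 32)
feature = histogram_feature(image)
print(feature.shape, float(feature.sum()))
```

A row of such descriptors per image could be passed to the kNN model in place of the 1024 raw pixel values, which shortens each feature vector to 16 entries.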

This concludes the kNN example.
