In content-based image retrieval (CBIR), images are indexed by their visual content, such as color, texture, and shape.
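1. How it works
First, extract features from every image in the database and store them. Then compute the same features for the query image. Finally, retrieve the database images whose features are closest to those of the query.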
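The three steps above can be sketched with NumPy alone. This is a minimal illustration, not the method used later in this post: `extract_features` is a hypothetical stand-in that uses an intensity histogram as the feature, and the "database" is random data.

```python
import numpy as np

def extract_features(image, bins=16):
    """Hypothetical stand-in feature: a normalized intensity histogram."""
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

def build_index(images):
    """Step 1: extract and store one feature vector per database image."""
    return np.stack([extract_features(img) for img in images])

def query(index, image, k=3):
    """Steps 2 and 3: featurize the query, then return the k nearest rows."""
    q = extract_features(image)
    distances = np.linalg.norm(index - q, axis=1)  # Euclidean distance
    return np.argsort(distances)[:k]

# Toy database of random 28x28 "images" with pixel values in [0, 1)
rng = np.random.default_rng(0)
database = rng.random((100, 28, 28))
index = build_index(database)
top = query(index, database[7], k=3)  # query with an image from the database
```

Querying with an image that is already in the database should return that image first, since its distance to itself is zero. In a real system only the feature extractor changes; the index and the nearest-neighbor search keep the same shape.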
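2. Feature extraction for content-based image retrieval
In this paper (https://arxiv.org/pdf/1404.1777.pdf), the authors show that a convolutional neural network (CNN) trained for classification can be used to extract "neural codes" from images. These neural codes are features that describe the image. The paper reports that this approach performs on par with state-of-the-art methods on many datasets. Its drawback is that the network must first be trained on labeled data, and labeling can be expensive and time-consuming. An alternative way to generate these "neural codes" for our image retrieval task is to use an unsupervised deep learning algorithm, which is where the denoising autoencoder comes in. Related code is available at: https://blog.csdn.net/qq_34213260/article/details/106333947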
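For readers who do not have a trained model at hand, here is a minimal sketch of a convolutional denoising autoencoder in Keras. It is an assumption, not the architecture from the linked post: the layer sizes are illustrative. The only constraints taken from the script in section 3 are that the bottleneck layer is named `encoder`, that its output is a 3-D feature map per image, and that the model is saved as `autoencoder.h5`. `train_and_save` is a hypothetical helper.

```python
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

def build_denoising_autoencoder():
    """Illustrative architecture; layer sizes are assumptions."""
    inputs = Input(shape=(28, 28, 1))
    # Encoder: 28x28 -> 14x14 -> 7x7
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same', name='encoder')(x)
    # Decoder: 7x7 -> 14x14 -> 28x28
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    outputs = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
    model = Model(inputs, outputs)
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model

def train_and_save(model, x_noisy, x_clean, path='autoencoder.h5', epochs=10):
    """Hypothetical helper: learn to map noisy inputs back to clean targets."""
    model.fit(x_noisy, x_clean, epochs=epochs, batch_size=128, shuffle=True)
    model.save(path)

autoencoder = build_denoising_autoencoder()
```

The network is trained without labels: the input is the noise-corrupted image and the target is the clean image. Calling `train_and_save(autoencoder, x_train_noisy, x_train)` with the arrays prepared in section 3 would produce the `autoencoder.h5` file that the retrieval script loads.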
3. Code implementation
import time

import cv2
import numpy as np
from keras.datasets import mnist
from keras.models import Model, load_model
from sklearn.metrics import label_ranking_average_precision_score
print('Loading mnist dataset')
t0 = time.time()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1)) # adapt this if using `channels_first` image data format
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1)) # adapt this if using `channels_first` image data format
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
t1 = time.time()
print('mnist dataset loaded in: ', t1-t0)
print('Loading model :')
t0 = time.time()
autoencoder = load_model('autoencoder.h5')
encoder = Model(inputs=autoencoder.input, outputs=autoencoder.get_layer('encoder').output)
t1 = time.time()
print('Model loaded in: ', t1-t0)
def retrieve_closest_images(test_element, test_label, n_samples=10):
    # Extract the feature codes of the database images.
    # Note: this re-encodes x_train on every call; cache it if you run many queries.
    learned_codes = encoder.predict(x_train)
    # Flatten each code into a one-dimensional vector
    learned_codes = learned_codes.reshape(learned_codes.shape[0], -1)
    test_code = encoder.predict(np.array([test_element])).reshape(-1)

    # Euclidean distance between the query code and every database code
    distances = np.linalg.norm(learned_codes - test_code, axis=1)

    # Relevance labels: 1 if the database image shows the same digit as the query, else 0
    nb_elements = learned_codes.shape[0]
    learned_code_index = np.arange(nb_elements)
    labels = (y_train == test_label).astype('float32')

    # Sort by ascending distance and keep the closest images
    distance_with_labels = np.stack((distances, labels, learned_code_index), axis=-1)
    sorted_distance_with_labels = distance_with_labels[distance_with_labels[:, 0].argsort()]
    # Turn distances into scores (larger means closer); 28 is the offset used in the original code
    sorted_distances = 28 - sorted_distance_with_labels[:, 0]
    sorted_labels = sorted_distance_with_labels[:, 1]
    sorted_indexes = sorted_distance_with_labels[:, 2]
    kept_indexes = sorted_indexes[:n_samples].astype(int)

    score = label_ranking_average_precision_score(
        np.array([sorted_labels[:n_samples]]), np.array([sorted_distances[:n_samples]]))
    print("Average precision ranking score for tested element is {}".format(score))

    # Show the query image itself (the original always showed x_test[0])
    original_image = test_element
    cv2.imshow('original_image', original_image)
    # Place the retrieved images side by side
    retrieved_images = np.hstack([x_train[i] for i in kept_indexes])
    cv2.imshow('Results', retrieved_images)
    cv2.waitKey(0)

    # cv2.imwrite returns False without raising if test_results/ does not exist,
    # so create the directory beforehand. Pixels are scaled from [0, 1] to 8-bit.
    original_out = (255 * cv2.resize(original_image, (0, 0), fx=3, fy=3)).astype(np.uint8)
    retrieved_out = (255 * cv2.resize(retrieved_images, (0, 0), fx=2, fy=2)).astype(np.uint8)
    cv2.imwrite('test_results/original_image.jpg', original_out)
    cv2.imwrite('test_results/retrieved_results.jpg', retrieved_out)


# Retrieve the images closest to the first test image
retrieve_closest_images(x_test[0], y_test[0])
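When many queries are run, it can help to encode the database once and to select only the k nearest codes instead of sorting the whole database. The sketch below shows this with NumPy; `flatten_codes` and `top_k_nearest` are hypothetical helper names, and the random array stands in for the output of `encoder.predict(x_train)` (its shape is an assumption).

```python
import numpy as np

def flatten_codes(codes):
    """Flatten (n, h, w, c) encoder outputs into (n, h*w*c) vectors."""
    return codes.reshape(codes.shape[0], -1)

def top_k_nearest(database_codes, query_code, k=10):
    """Return indexes and distances of the k nearest codes, closest first."""
    distances = np.linalg.norm(database_codes - query_code, axis=1)
    k = min(k, len(distances))
    # argpartition selects the k smallest distances without a full sort
    candidates = np.argpartition(distances, k - 1)[:k]
    order = candidates[np.argsort(distances[candidates])]
    return order, distances[order]

# Stand-in for encoder.predict(x_train); computed once and reused for every query
rng = np.random.default_rng(0)
db_codes = flatten_codes(rng.random((1000, 4, 4, 8)))
idx, dists = top_k_nearest(db_codes, db_codes[42], k=5)
```

With the real model, `db_codes` would be computed once after loading the encoder, and each query would only need one `encoder.predict` call on the query image followed by `top_k_nearest`.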