Python Image Processing: Image Stitching

This article is a repost. 2021-04-25 22:19, Computer Vision

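I. Purpose of the algorithm

Two or more pictures taken from the same position (that is, with the camera centre fixed) are related by a homography. When neighbouring pictures also cover a shared region of the scene, they can be warped and stitched together into one large image, a panorama.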

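II. Basic principle

To stitch two images in the simplest way, find corresponding points in the two images (at least four pairs, because estimating a homography needs at least four point correspondences), compute the matrix that maps one image onto the other (the homography), and use that matrix to warp the first image into the coordinate frame of the second, so that the corresponding points coincide. That is enough for a simple panorama.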
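To make the "at least four pairs" requirement concrete, here is a minimal sketch of homography estimation with the direct linear transform (DLT). It is an illustration, not the PCV routine used later in this article: the function names `homography_from_points` and `apply_homography` and the sample coordinates are made up for the example, and the coordinate normalization that robust implementations usually add is left out.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from N >= 4 point pairs.

    src, dst: arrays of shape (N, 2) holding (x, y) coordinates.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale (assumes H[2, 2] != 0)

def apply_homography(H, pts):
    """Map (N, 2) points through H and return (N, 2) points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Illustrative data: the unit square mapped to an arbitrary quadrilateral.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(10, 20), (30, 22), (28, 45), (8, 40)]
H = homography_from_points(src, dst)
print(apply_homography(H, src))
```

A homography has eight degrees of freedom (nine entries up to scale) and each pair contributes two equations, which is why four pairs are the minimum.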

III. Implementation steps

1. Read the image sequence and use SIFT features to find matching point pairs

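from PCV.localdescriptors import sift

# Paths to the SIFT feature files and the source images
featname = ['D:/LearnSrc/PythonSrc/Homework/Lecture05/images/0' + str(i + 1) + '.sift' for i in range(5)]
imname = ['D:/LearnSrc/PythonSrc/Homework/Lecture05/images/0' + str(i + 1) + '.jpg' for i in range(5)]

# Extract features: l holds keypoint locations, d holds descriptors
l = {}
d = {}
for i in range(5):
    sift.process_image(imname[i], featname[i])
    l[i], d[i] = sift.read_features_from_file(featname[i])

# Match the descriptors of each pair of consecutive images
matches = {}
for i in range(4):
    matches[i] = sift.match(d[i + 1], d[i])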

Result: the SIFT matches include some incorrect correspondences, so the RANSAC algorithm is needed to reject them.
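To show where wrong matches come from and how a matcher can already filter some of them, here is a minimal sketch of nearest-neighbour matching with a ratio test. It does not reproduce the `sift.match` function used in this article; the function name `match_ratio_test`, the Euclidean distance, and the default ratio of 0.6 are assumptions made for illustration. The sketch marks "no match" with -1, whereas the listing later in this article relies on `nonzero()`, which suggests that 0 plays that role there.

```python
import numpy as np

def match_ratio_test(desc1, desc2, ratio=0.6):
    """For each row of desc1, return the index of its match in desc2, or -1.

    A match is accepted only if the nearest descriptor is clearly closer
    than the second nearest one. desc2 needs at least two rows.
    """
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    matches = np.full(len(desc1), -1, dtype=int)
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)  # distance to every candidate
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches[i] = best
    return matches

# Toy 2-D "descriptors": the first query is unambiguous, the second is not.
desc2 = [[0, 0], [10, 0], [0, 10]]
desc1 = [[0.1, 0], [5, 0]]
print(match_ratio_test(desc1, desc2))
```

Ambiguous descriptors are dropped, but a confident match can still be geometrically wrong, which is why a robust estimation step follows.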

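2. Compute the transformation matrix with the RANSAC algorithm

2.1 The RANSAC algorithm
RANSAC is short for "RANdom SAmple Consensus". It is an iterative method for fitting a model to data that contain outliers. Given a model, for example a homography between two point sets, the basic idea is that the data consist of correct points (inliers) and noise points (outliers), and a reasonable model should describe the inliers while rejecting the outliers.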
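The loop behind this idea is easiest to see on a simpler model than a homography. The following sketch applies RANSAC to fitting a straight line; the function name `ransac_line`, the iteration count, the threshold and the sample data are all chosen for illustration and are not taken from the original article.

```python
import numpy as np

def ransac_line(points, n_iter=200, threshold=0.5, seed=0):
    """Fit y = a*x + b to points of shape (N, 2) while ignoring outliers."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        # 1. Draw a minimal sample: two points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # vertical line, cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # 2. Find the points that agree with this candidate model.
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = residuals < threshold
        # 3. Keep the candidate with the largest consensus set.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # 4. Refit by least squares on the inliers of the best candidate.
    a, b = np.polyfit(points[best_inliers, 0], points[best_inliers, 1], 1)
    return a, b, best_inliers

# 20 points on y = 2x + 1 plus 5 gross outliers.
xs = np.arange(20, dtype=float)
inlier_pts = np.column_stack([xs, 2 * xs + 1])
outlier_pts = np.array([[1, 30], [5, -20], [9, 50], [13, -15], [17, 80]], dtype=float)
a, b, mask = ransac_line(np.vstack([inlier_pts, outlier_pts]))
print(a, b, mask.sum())
```

For a homography the structure is the same: the minimal sample is four correspondences instead of two points, and the residual is the distance between a transformed point and its target.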
　　

Solving for the homography:

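from numpy import dot, sqrt, sum
from PCV.geometry.homography import H_from_points


class RansacModel(object):
    """Homography model for use with a RANSAC routine."""

    def __init__(self, debug=False):
        self.debug = debug

    def fit(self, data):
        """Fit a homography to four selected correspondences."""
        # Transpose so that H_from_points() can be called on the data
        data = data.T
        # Source points of the mapping
        fp = data[:3, :4]
        # Target points of the mapping
        tp = data[3:, :4]
        # Fit the homography and return it
        return H_from_points(fp, tp)

    def get_error(self, data, H):
        """Apply H to all correspondences and return the error for each
        transformed point."""
        data = data.T
        # Source points of the mapping
        fp = data[:3]
        # Target points of the mapping
        tp = data[3:]
        # Transform fp
        fp_transformed = dot(H, fp)
        # Normalize the homogeneous coordinates (last row becomes 1)
        fp_transformed = fp_transformed / fp_transformed[2]
        # Euclidean distance for each correspondence
        return sqrt(sum((tp - fp_transformed) ** 2, axis=0))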

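As the code shows, the class has a fit() method that takes only the four correspondences selected by the RANSAC routine (the first four point pairs in data) and fits a homography to them. The get_error() method applies that homography to every correspondence and returns the Euclidean distance between each transformed point and its target. RANSAC uses these errors to tell correct matches from incorrect ones. In practice, a threshold on this distance is also needed to decide which correspondences count as inliers, and therefore which homographies are reasonable.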
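The thresholding step can be sketched as follows. This is an illustration under assumptions, not code from PCV: the helper names `reprojection_errors` and `select_inliers`, the 3-pixel threshold and the toy translation matrix are invented for the example, and both point sets are assumed to be 3xN homogeneous arrays whose last row is 1.

```python
import numpy as np

def reprojection_errors(H, fp, tp):
    """Euclidean distance between H @ fp and tp for each column."""
    proj = H @ fp
    proj = proj / proj[2]  # normalize so that the last row is 1
    return np.sqrt(np.sum((tp - proj) ** 2, axis=0))

def select_inliers(H, fp, tp, threshold=3.0):
    """Boolean mask of correspondences whose error is below the threshold."""
    return reprojection_errors(H, fp, tp) < threshold

# Toy data: a pure translation by (5, -2); the third match is corrupted.
H_true = np.array([[1.0, 0.0, 5.0],
                   [0.0, 1.0, -2.0],
                   [0.0, 0.0, 1.0]])
fp = np.array([[0.0, 10.0, 20.0],
               [0.0, 10.0, 20.0],
               [1.0, 1.0, 1.0]])
tp = H_true @ fp
tp[0, 2] += 40.0  # simulate a wrong match, 40 pixels off in x

errors = reprojection_errors(H_true, fp, tp)
inlier_mask = select_inliers(H_true, fp, tp)
print(errors, inlier_mask)
```

A candidate homography is then scored by how many correspondences fall below the threshold.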

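3. Stitching the images
Once the homographies between the images have been estimated (using RANSAC), all images need to be warped onto a common image plane. It usually makes sense to use the plane of the centre image (otherwise the distortions become large). One approach is to create a very large image, for example one filled with zeros, parallel to the centre image, and warp all the images onto it. Since all my images were taken by rotating the camera horizontally, a simpler procedure works: pad the centre image with zeros on the left or on the right to make room for the warped images.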
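The padding step on its own can be sketched with NumPy as below. The helper name `pad_for_warp` and its `side` parameter are hypothetical; this only shows how room is made and is not the `warp.panorama` implementation.

```python
import numpy as np

def pad_for_warp(center, delta, side="right"):
    """Return the centre image with `delta` zero columns added on one side.

    Works for grayscale (H, W) and colour (H, W, C) arrays.
    """
    pad_shape = (center.shape[0], delta) + center.shape[2:]
    pad = np.zeros(pad_shape, dtype=center.dtype)
    if side == "right":
        return np.hstack([center, pad])
    return np.hstack([pad, center])

# A small dummy colour image filled with ones.
center = np.ones((4, 6, 3), dtype=np.uint8)
right_padded = pad_for_warp(center, 4, side="right")
left_padded = pad_for_warp(center, 4, side="left")
print(right_padded.shape, left_padded.shape)
```

When the padding goes on the left, the centre image moves `delta` pixels to the right in the new canvas, so a matching horizontal translation has to be folded into the homography before warping.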
Code:

from numpy import array, dot, vstack
from pylab import figure, imshow, axis, savefig, show
from PIL import Image

# If you have PCV installed, these imports should work
from PCV.geometry import homography, warp
from PCV.localdescriptors import sift

# This is the panorama example from section 3.3.

# set paths to data folder
featname = ['D:/LearnSrc/PythonSrc/Homework/Lecture05/images/0' + str(i + 1) + '.sift' for i in range(5)]
imname = ['D:/LearnSrc/PythonSrc/Homework/Lecture05/images/0' + str(i + 1) + '.jpg' for i in range(5)]

# extract features and match
l = {}
d = {}
for i in range(5):
    sift.process_image(imname[i], featname[i])
    l[i], d[i] = sift.read_features_from_file(featname[i])

matches = {}
for i in range(4):
    matches[i] = sift.match(d[i + 1], d[i])

# visualize the matches (Figure 3-11 in the book)
for i in range(4):
    im1 = array(Image.open(imname[i]))
    im2 = array(Image.open(imname[i + 1]))
    figure()
    sift.plot_matches(im2, im1, l[i + 1], l[i], matches[i], show_below=True)


# function to convert the matches to hom. points
def convert_points(j):
    ndx = matches[j].nonzero()[0]
    fp = homography.make_homog(l[j + 1][ndx, :2].T)
    ndx2 = [int(matches[j][i]) for i in ndx]
    tp = homography.make_homog(l[j][ndx2, :2].T)

    # switch x and y - TODO this should move elsewhere
    fp = vstack([fp[1], fp[0], fp[2]])
    tp = vstack([tp[1], tp[0], tp[2]])
    return fp, tp


# estimate the homographies
model = homography.RansacModel()

fp, tp = convert_points(1)
H_12 = homography.H_from_ransac(fp, tp, model)[0]  # im 1 to 2

fp, tp = convert_points(0)
H_01 = homography.H_from_ransac(fp, tp, model)[0]  # im 0 to 1

tp, fp = convert_points(2)  # NB: reverse order
H_32 = homography.H_from_ransac(fp, tp, model)[0]  # im 3 to 2

tp, fp = convert_points(3)  # NB: reverse order
H_43 = homography.H_from_ransac(fp, tp, model)[0]  # im 4 to 3

# warp the images
delta = 2000  # for padding and translation

im1 = array(Image.open(imname[1]), "uint8")
im2 = array(Image.open(imname[2]), "uint8")
im_12 = warp.panorama(H_12, im1, im2, delta, delta)

im1 = array(Image.open(imname[0]), "f")
im_02 = warp.panorama(dot(H_12, H_01), im1, im_12, delta, delta)

im1 = array(Image.open(imname[3]), "f")
im_32 = warp.panorama(H_32, im1, im_02, delta, delta)

im1 = array(Image.open(imname[4]), "f")
im_42 = warp.panorama(dot(H_32, H_43), im1, im_32, delta, 2 * delta)

figure()
imshow(array(im_42, "uint8"))
axis('off')
savefig("example5.png", dpi=300)
show()
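The listing passes dot(H_12, H_01) to warp image 0 directly into the frame of image 2. The small check below illustrates why that works: applying two homographies one after the other gives the same points as applying their matrix product once. The matrices and points are hypothetical values chosen for the example, and `apply_h` is a helper defined only here.

```python
import numpy as np

def apply_h(H, pts):
    """Apply a 3x3 homography to 3xN homogeneous points and renormalize."""
    out = H @ pts
    return out / out[2]

# Hypothetical homographies, chosen only for illustration.
H_01 = np.array([[1.0, 0.02, 120.0],
                 [0.01, 1.0, 5.0],
                 [1e-5, 0.0, 1.0]])
H_12 = np.array([[0.98, 0.0, 200.0],
                 [0.0, 1.01, -3.0],
                 [2e-5, 1e-5, 1.0]])

# Three points in image 0, one per column, in homogeneous coordinates.
pts0 = np.array([[0.0, 50.0, 300.0],
                 [0.0, 80.0, 10.0],
                 [1.0, 1.0, 1.0]])

two_steps = apply_h(H_12, apply_h(H_01, pts0))  # image 0 -> 1 -> 2
one_step = apply_h(H_12 @ H_01, pts0)           # image 0 -> 2 directly
print(np.allclose(two_steps, one_step))
```

The order matters: the homography that is applied first stands on the right of the product.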