While working on SLAM I want to use depth maps to help reconstruct the scene, which means building stereo vision. Here I use OpenCV's Stereo module with Python to process a binocular image pair:

- Stereo calibration
- Applying the calibration data
- Converting to a depth map

Calibration
Before starting, you of course need two cameras, fixed in position relative to each other according to your needs. I mounted mine parallel. (If you want the two views to overlap more, you can also toe the cameras in at an angle, but that limits how far the usable scene depth extends; choose based on your actual requirements.)

Since the cameras were positioned by hand, we do not yet know how the two images relate to world coordinates, so the next step is calibration: determine each camera's intrinsic parameters separately, then use the two cameras' calibrations against the same world coordinates to obtain the stereo extrinsic parameters. Note: avoid OpenCV's built-in automatic calibration here; its chessboard detection rate is very low. Matlab's Camera Calibration Toolbox is much more reliable. For details see: Camera Calibration and Stereo Calibration.
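Whichever toolbox you use, calibration fits the detected chessboard corners against a grid of known 3D "object points". As a minimal sketch of how that grid is built in OpenCV terms (the 12×8 pattern size matches the capture script below; the 25 mm square size is an assumption, measure your own board):

```python
import numpy as np

# Chessboard with 12x8 inner corners; the square size is a hypothetical 25 mm
pattern = (12, 8)
square_size = 25.0  # mm, an assumption; use your board's real measurement

# One 3D point per inner corner, all lying on the z=0 plane of the board
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_size

# These object points, paired with the detected corner pixels from each camera,
# are what cv2.calibrateCamera and cv2.stereoCalibrate consume.
```

The same grid is reused for every captured view; only the detected pixel coordinates change from image to image.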
Capturing images from both cameras simultaneously
```python
import cv2
import time

AUTO = True      # shoot automatically, or press "s" to shoot manually
INTERVAL = 2     # seconds between automatic snapshots

cv2.namedWindow("left")
cv2.namedWindow("right")
cv2.moveWindow("left", 0, 0)
cv2.moveWindow("right", 400, 0)
left_camera = cv2.VideoCapture(0)
right_camera = cv2.VideoCapture(1)

counter = 0
utc = time.time()
pattern = (12, 8)        # chessboard pattern size
folder = "./snapshot/"   # output directory for snapshots

def shot(pos, frame):
    global counter
    path = folder + pos + "_" + str(counter) + ".jpg"
    cv2.imwrite(path, frame)
    print("snapshot saved into: " + path)

while True:
    ret_l, left_frame = left_camera.read()
    ret_r, right_frame = right_camera.read()
    cv2.imshow("left", left_frame)
    cv2.imshow("right", right_frame)

    now = time.time()
    if AUTO and now - utc >= INTERVAL:
        shot("left", left_frame)
        shot("right", right_frame)
        counter += 1
        utc = now

    key = cv2.waitKey(1)
    if key == ord("q"):
        break
    elif key == ord("s"):
        shot("left", left_frame)
        shot("right", right_frame)
        counter += 1

left_camera.release()
right_camera.release()
cv2.destroyWindow("left")
cv2.destroyWindow("right")
```
Below is one of the sample pairs I captured. You can see with the naked eye that the two cameras are not level with each other, which is exactly why calibration is needed.

One thing the calibration guide above does not mention: after calibrating each camera individually, run error analysis (Analyse Error) on the calibration data. The left figure shows my analysis for the left camera. The sky-blue points clearly do not cluster with the rest, so that image was probably mis-calibrated. Click on such a point to get the index of its image, recalibrate that image, and the result (right figure) looks much more satisfactory.
在進行完立體標定后,我們將得到如下的數據:
```
Stereo calibration parameters after optimization:

Intrinsic parameters of left camera:
Focal Length:       fc_left = [ 824.93564  825.93598 ] [ 8.21112  8.53492 ]
Principal point:    cc_left = [ 251.64723  286.58058 ] [ 13.92642  9.11583 ]
Skew:          alpha_c_left = [ 0.00000 ] [ 0.00000 ]  => angle of pixel axes = 90.00000 0.00000 degrees
Distortion:         kc_left = [ 0.23233  -0.99375  0.00160  0.00145  0.00000 ] [ 0.05659  0.30408  0.00472  0.00925  0.00000 ]

Intrinsic parameters of right camera:
Focal Length:      fc_right = [ 853.66485  852.95574 ] [ 8.76773  9.19051 ]
Principal point:   cc_right = [ 217.00856  269.37140 ] [ 10.40940  9.47786 ]
Skew:         alpha_c_right = [ 0.00000 ] [ 0.00000 ]  => angle of pixel axes = 90.00000 0.00000 degrees
Distortion:        kc_right = [ 0.30829  -1.61541  0.01495  -0.00758  0.00000 ] [ 0.06567  0.55294  0.00547  0.00641  0.00000 ]

Extrinsic parameters (position of right camera wrt left camera):
Rotation vector:         om = [ 0.01911  0.03125  -0.00960 ] [ 0.01261  0.01739  0.00112 ]
Translation vector:       T = [ -70.59612  -2.60704  18.87635 ] [ 0.95533  0.79030  5.25024 ]
```
Applying the calibration data

The following code loads these values into Python. All of the parameters above are typed in by hand, which saves writing them to a file and reading them back; just be careful that each number ends up in the right place.
```python
# filename: camera_configs.py
import cv2
import numpy as np

left_camera_matrix = np.array([[824.93564, 0., 251.64723],
                               [0., 825.93598, 286.58058],
                               [0., 0., 1.]])
left_distortion = np.array([[0.23233, -0.99375, 0.00160, 0.00145, 0.00000]])

right_camera_matrix = np.array([[853.66485, 0., 217.00856],
                                [0., 852.95574, 269.37140],
                                [0., 0., 1.]])
right_distortion = np.array([[0.30829, -1.61541, 0.01495, -0.00758, 0.00000]])

om = np.array([0.01911, 0.03125, -0.00960])  # rotation vector between the cameras
R = cv2.Rodrigues(om)[0]                     # convert om to a rotation matrix R
T = np.array([-70.59612, -2.60704, 18.87635])  # translation vector
size = (640, 480)  # image size

# Stereo rectification
R1, R2, P1, P2, Q, validPixROI1, validPixROI2 = cv2.stereoRectify(
    left_camera_matrix, left_distortion,
    right_camera_matrix, right_distortion, size, R, T)

# Compute the rectification maps
left_map1, left_map2 = cv2.initUndistortRectifyMap(left_camera_matrix, left_distortion, R1, P1, size, cv2.CV_16SC2)
right_map1, right_map2 = cv2.initUndistortRectifyMap(right_camera_matrix, right_distortion, R2, P2, size, cv2.CV_16SC2)
```
This gives us the rectification maps for both cameras, plus the reprojection matrix Q; all of these are used in the depth-map conversion below.
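As a sanity check on what Q encodes, depth follows the classic pinhole relation Z = f·B/d, where f is the focal length in pixels, B the baseline, and d the disparity. A rough back-of-the-envelope using rounded values from the calibration results above:

```python
# Pinhole stereo: Z = f * B / d (rectified cameras, disparity in pixels)
f = 825.0   # focal length in pixels, rounded from fc_left above
B = 70.6    # baseline in mm, rounded from |Tx| in the translation vector
d = 16.0    # an example disparity of 16 px

Z = f * B / d  # depth in mm
# 16 px of disparity puts a point roughly 3.6 m away; halving the disparity
# doubles the depth, which is why distant points come out noisy
```

This is the same computation cv2.reprojectImageTo3D performs per pixel via Q, so it is handy for checking that clicked distances are in a plausible range.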
Converting to a depth map
```python
import numpy as np
import cv2
import camera_configs

cv2.namedWindow("left")
cv2.namedWindow("right")
cv2.namedWindow("depth")
cv2.moveWindow("left", 0, 0)
cv2.moveWindow("right", 600, 0)
cv2.createTrackbar("num", "depth", 0, 10, lambda x: None)
cv2.createTrackbar("blockSize", "depth", 5, 255, lambda x: None)
camera1 = cv2.VideoCapture(0)
camera2 = cv2.VideoCapture(1)

# Mouse callback: print the distance of the clicked point
def callbackFunc(e, x, y, f, p):
    if e == cv2.EVENT_LBUTTONDOWN:
        print(threeD[y][x])

cv2.setMouseCallback("depth", callbackFunc, None)

while True:
    ret1, frame1 = camera1.read()
    ret2, frame2 = camera2.read()
    if not ret1 or not ret2:
        break

    # Remap both frames using the rectification maps
    img1_rectified = cv2.remap(frame1, camera_configs.left_map1, camera_configs.left_map2, cv2.INTER_LINEAR)
    img2_rectified = cv2.remap(frame2, camera_configs.right_map1, camera_configs.right_map2, cv2.INTER_LINEAR)

    # Convert to grayscale in preparation for StereoBM
    imgL = cv2.cvtColor(img1_rectified, cv2.COLOR_BGR2GRAY)
    imgR = cv2.cvtColor(img2_rectified, cv2.COLOR_BGR2GRAY)

    # Two trackbars to tune the parameters and watch the effect
    num = cv2.getTrackbarPos("num", "depth")
    blockSize = cv2.getTrackbarPos("blockSize", "depth")
    if blockSize % 2 == 0:
        blockSize += 1
    if blockSize < 5:
        blockSize = 5

    # Compute the disparity map with block matching
    # (OpenCV also provides SGBM / Semi-Global Block Matching, worth a try)
    stereo = cv2.StereoBM_create(numDisparities=16 * num, blockSize=blockSize)
    disparity = stereo.compute(imgL, imgR)
    disp = cv2.normalize(disparity, disparity, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)

    # Reproject into 3D space; the z value of each point is its distance
    threeD = cv2.reprojectImageTo3D(disparity.astype(np.float32) / 16., camera_configs.Q)

    cv2.imshow("left", img1_rectified)
    cv2.imshow("right", img2_rectified)
    cv2.imshow("depth", disp)

    key = cv2.waitKey(1)
    if key == ord("q"):
        break
    elif key == ord("s"):
        cv2.imwrite("./snapshot/BM_left.jpg", imgL)
        cv2.imwrite("./snapshot/BM_right.jpg", imgR)
        cv2.imwrite("./snapshot/BM_depth.jpg", disp)

camera1.release()
camera2.release()
cv2.destroyAllWindows()
```
Below is a sample result; the rightmost image is the generated disparity map. With the code above, clicking a point in that image prints its distance.
Have fun.