Robotics Perception
Professors Kostas Daniilidis and Jianbo Shi
Week 1: Camera Model
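Convex-lens imaging principle: the focal point and focal length of a convex lens are fixed; this is a physical property of the lens. The object distance u, image distance v, and focal length f are related by 1/f = 1/u + 1/v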
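A minimal sketch of the thin-lens relation 1/f = 1/u + 1/v, solved for the image distance v. The function name and the numeric values (a 50 mm lens, an object 2 m away) are illustrative assumptions, not taken from the course.

```python
def image_distance(f, u):
    """Solve 1/f = 1/u + 1/v for the image distance v (same units as f and u)."""
    if u == f:
        # An object at the focal point forms its image at infinity.
        raise ValueError("object at the focal point: image forms at infinity")
    return 1.0 / (1.0 / f - 1.0 / u)

f = 50.0     # focal length in mm (illustrative value)
u = 2000.0   # object distance in mm (illustrative value)
v = image_distance(f, u)
print(v)     # slightly larger than f, since the object is far away
```

As u grows, v approaches f, which is why a distant scene is in focus with the sensor roughly one focal length behind the lens.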
perspective drawing
Changing the bi-perspectograph construction:
1. change the distance from the objects: OS
2. change the focal length: the distance between the two parallel lines
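Vanishing points: the line connecting all vanishing points is the horizon; the auxiliary construction lines are horizon lines.
When parallel lines in the physical world are mapped into the image coordinate frame, the point where their images intersect is the vanishing point.
Vanishing points are intersection points of the images of parallel lines.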
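A small sketch of the intersection idea, using homogeneous coordinates: the line through two image points is their cross product, and the intersection of two lines is again a cross product. The pinhole projection with f = 1 and the two sample 3D lines (both with direction (1, 0, 1)) are assumptions chosen for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def project(P, f=1.0):
    """Pinhole projection of a 3D point to a homogeneous image point."""
    X, Y, Z = P
    return (f * X / Z, f * Y / Z, 1.0)

def vanishing_point(p1, p2, q1, q2):
    """Intersect the image line through p1, p2 with the one through q1, q2."""
    v = cross(cross(p1, p2), cross(q1, q2))
    # v[2] == 0 would mean the two image lines are themselves parallel.
    return (v[0] / v[2], v[1] / v[2])

# Two parallel 3D lines with direction (1, 0, 1), two sample points each.
line_a = [(0.0, -1.0, 4.0), (2.0, -1.0, 6.0)]
line_b = [(0.0, 1.0, 4.0), (2.0, 1.0, 6.0)]
vp = vanishing_point(project(line_a[0]), project(line_a[1]),
                     project(line_b[0]), project(line_b[1]))
print(vp)
```

The result agrees with projecting the direction itself: for direction (dx, dy, dz) the vanishing point is (f*dx/dz, f*dy/dz), here (1, 0).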
Dolly Zoom
Increase the distance between the object and the camera while simultaneously increasing the camera's focal length.
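A minimal sketch of why the dolly zoom works, assuming a pinhole model where an object of height H at depth Z has image height f*H/Z. Scaling f in proportion to Z keeps the subject's size fixed while the background grows. All names and numbers below are illustrative assumptions.

```python
def image_height(f, H, Z):
    """Pinhole image height of an object of height H at depth Z."""
    return f * H / Z

def dolly_zoom_focal(f0, Z0, Z1):
    """Focal length that keeps the subject's image size fixed
    when the subject's depth changes from Z0 to Z1."""
    return f0 * Z1 / Z0

f0, Z0 = 35.0, 2.0                  # starting focal length and subject depth
H_subject, H_background = 1.7, 10.0
gap = 20.0                          # background is this far behind the subject

subject_sizes, background_sizes = [], []
for Z1 in (2.0, 4.0, 8.0):          # camera pulls back from the subject
    f1 = dolly_zoom_focal(f0, Z0, Z1)
    subject_sizes.append(image_height(f1, H_subject, Z1))
    background_sizes.append(image_height(f1, H_background, Z1 + gap))

print(subject_sizes)     # constant
print(background_sizes)  # increasing
```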
Perspective Transformation
world - camera - image

special cases:
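homography: when mapping from a plane in the 3D world (e.g. Z=0) to the 2D image plane, K[R|t] reduces to a homography matrix
only rotation: pure rotation, t=0; in this case the trailing 1 of the homogeneous coordinates has no effect.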
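A sketch of both special cases, assuming the usual model x ~ K[R|t]X. For points on the plane Z=0 the third column of R drops out, leaving H = K[r1 r2 t]. With t=0 the projection depends only on the direction of the point, so scaling the point leaves the pixel unchanged. The values of K, R, and t are made up for illustration.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def to_pixel(col):
    """Convert a homogeneous 3x1 column to inhomogeneous (u, v)."""
    return (col[0][0] / col[2][0], col[1][0] / col[2][0])

# Illustrative intrinsics and pose (assumptions, not course values).
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
a = math.radians(30.0)
R = [[math.cos(a), 0.0, math.sin(a)],
     [0.0, 1.0, 0.0],
     [-math.sin(a), 0.0, math.cos(a)]]
t = [0.1, -0.2, 5.0]

# General case: P = K [R | t], a 3x4 matrix.
P = matmul(K, [R[i] + [t[i]] for i in range(3)])

# Plane Z = 0: H = K [r1 r2 t], a 3x3 homography.
H = matmul(K, [[R[i][0], R[i][1], t[i]] for i in range(3)])

X, Y = 0.5, -0.3
x_full = to_pixel(matmul(P, [[X], [Y], [0.0], [1.0]]))
x_plane = to_pixel(matmul(H, [[X], [Y], [1.0]]))

# Pure rotation (t = 0): x ~ K R (X, Y, Z); scaling the point changes nothing.
KR = matmul(K, R)
x_near = to_pixel(matmul(KR, [[0.5], [-0.3], [4.0]]))
x_far = to_pixel(matmul(KR, [[1.5], [-0.9], [12.0]]))

print(x_full, x_plane)
print(x_near, x_far)
```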
Compute Intrinsics from Vanishing Points
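One common form of this computation, sketched under stated assumptions: square pixels, zero skew, and a known principal point c. If v1 and v2 are vanishing points of two orthogonal world directions, then (v1 - c) . (v2 - c) = -f^2, which gives f directly. The helper names and the synthetic camera below are hypothetical and only serve to check the formula.

```python
import math

def vanishing_point(K, d):
    """Image of the point at infinity along camera-frame direction d."""
    x = [sum(K[i][k] * d[k] for k in range(3)) for i in range(3)]
    return (x[0] / x[2], x[1] / x[2])

def focal_from_orthogonal_vps(v1, v2, c):
    """Focal length from two vanishing points of orthogonal directions,
    assuming square pixels, zero skew, and known principal point c."""
    s = (v1[0] - c[0]) * (v2[0] - c[0]) + (v1[1] - c[1]) * (v2[1] - c[1])
    if s >= 0:
        raise ValueError("vanishing points inconsistent with orthogonal directions")
    return math.sqrt(-s)

# Synthetic camera used only to generate test data (assumed values).
f_true, c = 800.0, (320.0, 240.0)
K = [[f_true, 0.0, c[0]], [0.0, f_true, c[1]], [0.0, 0.0, 1.0]]

# Two orthogonal directions: columns 1 and 3 of R = Rx(20 deg) * Ry(30 deg).
a, b = math.radians(20.0), math.radians(30.0)
d1 = (math.cos(b), math.sin(a) * math.sin(b), -math.cos(a) * math.sin(b))
d3 = (math.sin(b), -math.sin(a) * math.cos(b), math.cos(a) * math.cos(b))

f_est = focal_from_orthogonal_vps(vanishing_point(K, d1), vanishing_point(K, d3), c)
print(f_est)
```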

Assignment
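Although Coursera programming assignments usually only ask you to fill in function bodies, the MATLAB scaffolding provided here is of high quality and worth studying.
Highlights include using the fill function for shading and using project (essentially a 3D-to-2D matrix mapping).
The final question comes down to solving a system of two linear equations in two unknowns to obtain formulas for f and pos.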
Week 2: Projective Transformation
Vanishing points -- camera orientation
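Estimating the rotation matrix from vanishing points: use the vanishing-point vectors (each back-projected vanishing point K^{-1}v, once normalized, gives the direction of one world axis in the camera frame, i.e. one column of R up to sign).
Application of estimating a projective transformation from 4 points: Virtual Billboard
p' ~ Ap means lambda*p' = Ap for some nonzero scalar lambda.
Horizon: orthogonal to the normal vector of the projection plane, and generated by the vanishing points in the image plane.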

Be clear about the definition of a vanishing point.
Cross Ratios and Single View Metrology
cross ratio (also called the double ratio)
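A quick numeric check that the cross ratio of four collinear points is preserved by a projective transformation of the line. The ordering convention used here, (c-a)(d-b) / ((c-b)(d-a)), is one of several in use, and the map x -> (2x+1)/(x+3) is an arbitrary assumption for illustration.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio of four collinear points given by 1D coordinates.
    Convention (one of several): (c-a)(d-b) / ((c-b)(d-a))."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def projective_map_1d(x, m):
    """Apply the 1D projective map x -> (p*x + q) / (r*x + s)."""
    (p, q), (r, s) = m
    return (p * x + q) / (r * x + s)

points = [0.0, 1.0, 2.0, 3.0]
m = ((2.0, 1.0), (1.0, 3.0))   # arbitrary invertible 2x2 matrix (det = 5)
mapped = [projective_map_1d(x, m) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*mapped)
print(before, after)           # equal up to rounding
```

Distances and ratios of distances both change under the map; only the ratio of ratios survives, which is what makes it usable for single-view measurements.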
Assignment
Theoretical derivation of why four pairs of matched points determine a homography:
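A minimal sketch of the counting argument: H has 9 entries but is defined only up to scale, so 8 unknowns remain, and each correspondence (x, y) -> (u, v) contributes 2 linear equations. This sketch fixes h33 = 1 and solves the resulting 8x8 system directly, which is a simplification of the SVD null-space formulation used in the assignment and fails if the true h33 is 0. All function names and the test homography are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration (e.g. three collinear points)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def homography_from_4_points(src, dst):
    """Estimate H (with h33 fixed to 1) from four point correspondences."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), rearranged to be linear in h.
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y]); b.append(u)
        # v = (h4 x + h5 y + h6) / (h7 x + h8 y + 1), likewise.
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y]); b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_h(H, p):
    """Map a 2D point through H and dehomogenize."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Synthetic check: generate four correspondences from a known H and recover it.
H_true = [[2.0, 0.5, 10.0], [0.3, 1.5, 20.0], [0.001, 0.002, 1.0]]
src = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
dst = [apply_h(H_true, p) for p in src]
H_est = homography_from_4_points(src, dst)
print(H_est)
```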

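Reason: https://www.coursera.org/learn/robotics-perception/discussions/weeks/2/threads/lO9fEJ9kEeagbRJODex4yg
In MATLAB's [U,S,V] = svd(A), the right singular vectors are the columns of V (A = U*S*V'), whereas our derivation defines them as rows, so the solution is the last column of V, not the last row.
The inpolygon function selects all points in a 2D region that lie inside the polygon defined by the given vertices.
meshgrid generates the coordinates of every point in a 2D range (very handy for iterating over a grid).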
Week 3: Pose Estimation
Features
Scale Invariant
DoG: Difference of Gaussians, computed by subtracting Gaussian-blurred versions of the image at adjacent scales; it approximates the Laplacian of Gaussian.
Laplacian scale space: highlights regions of high contrast. How is it generated?
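A 1D sketch of how one level of such a scale space can be generated, assuming the common convention D = L(k*sigma) - L(sigma), where L is the Gaussian-blurred signal. The signal, sigma, k, and kernel radius are illustrative assumptions; a real detector works on 2D images across several octaves.

```python
import math

def gaussian_kernel(sigma, radius):
    """Sampled, normalized 1D Gaussian with 2*radius + 1 taps."""
    w = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def blur(signal, kernel):
    """Filter a 1D signal with a symmetric kernel, replicating the borders."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, kv in enumerate(kernel):
            idx = min(max(i + j - r, 0), n - 1)
            acc += kv * signal[idx]
        out.append(acc)
    return out

# A bright 5-sample blob centred at index 30 on a dark background.
signal = [1.0 if 28 <= i <= 32 else 0.0 for i in range(61)]

sigma, k, radius = 2.0, 1.6, 12
L_fine = blur(signal, gaussian_kernel(sigma, radius))
L_coarse = blur(signal, gaussian_kernel(k * sigma, radius))

# Difference of Gaussians: adjacent scales subtracted.
dog = [c - f for c, f in zip(L_coarse, L_fine)]
print(dog[30], dog[5])
```

The response is strongest (most negative for a bright blob under this sign convention) at the blob centre and vanishes in flat regions, which is why extrema of the DoG across position and scale are used as keypoint candidates.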
Rotation Invariant
Compute Image Gradient
descriptor: histogram of gradient orientation
Singular Value Decomposition
Example: Mondrian color-block pattern

Least Square Estimation


RANSAC

Pose Estimation



Week 4: Multi-View Geometry
https://en.wikipedia.org/wiki/Essential_matrix
It turns out, however, that only one of the four classes of solutions can be realized in practice. Given a pair of corresponding image coordinates, three of the solutions will always produce a 3D point which lies behind at least one of the two cameras and therefore cannot be seen. Only one of the four classes will consistently produce 3D points which are in front of both cameras. This must then be the correct solution. Still, however, it has an undetermined positive scaling related to the translation component.
Of the four candidate solutions obtained from the estimate, only one is feasible.
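A sketch of the depth test that picks the feasible candidate, assuming normalized image coordinates and the convention lam2*x2 = lam1*R*x1 + t, where lam1 and lam2 are the depths of the point in camera 1 and camera 2. The four candidates are built by hand here for a toy scene (true R = I, t = (-1, 0, 0)); the function names are hypothetical.

```python
def matvec(R, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def depths(R, t, x1, x2):
    """Least-squares depths (lam1, lam2) satisfying lam2*x2 = lam1*R*x1 + t."""
    a = matvec(R, x1)                       # first ray expressed in camera 2
    # Normal equations for [x2, -a] [lam2, lam1]^T = t.
    m11, m12, m22 = dot(x2, x2), -dot(x2, a), dot(a, a)
    b1, b2 = dot(x2, t), -dot(a, t)
    det = m11 * m22 - m12 * m12             # zero only if the rays are parallel
    lam2 = (b1 * m22 - m12 * b2) / det
    lam1 = (m11 * b2 - m12 * b1) / det
    return lam1, lam2

# Toy scene: point (0, 0, 5) in camera 1; camera 2 sees it at (-1, 0, 5).
x1 = [0.0, 0.0, 1.0]
x2 = [-0.2, 0.0, 1.0]

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R_twist = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]  # 180 deg about the baseline
t = [-1.0, 0.0, 0.0]
t_neg = [1.0, 0.0, 0.0]
candidates = [(I3, t), (I3, t_neg), (R_twist, t), (R_twist, t_neg)]

# Keep only candidates that place the point in front of both cameras.
feasible = [i for i, (R, tt) in enumerate(candidates)
            if all(lam > 0 for lam in depths(R, tt, x1, x2))]
print(feasible)
```

In practice the test is applied to many correspondences and the candidate with the most points in front of both cameras is kept; the scale of t remains undetermined.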
