Deep Learning on 3D Data

Volumetric CNNs / Multi-view CNNs / Spectral CNNs / Feature-based DNNs

Point cloud analysis

Point cloud: N unordered points, each represented by a D-dimensional coordinate.

Properties

- Unordered → the network needs to be invariant to all N! permutations of the input set
- Interaction among points → the network needs to capture local structures from nearby points
- Invariance under transformations

Properties of a desired neural network on point clouds
- Permutation invariance
  - Examples: f(x₁, x₂, ..., x_n) = max{x₁, x₂, ..., x_n}
    f(x₁, x₂, ..., x_n) = x₁ + x₂ + ... + x_n
- Transformation invariance
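
The permutation-invariance examples above (max and sum) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the names `f_max`, `f_sum`, and `f_first` are illustrative only and do not come from the PointNet code. `f_first` is included as a counterexample that depends on point order.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))   # 5 points, 3 coordinates each
reordered = points[::-1]           # the same set of points in reverse order

def f_max(x):
    # Coordinate-wise maximum over all points: symmetric.
    return x.max(axis=0)

def f_sum(x):
    # Coordinate-wise sum over all points: symmetric.
    return x.sum(axis=0)

def f_first(x):
    # Returns the first point only: depends on the order, so not symmetric.
    return x[0]

max_invariant = np.allclose(f_max(points), f_max(reordered))
sum_invariant = np.allclose(f_sum(points), f_sum(reordered))
first_invariant = np.allclose(f_first(points), f_first(reordered))
```

Here `max_invariant` and `sum_invariant` hold for any reordering, while `first_invariant` fails as soon as the first point changes.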

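Permutation invariance: constructing a symmetric function

The network structure is generally: feature extraction → feature mapping → feature map compression (dimensionality reduction) → fully connected layers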

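Observe: f(x₁, x₂, ..., x_n) = γ ∘ g(h(x₁), ..., h(x_n)) is symmetric if g is symmetric. Here x denotes a point of the point cloud, h the feature extraction layer, g the symmetric function, and γ the higher-level feature extraction, which is followed by a softmax classifier.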

In PointNet, the feature extraction layer h is implemented as an MLP, and g is implemented by max pooling.
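
To make the composition concrete, here is a minimal NumPy sketch of a per-point feature extractor h, a max-pooling aggregator g, and a final mapping followed by softmax (called `gamma` here). The layer sizes (64 features, 10 classes), the single-layer form of h, and the random weights are assumptions chosen for illustration. This is not the actual PointNet implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes and random weights, for illustration only.
W_h = rng.normal(size=(3, 64))     # shared per-point layer h: R^3 -> R^64
b_h = np.zeros(64)
W_out = rng.normal(size=(64, 10))  # final mapping: global feature -> 10 scores
b_out = np.zeros(10)

def h(points):
    """Per-point feature extraction: the same weights are applied to every point."""
    return np.maximum(points @ W_h + b_h, 0.0)  # linear layer + ReLU

def g(features):
    """Symmetric aggregation: max pooling over the point axis."""
    return features.max(axis=0)

def gamma(global_feature):
    """Map the pooled global feature to class probabilities via softmax."""
    logits = global_feature @ W_out + b_out
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

def f(points):
    return gamma(g(h(points)))

cloud = rng.normal(size=(1024, 3))
permuted = cloud[rng.permutation(len(cloud))]
probs = f(cloud)
probs_permuted = f(permuted)
```

Because max pooling ignores the order of its inputs, `probs` and `probs_permuted` agree even though the points are fed in a different order.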

Q: What symmetric functions can be constructed by PointNet?
A: PointNet is a universal approximator of continuous symmetric functions.

Theorem: A symmetric function f : 2^X → ℝ that is continuous with respect to the Hausdorff distance can be arbitrarily approximated by PointNet.

PointNet Architecture

Experiment

3D Object Classification

1. ModelNet40 shape classification benchmark: 12,311 CAD models from 40 man-made object categories, split into 9,843 for training and 2,468 for testing.

2. Sample 1024 points and normalize them into a unit sphere.

3. Augment the point cloud on the fly by randomly rotating the object about the up-axis and jittering the position of each point with Gaussian noise of zero mean and 0.02 standard deviation.
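
The preprocessing and augmentation steps above can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: points are drawn from an existing point array rather than from a mesh surface, the y-axis is taken as the up-axis, and all function names are hypothetical.

```python
import numpy as np

def sample_points(points, n=1024, rng=None):
    """Randomly pick n points (assumption: sampling from an existing point array)."""
    if rng is None:
        rng = np.random.default_rng()
    replace = len(points) < n  # sample with replacement only if too few points
    idx = rng.choice(len(points), size=n, replace=replace)
    return points[idx]

def normalize_to_unit_sphere(points):
    """Center at the centroid and scale so the farthest point has norm 1."""
    centered = points - points.mean(axis=0)
    return centered / np.linalg.norm(centered, axis=1).max()

def rotate_about_up_axis(points, rng):
    """Rotate by a random angle about the y-axis (assumed to be the up-axis)."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, 0.0, s],
                         [0.0, 1.0, 0.0],
                         [-s, 0.0, c]])
    return points @ rotation.T

def jitter(points, rng, sigma=0.02):
    """Add zero-mean Gaussian noise with standard deviation sigma to every coordinate."""
    return points + rng.normal(0.0, sigma, size=points.shape)

rng = np.random.default_rng(0)
raw = rng.uniform(-5.0, 5.0, size=(5000, 3))  # stand-in for points of a CAD model
normalized = normalize_to_unit_sphere(sample_points(raw, 1024, rng))
rotated = rotate_about_up_axis(normalized, rng)
augmented = jitter(rotated, rng)
```

Sampling and normalization would run once per model, while rotation and jitter would be redrawn each time the model is used in training.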

With only fully connected layers and max pooling, PointNet achieves state-of-the-art performance among methods based on 3D input (volumetric and point cloud).

The small gap to the multi-view based method (MVCNN) may be due to the loss of fine geometric details.

3D Object Part Segmentation

1. The ShapeNet part data set contains 16,881 shapes from 16 categories, annotated with 50 parts in total.

2. Evaluation metric: mean intersection over union (mIoU)
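
As an illustration of the metric, the sketch below computes the mIoU of one shape from per-point part labels. The function name `shape_miou` is hypothetical, and counting a part that is absent from both prediction and ground truth as IoU 1 is a convention assumed here, not something stated in these notes.

```python
import numpy as np

def shape_miou(pred, target, part_ids):
    """Mean IoU over the given part labels for a single shape."""
    ious = []
    for part in part_ids:
        pred_mask = pred == part
        target_mask = target == part
        union = np.logical_or(pred_mask, target_mask).sum()
        if union == 0:
            # Assumed convention: a part absent from both counts as IoU 1.
            ious.append(1.0)
        else:
            intersection = np.logical_and(pred_mask, target_mask).sum()
            ious.append(intersection / union)
    return float(np.mean(ious))

target = np.array([0, 0, 1, 1, 2, 2])  # ground-truth part label per point
pred = np.array([0, 0, 1, 2, 2, 2])    # predicted part label per point
score = shape_miou(pred, target, part_ids=[0, 1, 2])
```

In this toy example the per-part IoUs are 1, 1/2, and 2/3, so `score` is their mean, about 0.72.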

Semantic Segmentation in Scenes

1. Stanford 3D semantic parsing data set

2. Each point is represented by a 9-dim vector of XYZ, RGB, and location normalized to the room (from 0 to 1).
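
One way to assemble such a 9-dim vector is sketched below, assuming `xyz` and `rgb` arrays for all points of a single room. The function name and the min/max normalization are assumptions made for illustration; the original data pipeline may differ, for example in how rooms are split into blocks or how colors are scaled.

```python
import numpy as np

def build_point_features(xyz, rgb):
    """Concatenate XYZ, RGB, and room-normalized location into an (N, 9) array."""
    room_min = xyz.min(axis=0)
    room_extent = xyz.max(axis=0) - room_min
    # Guard against a degenerate room that is flat along some axis.
    room_extent = np.where(room_extent > 0, room_extent, 1.0)
    normalized_location = (xyz - room_min) / room_extent  # each axis in [0, 1]
    return np.concatenate([xyz, rgb, normalized_location], axis=1)

rng = np.random.default_rng(0)
xyz = rng.uniform([0.0, 0.0, 0.0], [10.0, 8.0, 3.0], size=(500, 3))  # toy room
rgb = rng.uniform(0.0, 1.0, size=(500, 3))                           # toy colors
features = build_point_features(xyz, rgb)
```

The last three columns of `features` give each point's relative position inside the room, independent of the room's absolute size.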

Code Analysis

T-Net: composed of point-independent feature extraction, max pooling, and fully connected layers
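
The three stages of T-Net can be sketched in NumPy as follows. T-Net predicts a transformation matrix that is applied to the input points. The layer sizes, the random weights, and the identity bias on the output layer are assumptions for illustration and are not taken from the original code.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3  # dimension of the input points; the predicted transform is K x K

# Hypothetical weights; a trained T-Net would learn these.
W_point = rng.normal(scale=0.1, size=(K, 64))    # per-point feature layer
W_fc = rng.normal(scale=0.1, size=(64, 128))     # fully connected layer
W_out = rng.normal(scale=0.01, size=(128, K * K))
b_out = np.eye(K).ravel()  # assumption: bias the output toward the identity

def relu(x):
    return np.maximum(x, 0.0)

def t_net(points):
    per_point = relu(points @ W_point)  # point-independent feature extraction
    pooled = per_point.max(axis=0)      # max pooling over all points
    hidden = relu(pooled @ W_fc)        # fully connected layer
    return (hidden @ W_out + b_out).reshape(K, K)

cloud = rng.normal(size=(1024, K))
transform = t_net(cloud)
aligned = cloud @ transform             # apply the predicted transform
transform_permuted = t_net(cloud[rng.permutation(len(cloud))])
```

Because the per-point layer is followed by max pooling, the predicted matrix does not depend on the order of the input points.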
