PointNet++: classification/train.py

Reposted article (original dated 2019-11-15 21:09). Tag: pointnet++

1. Loading the dataset

if FLAGS.normal:
    assert(NUM_POINT<=10000)
    DATA_PATH = os.path.join(ROOT_DIR, 'data/modelnet40_normal_resampled')
    TRAIN_DATASET = modelnet_dataset.ModelNetDataset(root=DATA_PATH, npoints=NUM_POINT, split='train', normal_channel=FLAGS.normal, batch_size=BATCH_SIZE)
    TEST_DATASET = modelnet_dataset.ModelNetDataset(root=DATA_PATH, npoints=NUM_POINT, split='test', normal_channel=FLAGS.normal, batch_size=BATCH_SIZE)
else:
    assert(NUM_POINT<=2048)
    TRAIN_DATASET = modelnet_h5_dataset.ModelNetH5Dataset(os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/train_files.txt'), batch_size=BATCH_SIZE, npoints=NUM_POINT, shuffle=True)
    TEST_DATASET = modelnet_h5_dataset.ModelNetH5Dataset(os.path.join(BASE_DIR, 'data/modelnet40_ply_hdf5_2048/test_files.txt'), batch_size=BATCH_SIZE, npoints=NUM_POINT, shuffle=False)

The training data (TRAIN_DATASET) consists of five .h5 files:

data/modelnet40_ply_hdf5_2048/ply_data_train0.h5
data/modelnet40_ply_hdf5_2048/ply_data_train1.h5
data/modelnet40_ply_hdf5_2048/ply_data_train2.h5
data/modelnet40_ply_hdf5_2048/ply_data_train3.h5
data/modelnet40_ply_hdf5_2048/ply_data_train4.h5

Before training, the order of the five training files is shuffled:

if self.shuffle: np.random.shuffle(self.file_idxs)

The test data (TEST_DATASET) consists of two .h5 files:

data/modelnet40_ply_hdf5_2048/ply_data_test0.h5
data/modelnet40_ply_hdf5_2048/ply_data_test1.h5

The key step in loading the dataset is splitting it into batches: 2048*2048*3 ----> 16*1024*3, 16*1024*3, 16*1024*3, ...

Note: the order of the 2048 objects is shuffled first.

In modelnet_h5_dataset.py:

        # Key statement: from the 2048 already-shuffled objects of one .h5 file, take 16 objects,
        # and from each object's 2048 points take the first 1024.
        data_batch = self.current_data[start_idx:end_idx, 0:self.npoints, :].copy()
        label_batch = self.current_label[start_idx:end_idx].copy()

self.npoints=1024

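The first 1024 points of each object are taken in order. (Surprisingly, these 1024 points turn out to be spread quite evenly over the object; I do not know why.)

Note: the 1024 points of each object are shuffled before training.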
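The batching described above can be sketched in NumPy. This is a minimal sketch, not the repository's code: the function name iterate_batches and its arguments are hypothetical, and applying one point permutation per batch is an assumption made for brevity.

```python
import numpy as np

def iterate_batches(data, labels, batch_size=16, npoints=1024, rng=None):
    """Yield (data_batch, label_batch) pairs from the arrays of one .h5 file.

    data:   (num_objects, points_per_object, 3)
    labels: (num_objects,)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Shuffle the objects; indexing both arrays with the same order keeps
    # every object paired with its label.
    order = rng.permutation(len(data))
    data, labels = data[order], labels[order]
    for start in range(0, len(data), batch_size):
        end = min(start + batch_size, len(data))
        # Take the first `npoints` points of each object in the batch.
        batch = data[start:end, :npoints, :].copy()
        # Shuffle the point order (assumption: one permutation per batch).
        batch = batch[:, rng.permutation(batch.shape[1]), :]
        yield batch, labels[start:end].copy()
```

With an input of shape (2048, 2048, 3) this yields 128 batches of shape (16, 1024, 3).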

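A. Following 2048*2048*3 ----> 16*1024*3, save the first 16 objects of the 16*1024*3 batch to .txt files and the first 16 objects of the 2048*2048*3 array to .txt files, then compare the two in CloudCompare to see how a point cloud object differs before and after downsampling.

B. The following code saves the 16 objects after downsampling (16*1024*3).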

for i in range(data_batch.shape[0]):
    filename=''.join(["/media/dell/D/qcc/code/pointnet/code/pointnet2-master/data/contemporaryfile/train_",str(i),'.txt'])
    np.savetxt(filename, data_batch[i],fmt="%.13f,%.13f,%.13f", delimiter=',')

With labels:

for i in range(data_batch.shape[0]):
    filename=''.join(["/media/dell/D/qcc/code/pointnet/code/pointnet2-master/data/contemporaryfile/train_",str(i),'.txt'])
    traindata_and_label = np.column_stack((data_batch[i], np.ones((1024, 1), dtype=int) * label_batch[i]))  # np.column_stack joins the two arrays column-wise
    np.savetxt(filename, traindata_and_label,fmt="%.13f,%.13f,%.13f,%d", delimiter=',')

C. The following code saves the 16 objects before downsampling (16*2048*3).

for i in range(16):
    filename=''.join(["/media/dell/D/qcc/code/pointnet/code/pointnet2-master/data/contemporaryfile/initial_train_",str(i),'.txt'])
    np.savetxt(filename, self.current_data[i],fmt="%.13f,%.13f,%.13f", delimiter=',')

With labels:

for i in range(16):
    filename=''.join(["/media/dell/D/qcc/code/pointnet/code/pointnet2-master/data/contemporaryfile/initial_train_",str(i),'.txt'])
    traindata_and_label = np.column_stack((self.current_data[i], np.ones((2048, 1), dtype=int) * self.current_label[i]))  # np.column_stack joins the two arrays column-wise
    np.savetxt(filename, traindata_and_label,fmt="%.13f,%.13f,%.13f,%d", delimiter=',')

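D. Comparison.

The first object looks like a recliner and the second is a piano. I do not yet know how the points were sampled, but the sampling looks very uniform.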

--------------------------------------------------------------------------------

# Building your own training dataset is considered here. #

--------------------------------------------------------------------------------

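Each .h5 training or test file contains up to 2048 objects; each object contains 2048 points, and each point has three coordinates x, y, z.

Before training, the objects in a file are randomly shuffled; after shuffling, each object still corresponds to its label.

The steps for building the .h5 training and test files are:

Run the MATLAB script ready_for_make_hdf5.m to write each individual marking-line point cloud object to a file.
Run the Python script putfilenamesintofile.py to store the names of the training and test files in one file.
Run the Python script make_hdf5_c.py to build the .h5 files.
Run the Python script putfilenamesintofile.py to write the names of the .h5 files to one file.
Run the training script: train.py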
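The step that writes the .h5 file names into a list file could look roughly like the sketch below. The author's putfilenamesintofile.py is not shown in this post, so the function name write_file_list and its arguments are hypothetical; the only assumption taken from the code above is that train_files.txt and test_files.txt hold one file path per line.

```python
from pathlib import Path

def write_file_list(data_dir, pattern, list_name):
    """Write the paths of all files in data_dir matching pattern, one per line."""
    data_dir = Path(data_dir)
    # Sort so that the list is reproducible across runs.
    names = sorted(p.name for p in data_dir.glob(pattern))
    lines = [f"{data_dir.as_posix()}/{name}\n" for name in names]
    (data_dir / list_name).write_text("".join(lines))
    return names
```

For example, write_file_list('data/modelnet40_ply_hdf5_2048', 'ply_data_train*.h5', 'train_files.txt') would list the five training files.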

2. Loading the training model

pointnet2_cls_ssg.py

    l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=512, radius=0.2, nsample=32, mlp=[64,64,128], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1', use_nchw=True) #a
    l2_xyz, l2_points, l2_indices = pointnet_sa_module(l1_xyz, l1_points, npoint=128, radius=0.4, nsample=64, mlp=[128,128,256], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer2') #b
    l3_xyz, l3_points, l3_indices = pointnet_sa_module(l2_xyz, l2_points, npoint=None, radius=None, nsample=None, mlp=[256,512,1024], mlp2=None, group_all=True, is_training=is_training, bn_decay=bn_decay, scope='layer3') #c

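l0_xyz: (16, 1024, 3), the initial input point cloud: 16 objects, each with 1024 points, each point with x, y, z coordinates.

npoint=512: 512 centroids are selected from the 1024 points by farthest point sampling.

radius=0.2: the radius of the spherical neighbourhood used for grouping is 0.2. This is in the coordinate units of the input; the ModelNet40 objects are normalized to the unit sphere, so it is not a length in metres.

nsample=32: 32 points are grouped around each centroid.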
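Farthest point sampling, as used for npoint=512, can be sketched in NumPy for a single object. This is an illustrative sketch and not the repository's CUDA op; the function name and the start argument are assumptions.

```python
import numpy as np

def farthest_point_sample(xyz, npoint, start=0):
    """Select npoint indices from xyz (N, 3) by farthest point sampling."""
    n = xyz.shape[0]
    selected = np.empty(npoint, dtype=np.int64)
    # Squared distance from every point to its nearest already-selected point.
    min_dist = np.full(n, np.inf)
    current = start
    for i in range(npoint):
        selected[i] = current
        dist = np.sum((xyz - xyz[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist)
        # The next centroid is the point farthest from all selected points.
        current = int(np.argmax(min_dist))
    return selected
```

Calling farthest_point_sample(points, 512) on one (1024, 3) object returns 512 indices; points[indices] then plays the role of l1_xyz for that object.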

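Return values:

l1_xyz: (16, 512, 3), the point cloud fed to the second layer. The first layer selects 512 centroids; 3 is the x, y, z coordinates of each centroid.
l1_points: (16, 512, 128), the local point region features extracted by the first layer: 512 groups, each with a 128-dimensional feature of its local region.
l1_indices: (16, 512, 32): 512 groups with 32 members each; the 32 values are the indices of those 32 points.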
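The grouping step that produces l1_indices can be sketched in NumPy for one object. This is a sketch under assumptions, not the repository's CUDA op: it keeps the first nsample points found inside the ball in index order and pads short groups by repeating the first hit.

```python
import numpy as np

def query_ball_point(radius, nsample, xyz, centroids):
    """Return idx (S, nsample) and grouped_xyz (S, nsample, 3).

    xyz:       (N, 3) all points of one object
    centroids: (S, 3) centroids chosen by the sampling step
    """
    # Squared distance between every centroid and every point: (S, N).
    d2 = np.sum((centroids[:, None, :] - xyz[None, :, :]) ** 2, axis=-1)
    idx = np.zeros((centroids.shape[0], nsample), dtype=np.int64)
    for s in range(centroids.shape[0]):
        inside = np.flatnonzero(d2[s] < radius ** 2)[:nsample]
        if inside.size == 0:
            continue  # no neighbour found: indices stay 0
        idx[s, :inside.size] = inside
        idx[s, inside.size:] = inside[0]  # pad with the first hit
    # Coordinates relative to the centroid of each group.
    grouped_xyz = xyz[idx] - centroids[:, None, :]
    return idx, grouped_xyz
```

With 512 centroids and nsample=32 the outputs have shapes (512, 32) and (512, 32, 3), which match l1_indices and grouped_xyz above without the batch dimension.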

# Sample and Grouping layer
        if group_all:
            nsample = xyz.get_shape()[1].value
            new_xyz, new_points, idx, grouped_xyz = sample_and_group_all(xyz, points, use_xyz)
        else:
            new_xyz, new_points, idx, grouped_xyz = sample_and_group(npoint, radius, nsample, xyz, points, knn, use_xyz)
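        # Returns the centroids (new_xyz), the local features of each group (new_points)
        # and the indices of each group's members (idx).
        # new_xyz is the result of farthest point sampling: 16*512*3.
        # idx holds the indices of the points found by the ball query (r=0.2).
        # grouped_xyz: 16*512*32*3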

        # Point Feature Embedding layer
        if use_nchw: new_points = tf.transpose(new_points, [0,3,1,2])
        for i, num_out_channel in enumerate(mlp):
            new_points = tf_util.conv2d(new_points, num_out_channel, [1,1],
                                        padding='VALID', stride=[1,1],
                                        bn=bn, is_training=is_training,
                                        scope='conv%d'%(i), bn_decay=bn_decay,
                                        data_format=data_format) 
        if use_nchw: new_points = tf.transpose(new_points, [0,2,3,1])
        # PointNet layer: convolution layers that extract features from new_points


        # Pooling in Local Regions
        # Pool the features of each group to get the local point feature of each centroid
        if pooling=='max':
            new_points = tf.reduce_max(new_points, axis=[2], keep_dims=True, name='maxpool')
  
        elif pooling=='avg':
            new_points = tf.reduce_mean(new_points, axis=[2], keep_dims=True, name='avgpool')
        elif pooling=='weighted_avg':
            with tf.variable_scope('weighted_avg'):
                dists = tf.norm(grouped_xyz,axis=-1,ord=2,keep_dims=True)
                exp_dists = tf.exp(-dists * 5)
                weights = exp_dists/tf.reduce_sum(exp_dists,axis=2,keep_dims=True) # (batch_size, npoint, nsample, 1)
                new_points *= weights # (batch_size, npoint, nsample, mlp[-1])
                new_points = tf.reduce_sum(new_points, axis=2, keep_dims=True)
        elif pooling=='max_and_avg':
            max_points = tf.reduce_max(new_points, axis=[2], keep_dims=True, name='maxpool')
            avg_points = tf.reduce_mean(new_points, axis=[2], keep_dims=True, name='avgpool')
            new_points = tf.concat([avg_points, max_points], axis=-1)

        # [Optional] Further Processing 
        if mlp2 is not None:
            if use_nchw: new_points = tf.transpose(new_points, [0,3,1,2])
            for i, num_out_channel in enumerate(mlp2):
                new_points = tf_util.conv2d(new_points, num_out_channel, [1,1],
                                            padding='VALID', stride=[1,1],
                                            bn=bn, is_training=is_training,
                                            scope='conv_post_%d'%(i), bn_decay=bn_decay,
                                            data_format=data_format) 
            if use_nchw: new_points = tf.transpose(new_points, [0,2,3,1])

        new_points = tf.squeeze(new_points, [2]) # (batch_size, npoints, mlp2[-1])
        return new_xyz, new_points, idx

The annotated code above is adapted from: https://zhuanlan.zhihu.com/p/57761392
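The pooling branches in the listing can be mirrored in NumPy to make the shapes concrete. This is a sketch that follows the formulas in the TensorFlow code above; the function name pool_groups is hypothetical, and the result is returned with the group axis already removed.

```python
import numpy as np

def pool_groups(new_points, grouped_xyz=None, pooling="max"):
    """Reduce (B, npoint, nsample, C) group features to one vector per centroid.

    grouped_xyz: (B, npoint, nsample, 3) coordinates relative to each
    centroid, only needed for 'weighted_avg'.
    """
    if pooling == "max":
        return new_points.max(axis=2)
    if pooling == "avg":
        return new_points.mean(axis=2)
    if pooling == "weighted_avg":
        # Points closer to the centroid get larger weights.
        dists = np.linalg.norm(grouped_xyz, axis=-1, keepdims=True)
        weights = np.exp(-5.0 * dists)
        weights = weights / weights.sum(axis=2, keepdims=True)
        return (new_points * weights).sum(axis=2)
    if pooling == "max_and_avg":
        return np.concatenate(
            [new_points.mean(axis=2), new_points.max(axis=2)], axis=-1)
    raise ValueError(f"unknown pooling: {pooling}")
```

For layer 1, an input of shape (16, 512, 32, 128) with pooling='max' gives (16, 512, 128), the shape of l1_points.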

0============================================================0

The content between the two horizontal rules comes from: https://zhuanlan.zhihu.com/p/57761392

0============================================================0

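Explanation of the SA (set abstraction) layer:

1. Improved feature extraction: PointNet++ extracts features hierarchically, and each level of the hierarchy is called a set abstraction. A set abstraction has three parts: a sampling layer, a grouping layer and a feature extraction layer. The sampling layer picks relatively important centroids from the dense point cloud using FPS (farthest point sampling); these points do not necessarily carry semantic meaning, and random sampling could be used instead. The grouping layer then finds the k nearest neighbours within a given range of each centroid selected by the sampling layer and collects them into a group. The feature extraction layer passes the k points of each group through a small PointNet (convolution followed by pooling) and uses the result as the feature of that centroid, which is then fed to the next level. As a result, the centroids at each level are a subset of the centroids of the previous level; as the network gets deeper the number of centroids shrinks, while each centroid carries more information.

2. Handling varying point density: because the sampling density is uneven at acquisition time, taking a fixed number of neighbours within a fixed range is not appropriate. PointNet++ proposes two solutions.

2.1. Multi-scale grouping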

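As shown on the left of the figure (panel (a)), every grouping layer forms its groups at several scales (several radius values); each scale is passed through a PointNet, and the resulting features are concatenated into the new feature.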
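Multi-scale grouping can be illustrated with a small NumPy sketch for one object. Everything here is an assumption made for illustration: the helper names, the single random-weight layer standing in for each trained PointNet, and the (radius, nsample, channels) values in the usage note.

```python
import numpy as np

def ball_group(xyz, centroids, radius, nsample):
    """Group up to nsample points within radius of each centroid: (S, nsample, 3)."""
    d2 = ((centroids[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
    idx = np.zeros((len(centroids), nsample), dtype=np.int64)
    for s, row in enumerate(d2):
        inside = np.flatnonzero(row < radius ** 2)[:nsample]
        if inside.size:
            idx[s, :inside.size] = inside
            idx[s, inside.size:] = inside[0]  # pad short groups
    return xyz[idx] - centroids[:, None, :]

def msg_features(xyz, centroids, scales, rng):
    """Concatenate per-scale features; scales is a list of (radius, nsample, channels)."""
    outputs = []
    for radius, nsample, channels in scales:
        grouped = ball_group(xyz, centroids, radius, nsample)
        # Random weights stand in for a trained shared MLP (assumption).
        weights = rng.standard_normal((3, channels))
        feats = np.maximum(grouped @ weights, 0.0)  # shared layer + ReLU
        outputs.append(feats.max(axis=1))           # max-pool over each group
    return np.concatenate(outputs, axis=-1)         # (S, sum of channels)
```

With scales [(0.1, 16, 32), (0.2, 32, 64), (0.4, 128, 64)] every centroid gets a 32 + 64 + 64 = 160-dimensional feature.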

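2.2. Multi-resolution grouping

As shown on the right of the figure (panel (b)). The left feature vector is obtained after two set abstraction steps, each with a different radius. The right feature vector is obtained by applying a PointNet directly to all points in the current region. When the point density is uneven, the two vectors can be given different weights according to the density of the current patch. For example, when the patch is sparse, the information in the left vector is less reliable than the features extracted from all points of the patch, so the weight of the right vector is increased. This handles the density problem while also reducing the amount of computation.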

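I. Classification task

See the lower branch of the network.

Hierarchical feature extraction layer: the set abstraction layer

It consists of three parts:

1. sample layer: selects the important centroids (using farthest point sampling)
2. group layer: finds the nearby points of each centroid and forms a local points region (the text here says kNN; the code shown in this post uses ball query, and kNN is only used when the knn argument is set)
3. pointnet layer: extracts features from each local points region

In this way, the centroids at each level are a subset of the centroids of the previous level; as the network gets deeper the number of centroids shrinks, while each centroid carries more information.

Now for the concrete implementation. The parameters used here correspond to SSG (single-scale grouping); what the authors mainly propose in the paper is MSG (multi-scale grouping), and the difference lies mostly in the parameter settings. See the comments for explanations.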

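Point cloud convolution:

Input: (16, 3, 512, 32)

Output: (16, 64, 512, 32)

(a): Multi-scale grouping: the local features extracted at different scales are concatenated.

(b): Multi-resolution grouping: on the left, a number of centroids are sampled from the input point cloud (by farthest point sampling); on the right, a group of points (for example 32) is sampled within a neighbourhood around each centroid.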

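    # Set abstraction layers: each module samples centroids, finds their neighbourhoods,
    # extracts features with a shared MLP built from three 1*1 convolutions, then pools and outputs.
    # Note: When using NCHW for layer 2, we see increased GPU memory usage (in TF1.4).
    # So we only use NCHW for layer 1 until this issue can be resolved.
    # In total, 9 MLP layers are used for feature extraction.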
    l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=512, radius=0.2, nsample=32, mlp=[64,64,128], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1', use_nchw=True)
    l2_xyz, l2_points, l2_indices = pointnet_sa_module(l1_xyz, l1_points, npoint=128, radius=0.4, nsample=64, mlp=[128,128,256], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer2') #b
    l3_xyz, l3_points, l3_indices = pointnet_sa_module(l2_xyz, l2_points, npoint=None, radius=None, nsample=None, mlp=[256,512,1024], mlp2=None, group_all=True, is_training=is_training, bn_decay=bn_decay, scope='layer3') #c

l2_xyz: <tf.Tensor 'layer2/GatherPoint:0' shape=(16, 128, 3) dtype=float32>: 16 objects, 128 centroids per object, each centroid with x, y, z coordinates.
l2_points: <tf.Tensor 'layer2/Squeeze:0' shape=(16, 128, 256) dtype=float32>: 16 objects, 128 centroids per object; 256 is the length of the feature vector of each local region.
l2_indices: <tf.Tensor 'layer2/QueryBallPoint:0' shape=(16, 128, 64) dtype=int32>: 16 objects, 128 centroids per object, 64 points grouped around each centroid; the 64 values are point indices.

c.
(<tf.Tensor 'layer3/Const:0' shape=(16, 1, 3) dtype=float32>,
<tf.Tensor 'layer3/Squeeze:0' shape=(16, 1, 1024) dtype=float32>,
<tf.Tensor 'layer3/Const_1:0' shape=(16, 1, 128) dtype=int64>)

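l3_xyz: 16 objects, 1 centroid per object, each centroid with x, y, z coordinates.

l3_points: 16 objects, 1 centroid per object, each centroid with a 1024-dimensional feature vector.

l3_indices: 16 objects, 1 centroid per object, with all 128 points of the previous layer grouped around it; the 128 values are point indices.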
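The shapes listed for the three layers follow directly from npoint, nsample, the last entry of mlp and group_all. A small pure-Python sketch (the function name and the dict layout are hypothetical) reproduces them:

```python
def sa_shapes(batch, in_points, layers):
    """Compute the (xyz, points, indices) shapes produced by each SA layer."""
    shapes = []
    n = in_points
    for layer in layers:
        if layer["group_all"]:
            # One group that contains every point of the previous layer.
            npoint, nsample = 1, n
        else:
            npoint, nsample = layer["npoint"], layer["nsample"]
        shapes.append({
            "xyz": (batch, npoint, 3),
            "points": (batch, npoint, layer["mlp"][-1]),
            "indices": (batch, npoint, nsample),
        })
        n = npoint  # the centroids become the input points of the next layer
    return shapes

ssg_layers = [
    {"npoint": 512, "nsample": 32, "mlp": [64, 64, 128], "group_all": False},
    {"npoint": 128, "nsample": 64, "mlp": [128, 128, 256], "group_all": False},
    {"npoint": None, "nsample": None, "mlp": [256, 512, 1024], "group_all": True},
]
for shape in sa_shapes(16, 1024, ssg_layers):
    print(shape)
```

The configuration in ssg_layers is copied from the three pointnet_sa_module calls above.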

3. The whole classification process is as follows:

How the point cloud convolution works (how 3 dimensions become 64):
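A 1*1 convolution applies the same small matrix to the channel vector of every point in every group, which is how 3 input channels become 64. The NumPy sketch below assumes the NCHW layout (batch, channels, npoint, nsample) and leaves out the batch normalization that tf_util.conv2d also applies; the function name is hypothetical.

```python
import numpy as np

def conv1x1_nchw(x, weight, bias):
    """Apply a 1*1 convolution followed by ReLU.

    x:      (B, C_in, npoint, nsample)
    weight: (C_out, C_in), shared by all points
    bias:   (C_out,)
    """
    # For every (b, h, w) position: out[:, h, w] = weight @ x[:, h, w] + bias
    out = np.einsum("oc,bchw->bohw", weight, x) + bias[None, :, None, None]
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 8, 4))    # small stand-in for (16, 3, 512, 32)
weight = rng.standard_normal((64, 3))
bias = np.zeros(64)
print(conv1x1_nchw(x, weight, bias).shape)  # (2, 64, 8, 4)
```

With the real input of shape (16, 3, 512, 32) the same call gives (16, 64, 512, 32): only the channel axis changes, because each point is transformed independently.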
