1. Preprocessing in the NTU paper
We translate them to the body coordinate system with its origin on the “middle of the spine” joint (number 2 in Figure 1), followed by a 3D rotation to fix the X axis parallel to the 3D vector from “right shoulder” to “left shoulder”, and Y axis towards the 3D vector from “spine base” to “spine”. The Z axis is fixed as the new X × Y. In the last step of normalization, we scale all the 3D points based on the distance between “spine base” and “spine” joints. In the cases of having more than one body in the scene, we transform all of them with regard to the main actor’s skeleton.
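In summary, each video is processed separately:
- translate the joints so that the “middle of the spine” joint is the origin;
- rotate the coordinate axes: X parallel to the vector from “right shoulder” to “left shoulder”, Y along the vector from “spine base” to “spine”, and Z = X × Y;
- scale all 3D points by the distance between the “spine base” and “spine” joints.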
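The three steps above can be sketched in NumPy as follows. This is only an illustration under several assumptions that the quoted passage does not settle: the joint indices are hypothetical 0-based values and should be checked against Figure 1 of the paper (the index used for “spine” in particular is a guess), the axes are computed once from a single reference frame, and Y is orthogonalized against X so that the three axes form a proper rotation.

```python
import numpy as np

# Hypothetical 0-based joint indices; verify against Figure 1 of the NTU paper.
SPINE_BASE = 0
SPINE_MID = 1
LEFT_SHOULDER = 4
RIGHT_SHOULDER = 8
SPINE = 20  # assumption: the joint the paper calls "spine"


def normalize_skeleton(seq, ref_frame=0):
    """Normalize one actor's sequence of shape (T, J, 3).

    Assumption: origin, axes, and scale are taken from `ref_frame`
    and then applied to every frame of the video.
    """
    seq = np.asarray(seq, dtype=np.float64)
    ref = seq[ref_frame]

    # Step 1: the "middle of the spine" joint becomes the origin.
    origin = ref[SPINE_MID]

    # Step 2: X runs from right shoulder to left shoulder.
    x = ref[LEFT_SHOULDER] - ref[RIGHT_SHOULDER]
    x /= np.linalg.norm(x)

    # Y points from spine base towards spine; its length is the scale (step 3).
    y = ref[SPINE] - ref[SPINE_BASE]
    scale = np.linalg.norm(y)

    # Assumption: remove the X component of Y so the axes are orthonormal.
    y = y - np.dot(y, x) * x
    y /= np.linalg.norm(y)

    # Z is fixed as X x Y.
    z = np.cross(x, y)

    # Rows of `rotation` are the new axes, so p @ rotation.T gives new coordinates.
    rotation = np.stack([x, y, z])
    return (seq - origin) @ rotation.T / scale
```

For a scene with two bodies, the same `origin`, `rotation`, and `scale` obtained from the main actor would be applied to the second skeleton as well, which matches the last sentence of the quoted passage.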
2. Preprocessing in the HCN paper
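This method comes from the IJCAI 2018 paper “Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation”.
The paper processes skeleton data with convolutions. Taking the NTU skeleton dataset as an example, its preprocessing handles all videos together:
- pack all skeleton data into a single 5-dimensional array, with every video 300 frames long; videos shorter than 300 frames are zero-padded at the end;
- find the minimum and maximum of X, Y, and Z separately over all the skeleton data, then apply min-max normalization with those values.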
Code implementation: https://github.com/huguyuehuhu/HCN-pytorch/blob/master/feeder/feeder.py
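As a rough, independent illustration of the two HCN steps (not a copy of the linked feeder), the sketch below pads each video to 300 frames, stacks everything into a 5-dimensional array, and rescales each coordinate channel with its global minimum and maximum. The axis layout (samples, channels, frames, joints, bodies), the function names, and the choice to compute the statistics over the padded array are assumptions made for this example.

```python
import numpy as np

MAX_FRAMES = 300  # fixed video length stated in the text


def pad_and_stack(samples, max_frames=MAX_FRAMES):
    """Stack videos of shape (C, T_i, V, M) into one (N, C, max_frames, V, M) array.

    Assumed layout: C = 3 coordinates, V = joints, M = bodies.
    Shorter videos are zero-padded at the end; longer ones are truncated.
    """
    c, _, v, m = samples[0].shape
    data = np.zeros((len(samples), c, max_frames, v, m), dtype=np.float32)
    for i, sample in enumerate(samples):
        t = min(sample.shape[1], max_frames)
        data[i, :, :t] = sample[:, :t]
    return data


def min_max_normalize(data):
    """Scale each coordinate channel to [0, 1] using dataset-wide min and max."""
    # Reduce over every axis except the channel axis (X, Y, Z).
    min_map = data.min(axis=(0, 2, 3, 4), keepdims=True)
    max_map = data.max(axis=(0, 2, 3, 4), keepdims=True)
    span = max_map - min_map
    span = np.where(span == 0, 1, span)  # guard against a constant channel
    return (data - min_map) / span, min_map, max_map
```

One consequence of this simple version is that the padded zeros are included in the statistics and no longer map to zero after scaling; whether that matters depends on how the downstream model treats padded frames. The returned `min_map` and `max_map` would normally be computed on the training set and reused for test data.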