pytorch/libtorch QQ group: 1041467052
libtorch actually has counterparts for pytorch's functions; only the way you write them differs somewhat.
Link to the official libtorch documentation
class tensor
The problem is that the official docs read like bare function declarations and do not explain what each function does, so you often have to guess from the name. For example, I wanted a function that produces a tensor with the same shape as a known torch::Tensor but filled with a given value; I remembered seeing a function starting with full somewhere, searched for full, and found full_like, which looked like what I needed. (See 0.)
- pytorch/libtorch QQ group: 1041467052
- Debugging tips
- CMakeLists.txt
- 0. torch::full_like
- 1. Creating and initializing tensors: 1.1 torch::rand 1.2 torch::empty 1.3 torch::ones 1.4 torch::Tensor keep = torch::zeros({scores.size(0)}).to(torch::kLong).to(scores.device()); 1.5 torch::Tensor num_out = torch::full({ 2,3 }, -2, torch::dtype(torch::kLong)); torch::full creates a tensor of a given shape filled with a given value 1.6 torch::Tensor a = torch::ones({3,2}).fill_(-8).to(torch::kCUDA); 1.7 torch::full_like (see 0) creates a tensor with the same shape as a known tensor, filled with a given value
- 2. Concatenating tensors: torch::cat, and combining vector with cat
- 3. Slicing: select (shallow copy), index_select (deep copy), index (deep copy), slice (shallow copy), narrow, narrow_copy
- 4. squeeze() unsqueeze()
- 5. torch::nonzero: outputs the coordinates of non-zero elements
- 6. Accessing tensor values: a.item() converts a 1*1 tensor a to float
- 7. Converting an OpenCV Mat (or other vector/array data) to a tensor
- 8. Tensor sizes: sizes(), numel()
- 9. torch::sort
- 10. clamp: constrain values between min and max; values below min become min, values above max become max
- 11. Greater-than > and less-than < operations
- 12. Transpose: Tensor::transpose
- 13. expand_as
- 14. Multiply mul_, divide div, subtract sub_
- 15. Loading a model
- 16. Getting results out of the model's forward
- 17. resize_ zero_
- 18. meshgrid: expands 1-D tensors into matrices
- 19. flatten: flattens a tensor
- 20. fill_: fills a tensor with a given value, in place
- 21. torch::stack
- 22. reshape
- 23. view
- 24. argmax argmin
- 25. where
- 26. accessor
- 27. torch::max, torch::min (min works like max)
- 28. masked_select and masked_fill
- 29. A combined libtorch example (1)
- 30. pytorch nms <---------> libtorch nms
- 31. Data types matter! .to(torch::kByte);
- 32. Accessing Tensor data through pointers
- 33. Comparing ways to assign to a Tensor by index in PyTorch
- 44. Outputting multiple tensors (pytorch side) and retrieving multiple tensors (libtorch side)
- 45. torch::Tensor as a function argument: whether or not it is passed by reference, operations on the parameter inside the function affect the original tensor; it always behaves like a reference
- 46. Reproducing pytorch's fancy indexing
- 47. Verifying numerical accuracy between pytorch and libtorch tensors
- 48. Miscellaneous: color mapping
- 49. torch.gather
- 50. torch::argsort (not available in libtorch 1.0) torch::sort
- 51. Checking whether a tensor is empty: ind_mask.sizes().empty()
- 52. Writing the pytorch code out = aim[ind_mask] in libtorch
- 53. Expressing the pytorch code a4 = arr[...,3,0] in libtorch: a masked_select application!
- 54. Once again, types matter!! Sometimes you need an explicit kernel = kernel.toType(torch::kByte);
- Thanks for reading, and thanks for your support!
Debugging tips:
torch::Tensor box_1 = torch::rand({5,4});
std::cout<<box_1<<std::endl; // prints the values
box_1.print(); // prints the shape
CMakeLists.txt
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(main)
SET(CMAKE_BUILD_TYPE "Debug")
set(CMAKE_PREFIX_PATH "/data_2/everyday/0429/pytorch/torch")
find_package(Torch REQUIRED)
set(CMAKE_PREFIX_PATH "/home/yhl/software_install/opencv3.2")
find_package(OpenCV REQUIRED)
add_executable(main main.cpp)
target_link_libraries(main "${TORCH_LIBRARIES}")
target_link_libraries(main ${OpenCV_LIBS})
set_property(TARGET main PROPERTY CXX_STANDARD 11)
0.torch::full_like
static Tensor at::full_like(const Tensor &self, Scalar fill_value, const TensorOptions &options = {}, c10::optional<MemoryFormat> memory_format = c10::nullopt)
Then I just tried it myself:
#include <iostream>
#include "torch/script.h"
#include "torch/torch.h"
using namespace std;
int main() {
torch::Tensor tmp_1 = torch::rand({2,3});
torch::Tensor tmp_2 = torch::full_like(tmp_1,1);
cout<<tmp_1<<endl;
cout<<tmp_2<<endl;
}
The printed result:
0.8465 0.5771 0.4404
0.9805 0.8665 0.7807
[ Variable[CPUFloatType]{2,3} ]
1 1 1
1 1 1
[ Variable[CPUFloatType]{2,3} ]
1. Creating and initializing tensors: 1.1 torch::rand 1.2 torch::empty 1.3 torch::ones 1.4 torch::Tensor keep = torch::zeros({scores.size(0)}).to(torch::kLong).to(scores.device()); 1.5 torch::Tensor num_out = torch::full({ 2,3 }, -2, torch::dtype(torch::kLong)); torch::full creates a tensor of a given shape filled with a given value 1.6 torch::Tensor a = torch::ones({3,2}).fill_(-8).to(torch::kCUDA); 1.7 torch::full_like (see 0) creates a tensor with the same shape as a known tensor, filled with a given value
1.1 torch::rand
torch::Tensor input = torch::rand({ 1,3,2,3 });
(1,1,.,.) =
0.5943 0.4822 0.6663
0.7099 0.0374 0.9833
(1,2,.,.) =
0.4384 0.4567 0.2143
0.3967 0.4999 0.9196
(1,3,.,.) =
0.2467 0.5066 0.8654
0.7873 0.4758 0.3718
[ Variable[CPUFloatType]{1,3,2,3} ]
1.2 torch::empty
torch::Tensor a = torch::empty({2, 4});
std::cout << a << std::endl;
7.0374e+22 5.7886e+22 6.7120e+22 6.7331e+22
6.7120e+22 1.8515e+28 7.3867e+20 9.2358e-01
[ Variable[CPUFloatType]{2,4} ]
1.3 torch::ones
torch::Tensor a = torch::ones({2, 4});
std::cout << a<< std::endl;
1 1 1 1
1 1 1 1
[ Variable[CPUFloatType]{2,4} ]
1.4 torch::zeros
torch::Tensor scores;
torch::Tensor keep = torch::zeros({scores.size(0)}).to(torch::kLong).to(scores.device());
1.5 torch::full
inline at::Tensor full(at::IntArrayRef size, at::Scalar fill_value, c10::optional<at::DimnameList> names, const at::TensorOptions & options = {})
inline at::Tensor full(at::IntArrayRef size, at::Scalar fill_value, const at::TensorOptions & options = {})
torch::Tensor num_out = torch::full({ 2,3 }, -2, torch::dtype(torch::kLong));
std::cout<<num_out<<std::endl;
1.6 torch::Tensor a = torch::ones({3,2}).fill_(-8).to(torch::kCUDA);
torch::Tensor a = torch::ones({3,2}).fill_(-8).to(torch::kCUDA);
std::cout<<a<<std::endl;
-8 -8
-8 -8
-8 -8
[ Variable[CUDAFloatType]{3,2} ]
2. Concatenating tensors: torch::cat, and combining vector with cat
2.1 Concatenating along columns
torch::Tensor a = torch::rand({2,3});
torch::Tensor b = torch::rand({2,1});
torch::Tensor cat_1 = torch::cat({a,b},1);//concatenate along columns -- the row counts must match
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<cat_1<<std::endl;
0.3551 0.7215 0.3603
0.1188 0.4577 0.2201
[ Variable[CPUFloatType]{2,3} ]
0.5876
0.3040
[ Variable[CPUFloatType]{2,1} ]
0.3551 0.7215 0.3603 0.5876
0.1188 0.4577 0.2201 0.3040
[ Variable[CPUFloatType]{2,4} ]
Note: if the row counts differ, the following error is thrown:
terminate called after throwing an instance of 'std::runtime_error'
what(): invalid argument 0: Sizes of tensors must match except in dimension 1. Got 2 and 4 in dimension 0 at /data_2/everyday/0429/pytorch/aten/src/TH/generic/THTensor.cpp:689
2.2 Concatenating along rows
torch::Tensor a = torch::rand({2,3});
torch::Tensor b = torch::rand({1,3});
torch::Tensor cat_1 = torch::cat({a,b},0);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<cat_1<<std::endl;
0.0004 0.7852 0.4586
0.1612 0.6524 0.7655
[ Variable[CPUFloatType]{2,3} ]
0.5999 0.5445 0.2152
[ Variable[CPUFloatType]{1,3} ]
0.0004 0.7852 0.4586
0.1612 0.6524 0.7655
0.5999 0.5445 0.2152
[ Variable[CPUFloatType]{3,3} ]
2.3 Another example
torch::Tensor box_1 = torch::rand({5,4});
torch::Tensor score_1 = torch::rand({5,1});
torch::Tensor label_1 = torch::rand({5,1});
torch::Tensor result_1 = torch::cat({box_1,score_1,label_1},1);
result_1.print();
[Variable[CPUFloatType] [5, 6]]
2.4 Combining vector with cat
torch::Tensor xs_t0 = xs - wh_0 / 2;
torch::Tensor ys_t0 = ys - wh_1 / 2;
torch::Tensor xs_t1 = xs + wh_0 / 2;
torch::Tensor ys_t1 = ys + wh_1 / 2;
xs_t0.print();
ys_t0.print();
xs_t1.print();
ys_t1.print();
vector<torch::Tensor> abce = {xs_t0,ys_t0,xs_t1,ys_t1};
torch::Tensor bboxes = torch::cat(abce,2);
std::cout<<"-----cat shape---"<<std::endl;
bboxes.print();
while(1);
Output:
[Variable[CUDAType] [1, 100, 1]]
[Variable[CUDAType] [1, 100, 1]]
[Variable[CUDAType] [1, 100, 1]]
[Variable[CUDAType] [1, 100, 1]]
-----cat shape---
[Variable[CUDAType] [1, 100, 4]]
This can also be done in one line:
torch::Tensor bboxes = torch::cat({xs_t0,ys_t0,xs_t1,ys_t1},2);
3. Slicing: select (shallow copy), index_select (deep copy), index (deep copy), slice (shallow copy), narrow, narrow_copy
select [shallow copy]: takes a single specified row or column
index [deep copy]: takes specified rows only
index_select [deep copy]: takes multiple specified rows or columns
slice [shallow copy]: takes contiguous rows or columns
narrow, narrow_copy
When an operation is a shallow copy but you do not want to affect the original, add clone(), for example:
torch::Tensor x1 = boxes.select(1,0).clone();
3.1 inline Tensor Tensor::select(int64_t dim, int64_t index); appears to work only on 2-D tensors. The first argument is the dimension (0 takes a row, 1 takes a column); the second is the index.
3.1.1 select // take a row
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
torch::Tensor b = a.select(0,1);//take a row
std::cout<<b<<std::endl;
0.6201 0.7021 0.1975
0.3080 0.6304 0.1558
[ Variable[CPUFloatType]{2,3} ]
0.3080
0.6304
0.1558
[ Variable[CPUFloatType]{3} ]
3.1.2 select // take a column
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
torch::Tensor b = a.select(1,1);
std::cout<<b<<std::endl;
0.8295 0.9871 0.1287
0.8466 0.7719 0.2354
[ Variable[CPUFloatType]{2,3} ]
0.9871
0.7719
[ Variable[CPUFloatType]{2} ]
Note: this is a shallow copy, i.e. changing b also changes a in the same positions.
3.1.3 select is a shallow copy
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
torch::Tensor b = a.select(1,1);
std::cout<<b<<std::endl;
b[0] = 0.0;
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
0.0938 0.2861 0.0089
0.3481 0.5806 0.3711
[ Variable[CPUFloatType]{2,3} ]
0.2861
0.5806
[ Variable[CPUFloatType]{2} ]
0.0938 0.0000 0.0089
0.3481 0.5806 0.3711
[ Variable[CPUFloatType]{2,3} ]
0.0000
0.5806
[ Variable[CPUFloatType]{2} ]
As you can see, after b[0] = 0.0; the corresponding positions in both a and b become 0. Shallow copy!!
3.2 inline Tensor Tensor::index_select(Dimname dim, const Tensor & index) // likewise, dim 0 means by row, 1 means by column; index holds the row or column numbers to take. One quirk:
index must be of type toType(torch::kLong). Another quirk: when I tried to fill the index tensor from a plain int array with from_blob, idx came out all zeros -- from_blob assumes kFloat by default, so the int bytes get reinterpreted as (near-zero) floats.
torch::Tensor a = torch::rand({2,6});
std::cout<<a<<std::endl;
torch::Tensor idx = torch::empty({4}).toType(torch::kLong);
idx[0]=0;
idx[1]=2;
idx[2]=4;
idx[3]=1;
// int idx_data[4] = {1,3,2,4};
// torch::Tensor idx = torch::from_blob(idx_data,{4}).toType(torch::kLong);//idx comes out all zeros: from_blob defaults to kFloat, so the int bytes are misread
std::cout<<idx<<std::endl;
torch::Tensor b = a.index_select(1,idx);
std::cout<<b<<std::endl;
0.4956 0.5028 0.0863 0.9464 0.6714 0.5348
0.3523 0.2245 0.0924 0.7088 0.6913 0.2237
[ Variable[CPUFloatType]{2,6} ]
0
2
4
1
[ Variable[CPULongType]{4} ]
0.4956 0.0863 0.6714 0.5028
0.3523 0.0924 0.6913 0.2245
[ Variable[CPUFloatType]{2,4} ]
3.2.2 index_select is a deep copy
torch::Tensor a = torch::rand({2,6});
std::cout<<a<<std::endl;
torch::Tensor idx = torch::empty({4}).toType(torch::kLong);
idx[0]=0;
idx[1]=2;
idx[2]=4;
idx[3]=1;
// int idx_data[4] = {1,3,2,4};
// torch::Tensor idx = torch::from_blob(idx_data,{4}).toType(torch::kLong);
std::cout<<idx<<std::endl;
torch::Tensor b = a.index_select(1,idx);
std::cout<<b<<std::endl;
b[0][0]=0.0;
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
0.6118 0.6078 0.5052 0.9489 0.6201 0.8975
0.0901 0.2040 0.1452 0.6452 0.9593 0.7454
[ Variable[CPUFloatType]{2,6} ]
0
2
4
1
[ Variable[CPULongType]{4} ]
0.6118 0.5052 0.6201 0.6078
0.0901 0.1452 0.9593 0.2040
[ Variable[CPUFloatType]{2,4} ]
0.6118 0.6078 0.5052 0.9489 0.6201 0.8975
0.0901 0.2040 0.1452 0.6452 0.9593 0.7454
[ Variable[CPUFloatType]{2,6} ]
0.0000 0.5052 0.6201 0.6078
0.0901 0.1452 0.9593 0.2040
[ Variable[CPUFloatType]{2,4} ]
3.3 index inline Tensor Tensor::index(TensorList indices)
In my experiments this function can only take rows, and it is a deep copy.
torch::Tensor a = torch::rand({2,6});
std::cout<<a<<std::endl;
torch::Tensor idx_1 = torch::empty({2}).toType(torch::kLong);
idx_1[0]=0;
idx_1[1]=1;
torch::Tensor bb = a.index(idx_1);
bb[0][0]=0;
std::cout<<bb<<std::endl;
std::cout<<a<<std::endl;
0.1349 0.8087 0.2659 0.3364 0.0202 0.4498
0.4785 0.4274 0.9348 0.0437 0.6732 0.3174
[ Variable[CPUFloatType]{2,6} ]
0.0000 0.8087 0.2659 0.3364 0.0202 0.4498
0.4785 0.4274 0.9348 0.0437 0.6732 0.3174
[ Variable[CPUFloatType]{2,6} ]
0.1349 0.8087 0.2659 0.3364 0.0202 0.4498
0.4785 0.4274 0.9348 0.0437 0.6732 0.3174
[ Variable[CPUFloatType]{2,6} ]
3.4 slice inline Tensor Tensor::slice(int64_t dim, int64_t start, int64_t end, int64_t step) //dim 0 means by row, 1 means by column; from start up to (but not including) end
As the results show, this is a shallow copy!!!
torch::Tensor a = torch::rand({2,6});
std::cout<<a<<std::endl;
torch::Tensor b = a.slice(0,0,1);
torch::Tensor c = a.slice(1,0,3);
b[0][0]=0.0;
std::cout<<b<<std::endl;
std::cout<<c<<std::endl;
std::cout<<a<<std::endl;
0.8270 0.7952 0.3743 0.7992 0.9093 0.5945
0.3764 0.8419 0.7977 0.4150 0.8531 0.9207
[ Variable[CPUFloatType]{2,6} ]
0.0000 0.7952 0.3743 0.7992 0.9093 0.5945
[ Variable[CPUFloatType]{1,6} ]
0.0000 0.7952 0.3743
0.3764 0.8419 0.7977
[ Variable[CPUFloatType]{2,3} ]
0.0000 0.7952 0.3743 0.7992 0.9093 0.5945
0.3764 0.8419 0.7977 0.4150 0.8531 0.9207
[ Variable[CPUFloatType]{2,6} ]
3.5 narrow narrow_copy
inline Tensor Tensor::narrow(int64_t dim, int64_t start, int64_t length) const
inline Tensor Tensor::narrow_copy(int64_t dim, int64_t start, int64_t length) const
torch::Tensor a = torch::rand({4,6});
torch::Tensor b = a.narrow(0,1,2);
torch::Tensor c = a.narrow_copy(0,1,2);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<c<<std::endl;
0.9812 0.4205 0.4169 0.2412 0.8769 0.9873
0.8052 0.0312 0.9901 0.5065 0.6344 0.3408
0.0182 0.6933 0.9375 0.8675 0.5201 0.9521
0.5119 0.3880 0.1117 0.5413 0.8203 0.4163
[ Variable[CPUFloatType]{4,6} ]
0.8052 0.0312 0.9901 0.5065 0.6344 0.3408
0.0182 0.6933 0.9375 0.8675 0.5201 0.9521
[ Variable[CPUFloatType]{2,6} ]
0.8052 0.0312 0.9901 0.5065 0.6344 0.3408
0.0182 0.6933 0.9375 0.8675 0.5201 0.9521
[ Variable[CPUFloatType]{2,6} ]
4.squeeze() unsqueeze()
inline Tensor Tensor::squeeze() const // without arguments: squeezes all dimensions of size 1
inline Tensor Tensor::squeeze(int64_t dim) const // with argument: squeezes the specified dimension
inline Tensor & Tensor::squeeze_() const // in-place variant of squeeze()
inline Tensor & Tensor::squeeze_(int64_t dim) const // in-place variant of squeeze(dim)
4.1 squeeze()
(1,.,.) =
0.5516 0.6561 0.3603
0.7555 0.1048 0.2016
[ Variable[CPUFloatType]{1,2,3} ]
0.5516 0.6561 0.3603
0.7555 0.1048 0.2016
[ Variable[CPUFloatType]{2,3} ]
(1,.,.) =
0.7675 0.5439 0.5162
(2,.,.) =
0.6103 0.1925 0.1222
[ Variable[CPUFloatType]{2,1,3} ]
0.7675 0.5439 0.5162
0.6103 0.1925 0.1222
[ Variable[CPUFloatType]{2,3} ]
(1,1,.,.) =
0.9875
0.1980
(2,1,.,.) =
0.6973
0.3272
[ Variable[CPUFloatType]{2,1,2,1} ]
0.9875 0.1980
0.6973 0.3272
[ Variable[CPUFloatType]{2,2} ]
4.2 squeeze(int64_t dim): squeeze the specified dimension
torch::Tensor a = torch::rand({1,1,3});
std::cout<<a<<std::endl;
torch::Tensor b = a.squeeze();
std::cout<<b<<std::endl;
torch::Tensor c = a.squeeze(0);
std::cout<<c<<std::endl;
torch::Tensor d = a.squeeze(1);
std::cout<<d<<std::endl;
torch::Tensor e = a.squeeze(2);
std::cout<<e<<std::endl;
(1,.,.) =
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,1,3} ]
0.8065
0.1287
0.8073
[ Variable[CPUFloatType]{3} ]
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,3} ]
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,3} ]
(1,.,.) =
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,1,3} ]
4.3. unsqueeze
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
torch::Tensor b = a.unsqueeze(0);
std::cout<<b<<std::endl;
torch::Tensor bb = a.unsqueeze(1);
std::cout<<bb<<std::endl;
torch::Tensor bbb = a.unsqueeze(2);
std::cout<<bbb<<std::endl;
0.7945 0.0331 0.1666
0.7821 0.3359 0.0663
[ Variable[CPUFloatType]{2,3} ]
(1,.,.) =
0.7945 0.0331 0.1666
0.7821 0.3359 0.0663
[ Variable[CPUFloatType]{1,2,3} ]
(1,.,.) =
0.7945 0.0331 0.1666
(2,.,.) =
0.7821 0.3359 0.0663
[ Variable[CPUFloatType]{2,1,3} ]
(1,.,.) =
0.7945
0.0331
0.1666
(2,.,.) =
0.7821
0.3359
0.0663
[ Variable[CPUFloatType]{2,3,1} ]
5. torch::nonzero: outputs the coordinates of the non-zero elements
torch::Tensor a = torch::rand({2,3});
a[0][1] = 0;
a[1][2] = 0;
std::cout<<a<<std::endl;
torch::Tensor b = torch::nonzero(a);
std::cout<<b<<std::endl;
0.4671 0.0000 0.3360
0.9320 0.9246 0.0000
[ Variable[CPUFloatType]{2,3} ]
0 0
0 2
1 0
1 1
[ Variable[CPULongType]{4,2} ]
6. Accessing tensor values: a.item() converts a 1*1 tensor a to float
To get a tensor element as an int or float: auto bbb = a[1][1].item().toFloat();
Usually you can index a tensor directly, e.g. a[0][1], but the result is still a tensor. To get a C++ int or float, do this:
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
auto bbb = a[1][1].item().toFloat();
std::cout<<bbb<<std::endl;
0.7303 0.6608 0.0024
0.5917 0.0145 0.6472
[ Variable[CPUFloatType]{2,3} ]
0.014509
[ Variable[CPUFloatType]{} ]
0.014509
Another example:
torch::Tensor scores = torch::rand({10});
std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(scores.unsqueeze(1), 0, 1);
torch::Tensor v = std::get<0>(sort_ret).squeeze(1).to(scores.device());
torch::Tensor idx = std::get<1>(sort_ret).squeeze(1).to(scores.device());
std::cout<<scores<<std::endl;
std::cout<<v<<std::endl;
std::cout<<idx<<std::endl;
for(int i=0;i<10;i++)
{
int idx_1 = idx[i].item<int>();
float s = v[i].item<float>();
std::cout<<idx_1<<" "<<s<<std::endl;
}
0.1125
0.9524
0.7033
0.3204
0.7907
0.8486
0.7783
0.3215
0.0378
0.7512
[ Variable[CPUFloatType]{10} ]
0.9524
0.8486
0.7907
0.7783
0.7512
0.7033
0.3215
0.3204
0.1125
0.0378
[ Variable[CPUFloatType]{10} ]
1
5
4
6
9
2
7
3
0
8
[ Variable[CPULongType]{10} ]
1 0.952351
5 0.848641
4 0.790685
6 0.778329
9 0.751163
2 0.703278
7 0.32146
3 0.320435
0 0.112517
8 0.0378203
7. Converting an OpenCV Mat (or other vector/array data) to a tensor
7.1
Mat m_out = imread(path);
//[320,320,3]
input_tensor = torch::from_blob(
m_out.data, {m_SIZE_IMAGE, m_SIZE_IMAGE, 3}).toType(torch::kFloat32);//torch::kByte //大坑
//[3,320,320]
input_tensor = input_tensor.permute({2,0,1});
input_tensor = input_tensor.unsqueeze(0);
input_tensor = input_tensor.to(torch::kFloat).to(m_device);
Note: the image above had been preprocessed by mean subtraction, so m_out contains negative pixel values. With torch::kByte the negatives get wrapped to positive numbers, so torch::kFloat32 is required here.
permute({2,0,1});
The original OpenCV Mat layout is
0 1 2
[320,320,3]
After permute({2,0,1}), which reorders the dimensions accordingly, it becomes [3,320,320].
7.2
std::vector<float> region_priors;
//region_priors.push_back(num); region_priors has size 6375 x 4
torch::Tensor m_prior = torch::from_blob(region_priors.data(),{6375,4}).cuda();
8. Tensor sizes: size(), sizes(), numel()
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
auto aa = a.size(0);
auto bb = a.size(1);
auto a_size = a.sizes();
std::cout<<aa<<std::endl;
std::cout<<bb<<std::endl;
std::cout<<a_size<<std::endl;
int num_ = a.numel();
std::cout<<num_<<std::endl;
0.6522 0.0480 0.0009
0.1185 0.4639 0.0386
[ Variable[CPUFloatType]{2,3} ]
2
3
[2, 3]
6
8.2
One problem: if you only declare a tensor with torch::Tensor a; and then access it:
torch::Tensor a;
auto a_size = a.sizes();
you get an error:
terminate called after throwing an instance of 'c10::Error'
what(): sizes() called on undefined Tensor (sizes at /data_2/everyday/0429/pytorch/c10/core/UndefinedTensorImpl.cpp:12)
(stack trace frames truncated)
The program terminated abnormally.
numel(), however, works fine:
torch::Tensor a;
int num_ = a.numel();
std::cout<<num_<<std::endl;
8.3 Getting the number of dimensions, e.g. for shape [1,5,8,2] I need to get 4
auto aaa = img_poly.sizes();
int len_ = aaa.size();
9.torch::sort
static inline std::tuple<Tensor,Tensor> sort(const Tensor & self, Dimname dim, bool descending)
dim 0 means sort along rows, 1 means along columns
descending=false means ascending order, true means descending
It returns a tuple: the first element holds the sorted values, the second holds the indices those values had before sorting.
torch::Tensor scores = torch::rand({10});
std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(scores.unsqueeze(1), 0, 1);
torch::Tensor v = std::get<0>(sort_ret).squeeze(1).to(scores.device());
torch::Tensor idx = std::get<1>(sort_ret).squeeze(1).to(scores.device());
std::cout<<scores<<std::endl;
std::cout<<v<<std::endl;
std::cout<<idx<<std::endl;
0.8355
0.1386
0.7910
0.0988
0.2607
0.7810
0.7855
0.5529
0.5846
0.1403
[ Variable[CPUFloatType]{10} ]
0.8355
0.7910
0.7855
0.7810
0.5846
0.5529
0.2607
0.1403
0.1386
0.0988
[ Variable[CPUFloatType]{10} ]
0
2
6
5
8
7
4
9
1
3
[ Variable[CPULongType]{10} ]
10. clamp: constrain values between min and max; values below min become min, values above max become max
inline Tensor Tensor::clamp(c10::optional<Scalar> min, c10::optional<Scalar> max) const
torch::Tensor a = torch::rand({2,3});
a[0][0] = 20;
a[0][1] = 21;
a[0][2] = 22;
a[1][0] = 23;
a[1][1] = 24;
std::cout<<a<<std::endl;
torch::Tensor b = a.clamp(21,22);
std::cout<<b<<std::endl;
20.0000 21.0000 22.0000
23.0000 24.0000 0.4792
[ Variable[CPUFloatType]{2,3} ]
21 21 22
22 22 21
[ Variable[CPUFloatType]{2,3} ]
In real projects you usually need scalar values pulled out of tensors, and sometimes you only want to clamp one side, e.g. only min:
xx1 = xx1.clamp(x1[i].item().toFloat(),INT_MAX*1.0);
11. Greater-than > and less-than < operations
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
torch::Tensor b = a > 0.5;
std::cout<<b<<std::endl;
0.3526 0.0321 0.7098
0.9794 0.6531 0.9410
[ Variable[CPUFloatType]{2,3} ]
0 0 1
1 1 1
[ Variable[CPUBoolType]{2,3} ]
12. Transpose: Tensor::transpose
inline Tensor Tensor::transpose(Dimname dim0, Dimname dim1) const
torch::Tensor a = torch::rand({2,3});
std::cout<<a<<std::endl;
torch::Tensor b = a.transpose(1,0);
std::cout<<b<<std::endl;
0.4039 0.3568 0.9978
0.6895 0.7258 0.5576
[ Variable[CPUFloatType]{2,3} ]
0.4039 0.6895
0.3568 0.7258
0.9978 0.5576
[ Variable[CPUFloatType]{3,2} ]
13.expand_as
inline Tensor Tensor::expand_as(const Tensor & other) const
torch::Tensor a = torch::rand({2,3});;
// torch::Tensor b = torch::ones({2,2});
torch::Tensor b = torch::ones({2,1});
torch::Tensor c = b.expand_as(a);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<c<<std::endl;
0.6063 0.4150 0.7665
0.8663 0.9563 0.7461
[ Variable[CPUFloatType]{2,3} ]
1
1
[ Variable[CPUFloatType]{2,1} ]
1 1 1
1 1 1
[ Variable[CPUFloatType]{2,3} ]
Note the dimension requirements: both torch::Tensor b = torch::ones({2,2}); and torch::Tensor b = torch::ones({2}); raise an error:
terminate called after throwing an instance of 'c10::Error'
what(): The expanded size of the tensor (3) must match the existing size (2) at non-singleton dimension 1. Target sizes: [2, 3]. Tensor sizes: [2, 2] (inferExpandGeometry at /data_2/everyday/0429/pytorch/aten/src/ATen/ExpandUtils.cpp:76)
(stack trace frames truncated)
14. Multiply mul_, divide div, subtract sub_
boxes_my.select(1,0).mul_(width);
boxes_my.select(1,1).mul_(height);
boxes_my.select(1,2).mul_(width);
boxes_my.select(1,3).mul_(height);
prediction.select(2, 3).div(2);
input_tensor[0][0] = input_tensor[0][0].sub_(0.485).div_(0.229);
input_tensor[0][1] = input_tensor[0][1].sub_(0.456).div_(0.224);
input_tensor[0][2] = input_tensor[0][2].sub_(0.406).div_(0.225);
15. Loading a model
torch::Device m_device(torch::kCUDA);
torch::jit::script::Module m_model = torch::jit::load(path_pt);
m_model.to(m_device);
m_model.eval();
16. Getting results out of the model's forward
When the model has several outputs:
auto output = m_model.forward({input_tensor});
auto tpl = output.toTuple();
auto arm_loc = tpl->elements()[0].toTensor();
// arm_loc.print();
// std::cout<<arm_loc[0]<<std::endl;
auto arm_conf = tpl->elements()[1].toTensor();
//arm_conf.print();
auto odm_loc = tpl->elements()[2].toTensor();
//odm_loc.print();
// std::cout<<odm_loc[0]<<std::endl;
auto odm_conf = tpl->elements()[3].toTensor();
// odm_conf.print();
17.resize_ zero_
Tensor & resize_(IntArrayRef size) const;
Tensor & zero_() const;
torch::Tensor a = torch::rand({1,3,2,2});
const int batch_size = a.size(0);
const int depth = a.size(1);
const int image_height = a.size(2);
const int image_width = a.size(3);
torch::Tensor crops = torch::rand({1,3,2,2});
// torch::Tensor crops;
crops.resize_({ batch_size, depth, image_height, image_width });
crops.zero_();
std::cout<<a<<std::endl;
std::cout<<crops<<std::endl;
(1,1,.,.) =
0.7889 0.3291
0.2541 0.8283
(1,2,.,.) =
0.0209 0.1846
0.2528 0.2755
(1,3,.,.) =
0.0294 0.6623
0.2736 0.3376
[ Variable[CPUFloatType]{1,3,2,2} ]
(1,1,.,.) =
0 0
0 0
(1,2,.,.) =
0 0
0 0
(1,3,.,.) =
0 0
0 0
[ Variable[CPUFloatType]{1,3,2,2} ]
Note: if you only declare torch::Tensor crops; here (instead of torch::Tensor crops = torch::rand({1,3,2,2});), it errors out. It seems the tensor must be initialized first so that memory is allocated; otherwise:
terminate called after throwing an instance of 'c10::Error'
what(): There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::resize_. This usually means that this function requires a non-empty list of Tensors. Available functions are [CUDATensorId, QuantizedCPUTensorId, CPUTensorId, VariableTensorId] (lookup_ at /data_2/everyday/0429/pytorch/torch/include/ATen/core/dispatch/DispatchTable.h:243)
(stack trace frames truncated)
18. meshgrid: expands 1-D tensors into coordinate matrices
static inline std::vector<Tensor> meshgrid(TensorList tensors)
torch::Tensor scales = torch::ones({2});
torch::Tensor ratios = torch::ones({2});
ratios += 2;
std::cout<<scales<<std::endl;
std::cout<<ratios<<std::endl;
std::vector<torch::Tensor> mesh = torch::meshgrid({ scales, ratios });
torch::Tensor scales_1 = mesh[0];
torch::Tensor ratios_1 = mesh[1];
std::cout<<scales_1<<std::endl;
std::cout<<ratios_1<<std::endl;
1
1
[ Variable[CPUFloatType]{2} ]
3
3
[ Variable[CPUFloatType]{2} ]
1 1
1 1
[ Variable[CPUFloatType]{2,2} ]
3 3
3 3
[ Variable[CPUFloatType]{2,2} ]
19. flatten: flattens a tensor
Tensor flatten(int64_t start_dim=0, int64_t end_dim=-1) const;
Tensor flatten(int64_t start_dim, int64_t end_dim, Dimname out_dim) const;
Tensor flatten(Dimname start_dim, Dimname end_dim, Dimname out_dim) const;
Tensor flatten(DimnameList dims, Dimname out_dim) const;
torch::Tensor a = torch::rand({2,3});
torch::Tensor b = a.flatten();
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
0.9953 0.1461 0.0084
0.6169 0.4037 0.7685
[ Variable[CPUFloatType]{2,3} ]
0.9953
0.1461
0.0084
0.6169
0.4037
0.7685
20. fill_: fills the tensor with a given value, in place (modifies the current tensor)
Tensor & fill_(Scalar value) const;
Tensor & fill_(const Tensor & value) const;
torch::Tensor a = torch::rand({2,3});
torch::Tensor b = a.fill_(4);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
4 4 4
4 4 4
[ Variable[CPUFloatType]{2,3} ]
4 4 4
4 4 4
[ Variable[CPUFloatType]{2,3} ]
21.torch::stack
static inline Tensor stack(TensorList tensors, int64_t dim)
torch::Tensor a = torch::rand({3});
torch::Tensor b = torch::rand({3});
torch::Tensor c = torch::stack({a,b},1);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<c<<std::endl;
0.6776
0.5610
0.2835
[ Variable[CPUFloatType]{3} ]
0.6846
0.3753
0.3873
[ Variable[CPUFloatType]{3} ]
0.6776 0.6846
0.5610 0.3753
0.2835 0.3873
[ Variable[CPUFloatType]{3,2} ]
torch::Tensor a = torch::rand({3});
torch::Tensor b = torch::rand({3});
torch::Tensor c = torch::stack({a,b},0);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<c<<std::endl;
0.7129
0.1650
0.6764
[ Variable[CPUFloatType]{3} ]
0.8035
0.1807
0.8100
[ Variable[CPUFloatType]{3} ]
0.7129 0.1650 0.6764
0.8035 0.1807 0.8100
[ Variable[CPUFloatType]{2,3} ]
22.reshape
inline Tensor Tensor::reshape(IntArrayRef shape) const
torch::Tensor a = torch::rand({2,4});
torch::Tensor b = a.reshape({-1,2});
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
0.3782 0.6390 0.6919 0.8298
0.3872 0.5923 0.4337 0.9634
[ Variable[CPUFloatType]{2,4} ]
0.3782 0.6390
0.6919 0.8298
0.3872 0.5923
0.4337 0.9634
[ Variable[CPUFloatType]{4,2} ]
23. view
inline Tensor Tensor::view(IntArrayRef size) const
contiguous() is needed first:
a.contiguous().view({-1, 4});
torch::Tensor a = torch::rand({2,3});
torch::Tensor b = a.contiguous().view({ -1, 6 });
torch::Tensor c = a.contiguous().view({ 3, 2 });
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<c<<std::endl;
0.2069 0.8814 0.8506
0.6451 0.0107 0.7591
[ Variable[CPUFloatType]{2,3} ]
0.2069 0.8814 0.8506 0.6451 0.0107 0.7591
[ Variable[CPUFloatType]{1,6} ]
0.2069 0.8814
0.8506 0.6451
0.0107 0.7591
[ Variable[CPUFloatType]{3,2} ]
Note: this is not the same as a transpose.
24.argmax argmin
static inline Tensor argmax(const Tensor & self, c10::optional<int64_t> dim=c10::nullopt, bool keepdim=false);
static inline Tensor argmin(const Tensor & self, c10::optional<int64_t> dim=c10::nullopt, bool keepdim=false);
torch::Tensor a = torch::rand({2,3});
auto b = torch::argmax(a, 0);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
0.9337 0.7443 0.1323
0.6514 0.5068 0.5052
[ Variable[CPUFloatType]{2,3} ]
0
0
1
[ Variable[CPULongType]{3} ]
torch::Tensor a = torch::rand({2,3});
auto b = torch::argmax(a, 1);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
0.0062 0.3846 0.4844
0.9555 0.2844 0.4025
[ Variable[CPUFloatType]{2,3} ]
2
0
[ Variable[CPULongType]{2} ]
25.where
static inline Tensor where(const Tensor & condition, const Tensor & self, const Tensor & other);
static inline std::vector<Tensor> where(const Tensor & condition);
torch::Tensor d = torch::where(a>0.5,b,c);
Explanation: wherever a is greater than 0.5 (call these positions pos), d takes the value of b at pos; at all other positions, d takes the value of c.
torch::Tensor a = torch::rand({2,3});
torch::Tensor b = torch::ones({2,3});
torch::Tensor c = torch::zeros({2,3});
torch::Tensor d = torch::where(a>0.5,b,c);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
std::cout<<c<<std::endl;
std::cout<<d<<std::endl;
0.7301 0.8926 0.9570
0.0979 0.5679 0.4473
[ Variable[CPUFloatType]{2,3} ]
1 1 1
1 1 1
[ Variable[CPUFloatType]{2,3} ]
0 0 0
0 0 0
[ Variable[CPUFloatType]{2,3} ]
1 1 1
0 1 0
[ Variable[CPUFloatType]{2,3} ]
Another example:
auto b = torch::where(a>0.5);
torch::Tensor a = torch::rand({2,3});
auto b = torch::where(a>0.5);
std::cout<<a<<std::endl;
std::cout<<b<<std::endl;
0.3439 0.1622 0.7149
0.4845 0.5982 0.9443
[ Variable[CPUFloatType]{2,3} ]
0
1
1
[ Variable[CPULongType]{3} ]
2
1
2
[ Variable[CPULongType]{3} ]
26.accessor
TensorAccessor<T,N> accessor() const&
auto result_data = result.accessor<float, 2>(); //the 2 means two-dimensional
Example 1:
torch::Tensor one = torch::randn({9,6});
auto foo_one=one.accessor<float,2>();
for(int i=0,sum=0;i<foo_one.size(0);i++)
for(int j=0;j<foo_one.size(1);j++)
sum+=foo_one[i][j];
Example 2:
torch::Tensor result;
for(int i=1;i<m_num_class;i++)
{
//...
if(0 == result.numel())
{
result = result_.clone();
}else
{
result = torch::cat({result,result_},0);// concatenate along dim 0 (rows)
}
}
result =result.cpu();
auto result_data = result.accessor<float, 2>();
cv::Mat img_draw = img.clone();
for(int i=0;i<result_data.size(0);i++)
{
float score = result_data[i][4];
if(score < 0.4) { continue;}
int x1 = result_data[i][0];
int y1 = result_data[i][1];
int x2 = result_data[i][2];
int y2 = result_data[i][3];
int id_label = result_data[i][5];
cv::rectangle(img_draw,cv::Point(x1,y1),cv::Point(x2,y2),cv::Scalar(255,0,0),3);
cv::putText(img_draw,label_map[id_label],cv::Point(x1,y2),CV_FONT_HERSHEY_SIMPLEX,1,cv::Scalar(255,0,55));
}
27. torch::max / torch::min (min is used the same way as max)
static inline std::tuple<Tensor,Tensor> max(const Tensor & self, Dimname dim, bool keepdim=false);
static inline Tensor max(const Tensor & self);
torch::Tensor a = torch::rand({4,2});
std::tuple<torch::Tensor, torch::Tensor> max_test = torch::max(a,1);
auto max_val = std::get<0>(max_test);
// index
auto index = std::get<1>(max_test);
std::cout<<a<<std::endl;
std::cout<<max_val<<std::endl;
std::cout<<index<<std::endl;
0.1082 0.7954
0.3099 0.4507
0.2447 0.5169
0.8210 0.3141
[ Variable[CPUFloatType]{4,2} ]
0.7954
0.4507
0.5169
0.8210
[ Variable[CPUFloatType]{4} ]
1
1
1
0
[ Variable[CPULongType]{4} ]
Another example: global max
torch::Tensor a = torch::rand({4,2});
torch::Tensor max_test = torch::max(a);
std::cout<<a<<std::endl;
std::cout<<max_test<<std::endl;
0.1904 0.9493
0.6521 0.5788
0.9216 0.5997
0.1758 0.7384
[ Variable[CPUFloatType]{4,2} ]
0.94929
[ Variable[CPUFloatType]{} ]
28. masked_select and masked_fill
28.1 Tensor masked_select(const Tensor & mask) const;
torch::Tensor a = torch::rand({2,3});
torch::Tensor c = (a>0.25);
torch::Tensor d = a.masked_select(c);
std::cout<<a<<std::endl;
std::cout<<c<<std::endl;
std::cout<<d<<std::endl;
0.0667 0.3812 0.3810
0.3558 0.8628 0.6329
[ Variable[CPUFloatType]{2,3} ]
0 1 1
1 1 1
[ Variable[CPUBoolType]{2,3} ]
0.3812
0.3810
0.3558
0.8628
0.6329
[ Variable[CPUFloatType]{5} ]
28.2 Tensor masked_fill(const Tensor & mask, Scalar value) const;
Tensor & masked_fill_(const Tensor & mask, const Tensor & value) const;
Tensor masked_fill(const Tensor & mask, const Tensor & value) const;
torch::Tensor a = torch::rand({2,3});
torch::Tensor aa = a.clone();
aa.masked_fill_(aa>0.5,-2);
std::cout<<a<<std::endl;
std::cout<<aa<<std::endl;
0.8803 0.2387 0.8577
0.8166 0.0730 0.4682
[ Variable[CPUFloatType]{2,3} ]
-2.0000 0.2387 -2.0000
-2.0000 0.0730 0.4682
[ Variable[CPUFloatType]{2,3} ]
28.3 masked_fill_ (functions with a trailing underscore are in-place)
The requirement: Tensor score holds confidence scores and Tensor label holds class labels, both the same size. In post-processing, wherever label == 26 and the score at that position is below 0.5, set label at that position to 1. (The toy example below uses label == 3, score >= 0.9 and fills -1 instead, but the pattern is identical.)
float index[] = {3,2,3,3,5,6,7,8,9,10,11,12,13,14,15,16};
float score[] = {0.1,0.1,0.9,0.9,0.9,0.1,0.1,0.1,0.1,0.1,0.8,0.8,0.8,0.8,0.8,0.8};
torch::Tensor aa = torch::from_blob(index, {4,4}).toType(torch::kFloat32);
torch::Tensor bb = torch::from_blob(score, {4,4}).toType(torch::kFloat32);
std::cout<<aa<<std::endl;
std::cout<<bb<<std::endl;
torch::Tensor tmp = (aa == 3);
torch::Tensor tmp_2 = (bb >= 0.9);
std::cout<<tmp<<std::endl;
std::cout<<tmp_2<<std::endl;
torch::Tensor condition_111 = tmp * tmp_2;
std::cout<<condition_111<<std::endl;
aa.masked_fill_(condition_111,-1);
std::cout<<aa<<std::endl;
Output:
3 2 3 3
5 6 7 8
9 10 11 12
13 14 15 16
[ Variable[CPUFloatType]{4,4} ]
0.1000 0.1000 0.9000 0.9000
0.9000 0.1000 0.1000 0.1000
0.1000 0.1000 0.8000 0.8000
0.8000 0.8000 0.8000 0.8000
[ Variable[CPUFloatType]{4,4} ]
1 0 1 1
0 0 0 0
0 0 0 0
0 0 0 0
[ Variable[CPUByteType]{4,4} ]
0 0 1 1
1 0 0 0
0 0 0 0
0 0 0 0
[ Variable[CPUByteType]{4,4} ]
0 0 1 1
0 0 0 0
0 0 0 0
0 0 0 0
[ Variable[CPUByteType]{4,4} ]
3 2 -1 -1
5 6 7 8
9 10 11 12
13 14 15 16
[ Variable[CPUFloatType]{4,4} ]
29. libtorch combined operations, example 1
torch::jit::script::Module module = torch::jit::load(argv[1]);
std::cout << "== Switch to GPU mode" << std::endl;
// to GPU
module.to(at::kCUDA);
if (LoadImage(file_name, image)) {
auto input_tensor = torch::from_blob(
image.data, {1, kIMAGE_SIZE, kIMAGE_SIZE, kCHANNELS});
input_tensor = input_tensor.permute({0, 3, 1, 2});
input_tensor[0][0] = input_tensor[0][0].sub_(0.485).div_(0.229);
input_tensor[0][1] = input_tensor[0][1].sub_(0.456).div_(0.224);
input_tensor[0][2] = input_tensor[0][2].sub_(0.406).div_(0.225);
// to GPU
input_tensor = input_tensor.to(at::kCUDA);
torch::Tensor out_tensor = module.forward({input_tensor}).toTensor();
auto results = out_tensor.sort(-1, true);
auto softmaxs = std::get<0>(results)[0].softmax(0);
auto indexs = std::get<1>(results)[0];
for (int i = 0; i < kTOP_K; ++i) {
auto idx = indexs[i].item<int>();
std::cout << " ============= Top-" << i + 1
<< " =============" << std::endl;
std::cout << " Label: " << labels[idx] << std::endl;
std::cout << " With Probability: "
<< softmaxs[i].item<float>() * 100.0f << "%" << std::endl;
}
}
30.pytorch nms <---------> libtorch nms
pytorch nms
For example, with shapes:
boxes [1742,4]
scores [1742]
def nms(boxes, scores, overlap=0.5, top_k=200):
"""Apply non-maximum suppression at test time to avoid detecting too many
overlapping bounding boxes for a given object.
Args:
boxes: (tensor) The location preds for the img, Shape: [num_priors,4].
scores: (tensor) The class predscores for the img, Shape:[num_priors].
overlap: (float) The overlap thresh for suppressing unnecessary boxes.
top_k: (int) The Maximum number of box preds to consider.
Return:
The indices of the kept boxes with respect to num_priors.
"""
keep = scores.new(scores.size(0)).zero_().long()
if boxes.numel() == 0:
return keep
x1 = boxes[:, 0]
y1 = boxes[:, 1]
x2 = boxes[:, 2]
y2 = boxes[:, 3]
area = torch.mul(x2 - x1, y2 - y1)
v, idx = scores.sort(0) # sort in ascending order
# I = I[v >= 0.01]
idx = idx[-top_k:] # indices of the top-k largest vals
xx1 = boxes.new()
yy1 = boxes.new()
xx2 = boxes.new()
yy2 = boxes.new()
w = boxes.new()
h = boxes.new()
# keep = torch.Tensor()
count = 0
while idx.numel() > 0:
i = idx[-1] # index of current largest val
# keep.append(i)
keep[count] = i
count += 1
if idx.size(0) == 1:
break
idx = idx[:-1] # remove kept element from view
# load bboxes of next highest vals
torch.index_select(x1, 0, idx, out=xx1)
torch.index_select(y1, 0, idx, out=yy1)
torch.index_select(x2, 0, idx, out=xx2)
torch.index_select(y2, 0, idx, out=yy2)
# store element-wise max with next highest score
xx1 = torch.clamp(xx1, min=x1[i])
yy1 = torch.clamp(yy1, min=y1[i])
xx2 = torch.clamp(xx2, max=x2[i])
yy2 = torch.clamp(yy2, max=y2[i])
w.resize_as_(xx2)
h.resize_as_(yy2)
w = xx2 - xx1
h = yy2 - yy1
# check sizes of xx1 and xx2.. after each iteration
w = torch.clamp(w, min=0.0)
h = torch.clamp(h, min=0.0)
inter = w*h
# IoU = i / (area(a) + area(b) - i)
rem_areas = torch.index_select(area, 0, idx) # load remaining areas)
union = (rem_areas - inter) + area[i]
IoU = inter/union # store result in iou
# keep only elements with an IoU <= overlap
idx = idx[IoU.le(overlap)]
return keep, count
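The core quantity in the loop above is the pairwise IoU, inter / (area(a) + area(b) - inter), with the overlap clamped at zero when the boxes are disjoint. As a plain C++ sketch for a single pair of boxes (the function name is my own, not part of either implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// IoU of two axis-aligned boxes given as (x1, y1, x2, y2).
// IoU = inter / (areaA + areaB - inter); overlap width/height clamp at 0.
inline float iou(float ax1, float ay1, float ax2, float ay2,
                 float bx1, float by1, float bx2, float by2) {
    float w = std::max(0.0f, std::min(ax2, bx2) - std::max(ax1, bx1));
    float h = std::max(0.0f, std::min(ay2, by2) - std::max(ay1, by1));
    float inter = w * h;
    float areaA = (ax2 - ax1) * (ay2 - ay1);
    float areaB = (bx2 - bx1) * (by2 - by1);
    return inter / (areaA + areaB - inter);
}
```

Both the pytorch and the libtorch version below compute exactly this, just vectorized over all remaining boxes at once via clamp and index_select.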
libtorch nms
bool nms(const torch::Tensor& boxes, const torch::Tensor& scores, torch::Tensor &keep, int &count,float overlap, int top_k)
{
count =0;
keep = torch::zeros({scores.size(0)}).to(torch::kLong).to(scores.device());
if(0 == boxes.numel())
{
return false;
}
torch::Tensor x1 = boxes.select(1,0).clone();
torch::Tensor y1 = boxes.select(1,1).clone();
torch::Tensor x2 = boxes.select(1,2).clone();
torch::Tensor y2 = boxes.select(1,3).clone();
torch::Tensor area = (x2-x1)*(y2-y1);
// std::cout<<area<<std::endl;
std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(scores.unsqueeze(1), 0, 0);
torch::Tensor v = std::get<0>(sort_ret).squeeze(1).to(scores.device());
torch::Tensor idx = std::get<1>(sort_ret).squeeze(1).to(scores.device());
int num_ = idx.size(0);
if(num_ > top_k) //python:idx = idx[-top_k:]
{
idx = idx.slice(0,num_-top_k,num_).clone();
}
torch::Tensor xx1,yy1,xx2,yy2,w,h;
while(idx.numel() > 0)
{
auto i = idx[-1];
keep[count] = i;
count += 1;
if(1 == idx.size(0))
{
break;
}
idx = idx.slice(0,0,idx.size(0)-1).clone();
xx1 = x1.index_select(0,idx);
yy1 = y1.index_select(0,idx);
xx2 = x2.index_select(0,idx);
yy2 = y2.index_select(0,idx);
xx1 = xx1.clamp(x1[i].item().toFloat(),INT_MAX*1.0);
yy1 = yy1.clamp(y1[i].item().toFloat(),INT_MAX*1.0);
xx2 = xx2.clamp(INT_MIN*1.0,x2[i].item().toFloat());
yy2 = yy2.clamp(INT_MIN*1.0,y2[i].item().toFloat());
w = xx2 - xx1;
h = yy2 - yy1;
w = w.clamp(0,INT_MAX);
h = h.clamp(0,INT_MAX);
torch::Tensor inter = w * h;
torch::Tensor rem_areas = area.index_select(0,idx);
torch::Tensor union_ = (rem_areas - inter) + area[i];
torch::Tensor Iou = inter * 1.0 / union_;
torch::Tensor index_small = Iou < overlap;
auto mask_idx = torch::nonzero(index_small).squeeze();
idx = idx.index_select(0,mask_idx);// python: idx = idx[IoU.le(overlap)]
}
return true;
}
31. Data types matter! .to(torch::kByte);
31.1
//[128,512]
torch::Tensor b = torch::argmax(output_1, 2).cpu();
// std::cout<<b<<std::endl;
b.print();
cv::Mat mask(T_height, T_width, CV_8UC1, (uchar*)b.data_ptr());
imshow("mask",mask*255);
waitKey(0);
[Variable[CPULongType] [128, 512]]
As shown above, b is the segmentation map, [128, 512]. But it simply would not display! I compared b's values against the pytorch output and they matched, yet the displayed mask was all black, all zeros, even though printing the values showed nonzero entries.
An earlier project had used the same code, too. So I searched GitHub for a psenet libtorch implementation and found a similar approach:
cv::Mat tempImg = Mat::zeros(T_height, T_width, CV_8UC1);
memcpy((void *) tempImg.data, b.data_ptr(), sizeof(torch::kU8) * b.numel());
I tried that too, and it still did not work! Two hours gone. As a last resort I was about to dump the 128*512 values to a spreadsheet to inspect them. Experimenting aimlessly, I tried:
cout<<b[0][0].item().toFloat()<<endl;
That prints a value, but only with .toFloat() appended. Still aimlessly, I wrote a loop:
for(int i=0;i<128;i++)
for(int j=0;j<512;j++)
{
}
But it made no sense: the values were right, so why wouldn't it display?
The clue was that b[0][0].item().toFloat() only works with .toFloat() appended. So what type is b? A tensor, but of what element type? The print [Variable[CPULongType] [128, 512]] says: long.
So, convert the type. From earlier notes that is just .to(torch::kFloat32) or similar appended to the tensor. Since I wanted integers, I tried int first:
torch::Tensor b = torch::argmax(output_1, 2).cpu().to(torch::kInt);
Still no good. .to(torch::kFloat32): still no good.
While typing torch::k the IDE popped up completions starting with k; the first one was kByte, so I tried:
torch::Tensor b = torch::argmax(output_1, 2).cpu().to(torch::kByte);
!!!!
That worked, and the segmentation map came out. In hindsight it makes sense: CV_8UC1 expects one byte per pixel, so the tensor backing the Mat must be kByte. A data-type problem that cost me at least two hours!
31.2
The task: convert an intermediate processed image to a tensor.
Mat m_tmp = grayMat.clone();
torch::Tensor label_deal = torch::from_blob(
m_tmp.data, {grayMat.rows, grayMat.cols}).toType(torch::kByte).to(m_device);
// label_deal = label_deal.to(m_device);
auto aaa = torch::max(label_deal);
std::cout<<label_deal<<std::endl;
std::cout<<aaa<<std::endl;
while(1);
Another big pitfall! I assumed the snippet above was fine, but the downstream results were wrong. Stepping through to locate the problem led here: m_tmp's pixel values simply did not match what ended up in the tensor. I knew m_tmp's maximum pixel value was 34, yet the printed tensor's maximum was 255! And this with torch::kByte. Switching to kFloat32 was no better; the values were even stranger, with nan mixed in. I noticed both .toType(torch::kByte) and .to(torch::kByte) exist; both gave the same wrong result. Splitting .to(m_device) into its own statement did not help either (earlier experience with torch::Tensor tmp = tmp.cpu(); suggested that can matter). The pixel values just would not land in the tensor correctly. What was going on?
After much head-scratching: maybe the Mat's type needs converting as well.
Mat m_tmp = grayMat.clone();
m_tmp.convertTo(m_tmp,CV_32FC1);/////又是個大坑 圖片要先轉float32啊
torch::Tensor label_deal = torch::from_blob(
m_tmp.data, {grayMat.rows, grayMat.cols}).toType(torch::kByte).to(m_device);
That fixed it! Does it really have to be CV_32FC1? Yes, and here is why: torch::from_blob with no dtype option interprets the raw buffer as float, so the Mat's data must actually be 32-bit float (alternatively, pass the matching dtype, e.g. torch::kByte, to from_blob and skip the convertTo).
32. Accessing tensor data through a pointer
torch::Tensor output = m_model->forward({input_tensor}).toTensor()[0];
torch::Tensor output_cpu = output.cpu();
//output_cpu Variable[CPUFloatType] [26, 480, 480]]
output_cpu.print();
void *ptr = output_cpu.data_ptr();
//std::cout<<(float*)ptr[0]<<std::endl;
The pointer can only be declared as void* (or auto); anything else fails to compile. For example, float *ptr = output_cpu.data_ptr(); gives:
error: invalid conversion from ‘void’ to ‘float’ [-fpermissive]
float *ptr = output_cpu.data_ptr();
So void * compiles, but I still need to read the tensor's data through that pointer!
torch::Tensor output = m_model->forward({input_tensor}).toTensor()[0];
torch::Tensor output_cpu = output.cpu();
//output_cpu Variable[CPUFloatType] [26, 480, 480]]
output_cpu.print();
void *ptr = output_cpu.data_ptr();
std::cout<<(float*)ptr<<std::endl;
Written as above, the output is:
[Variable[CPUFloatType] [26, 480, 480]]
0x7fab195ee040
That prints an address. So how to reach the data? The natural attempt:
std::cout<<(float)ptr[0]<<std::endl;
fails again with:
error: 'void' is not a pointer-to-object type
and std::cout<<(float*)ptr[0][0][0]<<std::endl; gives the same error. A quick Google turned up the identical error together with a fix.
And indeed, that solved it!
void *ptr = output_cpu.data_ptr();
// std::cout<<*((float*)ptr[0][0][0])<<std::endl;
// std::cout<<(float*)ptr[0][0][0]<<std::endl;
std::cout<<*((float*)ptr + 2)<<std::endl; // third float; cast before the arithmetic (pointer arithmetic on void* is a GNU extension and steps in bytes)
There is another way to write it:
const float* result = reinterpret_cast<const float *>(output_cpu.data_ptr());
And the style from just now:
void *ptr = output_cpu.data_ptr();
const float* result = (float*)ptr;
33. Comparing ways to assign to a Tensor by index in PyTorch
[Comparing ways to assign to a Tensor by index in PyTorch](https://www.jianshu.com/p/e568213c8501)
44. Returning multiple tensors (pytorch side) and retrieving them (libtorch side)
The pytorch-side output:
def forward(self, x, batch=None):
output, cnn_feature = self.dla(x)
return (output['ct_hm'],output['wh'],cnn_feature)
The corresponding libtorch side:
auto out = m_model->forward({input_tensor});
auto tpl = out.toTuple();
auto out_ct_hm = tpl->elements()[0].toTensor();
out_ct_hm.print();
auto out_wh = tpl->elements()[1].toTensor();
out_wh.print();
auto out_cnn_feature = tpl->elements()[2].toTensor();
out_cnn_feature.print();
If a single tensor is returned, it is simply:
at::Tensor output = module->forward(inputs).toTensor();
45. torch::Tensor as a function parameter: with or without &, operations on the parameter inside the function affect the original tensor; it always behaves like a reference
void test_tensor(torch::Tensor a)
{
a[0][0] = -100;
}
int main(int argc, const char* argv[])
{
torch::Tensor p = torch::rand({2,2});
std::cout<<p<<std::endl;
std::cout<<"~~~~#########~~~~~~~~~~~~~~~~~~~~~~~~~~"<<std::endl;
test_tensor(p);
std::cout<<p<<std::endl;
while (1);
}
Output:
0.0509 0.3509
0.8019 0.1350
[ Variable[CPUType]{2,2} ]
~~~~#########~~~~~~~~~~~~~~~~~~~~~~~~~~
-100.0000 0.3509
0.8019 0.1350
[ Variable[CPUType]{2,2} ]
As you can see, although void test_tensor(torch::Tensor a) does not take a reference, the tensor's value changed after the call!
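A plausible mental model for this behavior (a simplified analogy, not libtorch's actual internals): torch::Tensor is a reference-counted handle to shared storage, so a by-value copy duplicates the handle, not the data. A minimal plain C++ sketch with made-up names:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// FakeTensor is a made-up type for illustration: a handle that
// points at reference-counted storage, like torch::Tensor does.
struct FakeTensor {
    std::shared_ptr<std::vector<float>> storage;
};

// Pass-by-value, like test_tensor(torch::Tensor a) above: the handle
// is copied, but it still points at the caller's storage.
inline void modify(FakeTensor t) {
    (*t.storage)[0] = -100.0f;  // writes through the shared storage
}
```

To give a callee its own copy in real libtorch, pass a.clone() instead of a; clone allocates fresh storage.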
46. Reproducing pytorch's fancy indexing
On the pytorch side the code is simply:
c=b[a]
where a has shape [1,100] and b has shape [1,100,40,2]; so guess the shape of c. One more given: a acts as a mask whose values are only 0 or 1, and suppose its first 5 values are 1 and the rest are 0.
c comes out with shape [5,40,2]: the rows where the mask is 1 are kept, the rest dropped. So how to express this elegantly in libtorch?
I could not think of anything clean at the time; this libtorch version has no such subscript operation, so it is awkward. I wrote it with loops.
To keep the values easy to inspect, the example below uses only 10 entries:
// aim_ [1,10,2,2], ind_mask_ [1,10] with 0/1 entries (here 6 ones) -> result [6,2,2], i.e. pytorch's aim = aim[ind_mask]
torch::Tensor deal_mask_index22(torch::Tensor aim_,torch::Tensor ind_mask_)
{
torch::Tensor aim = aim_.clone().squeeze(0);//[1,10,2,2] -->> [10,2,2]
torch::Tensor ind_mask = ind_mask_.clone().squeeze(0);//[1,10] -->> [10]
int row = ind_mask.size(0);
int cnt = 0;
for(int i=0;i<row;i++)
{
if(ind_mask[i].item().toInt())
{
cnt += 1;
}
}
torch::Tensor out = torch::zeros({cnt,aim.size(1),aim.size(2)});
int index_ = 0;
for(int i=0;i<row;i++)
{
if(ind_mask[i].item().toInt())
{
out[index_++] = aim[i];
// std::cout<<i<<std::endl;
}
}
std::cout<<"##############################################"<<std::endl;
std::cout<<out<<std::endl;
return out;
}
int main(int argc, const char* argv[])
{
torch::Tensor ind_mask = torch::ones({1,10});
ind_mask[0][0] = 0;
ind_mask[0][1] = 0;
ind_mask[0][2] = 0;
ind_mask[0][4] = 0;
torch::Tensor aim = torch::rand({1,10,2,2});
std::cout<<aim<<std::endl;
deal_mask_index22(aim,ind_mask);
while (1);
}
47. Verifying tensor precision between pytorch and libtorch
[Verifying tensor precision between pytorch and libtorch](https://www.cnblogs.com/yanghailin/p/13669046.html)
48. Miscellaneous: colormap rendering
/////////////////////////////////////////////////////////////////////////////////////////////////
auto t1 = std::chrono::steady_clock::now();
// static torch::Tensor tensor_m0 = torch::zeros({m_height,m_width}).to(torch::kByte).to(torch::kCPU);
// static torch::Tensor tensor_m1 = torch::zeros({m_height,m_width}).to(torch::kByte).to(torch::kCPU);
// static torch::Tensor tensor_m2 = torch::zeros({m_height,m_width}).to(torch::kByte).to(torch::kCPU);
static torch::Tensor tensor_m0 = torch::zeros({m_height,m_width}).to(torch::kByte);
static torch::Tensor tensor_m1 = torch::zeros({m_height,m_width}).to(torch::kByte);
static torch::Tensor tensor_m2 = torch::zeros({m_height,m_width}).to(torch::kByte);
tensor_m0 = tensor_m0.to(torch::kCUDA);
tensor_m1 = tensor_m1.to(torch::kCUDA);
tensor_m2 = tensor_m2.to(torch::kCUDA);
for(int i=1;i<m_color_cnt;i++)
{
tensor_m0.masked_fill_(index==i,colormap[i * 3]);
tensor_m1.masked_fill_(index==i,colormap[i * 3 + 1]);
tensor_m2.masked_fill_(index==i,colormap[i * 3 + 2]);
}
torch::Tensor tensor_m00 = tensor_m0.cpu();
torch::Tensor tensor_m11 = tensor_m1.cpu();
torch::Tensor tensor_m22 = tensor_m2.cpu();
cv::Mat m0 = cv::Mat(m_height, m_width, CV_8UC1, (uchar*)tensor_m00.data_ptr());
cv::Mat m1 = cv::Mat(m_height, m_width, CV_8UC1, (uchar*)tensor_m11.data_ptr());
cv::Mat m2 = cv::Mat(m_height, m_width, CV_8UC1, (uchar*)tensor_m22.data_ptr());
std::vector<cv::Mat> channels = {m0,m1,m2};
cv::Mat mergeImg;
cv::merge(channels, mergeImg);
mergeImg = mergeImg.clone();
auto ttt1 = std::chrono::duration_cast<std::chrono::milliseconds>
(std::chrono::steady_clock::now() - t1).count();
std::cout << "merge time="<<ttt1<<"ms"<<std::endl;
/////////////////////////////////////////////////////////////////////////////////////////////
The CPU version takes about 35 ms, the GPU version 2-3 ms. The plain-OpenCV code below implements the same thing and also runs in 2-3 ms:
auto t0 = std::chrono::steady_clock::now();
for (int i = 0; i<labelMat.rows; i++)
{
for (int j = 0; j<labelMat.cols; j++)
{
int id = labelMat.at<uchar>(i,j);
if(0 == id)
{
continue;
}
colorMat.at<cv::Vec3b>(i, j)[0] = colormap[id * 3];
colorMat.at<cv::Vec3b>(i, j)[1] = colormap[id * 3 + 1];
colorMat.at<cv::Vec3b>(i, j)[2] = colormap[id * 3 + 2];
}
}
auto ttt = std::chrono::duration_cast<std::chrono::milliseconds>
(std::chrono::steady_clock::now() - t0).count();
std::cout << "consume time="<<ttt<<"ms"<<std::endl;
49.torch.gather
Pure pytorch first (adapted from https://www.jianshu.com/p/5d1f8cd5fe31):
torch.gather(input, dim, index, out=None) → Tensor
Gathers values along the given axis dim at the positions specified by the index tensor.
For a 3-D tensor the output is defined as:
out[i][j][k] = input[index[i][j][k]][j][k] # if dim == 0
out[i][j][k] = input[i][index[i][j][k]][k] # if dim == 1
out[i][j][k] = input[i][j][index[i][j][k]] # if dim == 2
Parameters:
input (Tensor) – the source tensor
dim (int) – the axis to index along
index (LongTensor) – indices of the elements to gather (must be a torch.LongTensor)
out (Tensor, optional) – the destination tensor
Examples:
dim = 1
import torch
a = torch.randint(0, 30, (2, 3, 5))
print(a)
#tensor([[[ 18., 5., 7., 1., 1.],
# [ 3., 26., 9., 7., 9.],
# [ 10., 28., 22., 27., 0.]],
# [[ 26., 10., 20., 29., 18.],
# [ 5., 24., 26., 21., 3.],
# [ 10., 29., 10., 0., 22.]]])
index = torch.LongTensor([[[0,1,2,0,2],
[0,0,0,0,0],
[1,1,1,1,1]],
[[1,2,2,2,2],
[0,0,0,0,0],
[2,2,2,2,2]]])
print(a.size()==index.size())
b = torch.gather(a, 1,index)
print(b)
#True
#tensor([[[ 18., 26., 22., 1., 0.],
# [ 18., 5., 7., 1., 1.],
# [ 3., 26., 9., 7., 9.]],
# [[ 5., 29., 10., 0., 22.],
# [ 26., 10., 20., 29., 18.],
# [ 10., 29., 10., 0., 22.]]])
dim =2
c = torch.gather(a, 2,index)
print(c)
#tensor([[[ 18., 5., 7., 18., 7.],
# [ 3., 3., 3., 3., 3.],
# [ 28., 28., 28., 28., 28.]],
# [[ 10., 20., 20., 20., 20.],
# [ 5., 5., 5., 5., 5.],
# [ 10., 10., 10., 10., 10.]]])
dim = 0
index2 = torch.LongTensor([[[0,1,1,0,1],
[0,1,1,1,1],
[1,1,1,1,1]],
[[1,0,0,0,0],
[0,0,0,0,0],
[1,1,0,0,0]]])
d = torch.gather(a, 0,index2)
print(d)
#tensor([[[ 18., 10., 20., 1., 18.],
# [ 3., 24., 26., 21., 3.],
# [ 10., 29., 10., 0., 22.]],
# [[ 26., 5., 7., 1., 1.],
# [ 3., 26., 9., 7., 9.],
# [ 10., 29., 22., 27., 0.]]])
I had read this before yet was baffled again on rereading, hence this note. The key line is:
out[i][j][k] = input[i][index[i][j][k]][k] # if dim == 1
So what is gather actually for? Note first that the output has the same shape as index (not necessarily as input). Derive an entry or two by hand, e.g. for dim=1:
output[0][0][0] = input[0][ index[0][0][0] ][0]: first look up index[0][0][0] = 0, then read input[0][0][0].
That is the whole flow. index holds subscripts, so its values must be smaller than the size of dimension dim!
Intuitively, gather applies a new mapping rule along one dimension to produce the output, and the index is that rule.
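To make the dim=1 rule concrete, here is a plain C++ sketch (the function name is my own) that implements out[i][j][k] = input[i][index[i][j][k]][k] on row-major buffers:

```cpp
#include <cassert>
#include <vector>

// gather along dim == 1:
//   out[i][j][k] = input[i][ index[i][j][k] ][k]
// input is [D0,D1,D2]; index and out are [D0,J,D2]; all row-major.
// Every value stored in `index` must be < D1.
std::vector<int> gather_dim1(const std::vector<int>& input,
                             const std::vector<int>& index,
                             int D0, int D1, int D2, int J) {
    std::vector<int> out(index.size());
    for (int i = 0; i < D0; ++i)
        for (int j = 0; j < J; ++j)
            for (int k = 0; k < D2; ++k) {
                int flat = (i * J + j) * D2 + k;   // position in index/out
                int row  = index[flat];            // index[i][j][k]
                out[flat] = input[(i * D1 + row) * D2 + k];
            }
    return out;
}
```

For instance, with input of shape [1,3,2] = [[10,11],[20,21],[30,31]] and index of shape [1,2,2] = [[2,0],[1,1]], the output is [[30,11],[20,21]]: each output entry keeps its own (i,k) position and only the middle subscript is looked up in index.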
50. torch::argsort (absent in libtorch 1.0) and torch::sort
A libtorch project written against 1.1 had to be ported to 1.0 for this project. The 1.0 compiler reported:
error: ‘argsort’ is not a member of ‘torch’
Right, a version issue; but where to find argsort? I remembered that max also returns indices, then spotted sort, tried it, and its index output matches argsort:
//pytorch1.1
torch::Tensor edge_idx_sort2 = torch::argsort(edge_num, 2, true);
//pytorch1.0
std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(edge_num, 2, true);
// torch::Tensor v = std::get<0>(sort_ret);
torch::Tensor edge_idx_sort = std::get<1>(sort_ret);
51. Checking whether a tensor is empty: ind_mask.sizes().empty()
int row = ind_mask.size(0);
If ind_mask is empty, the code crashes with:
terminate called after throwing an instance of 'c10::Error'
what(): dimension specified as 0 but tensor has no dimensions (maybe_wrap_dim at /data_1/leon_develop/pytorch/aten/src/ATen/core/WrapDimMinimal.h:9)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6a (0x7f4cf0a4af5a in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libc10.so)
frame #1: <unknown function> + 0x48a74f (0x7f4d010af74f in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libcaffe2.so)
frame #2: at::native::size(at::Tensor const&, long) + 0x20 (0x7f4d010afac0 in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libcaffe2.so)
frame #3: at::Tensor::size(long) const + 0x36 (0x467fba in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #4: deal_mask_index(at::Tensor, at::Tensor) + 0x1a7 (0x45a83e in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #5: get_gcn_feature(at::Tensor, at::Tensor, at::Tensor, int, int) + 0x4f3 (0x45e092 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #6: init_poly(std::shared_ptr<torch::jit::script::Module> const&, std::shared_ptr<torch::jit::script::Module> const&, at::Tensor const&, std::tuple<at::Tensor, at::Tensor, at::Tensor> const&) + 0x168 (0x45e777 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #7: main + 0xaee (0x463ab5 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #8: __libc_start_main + 0xf0 (0x7f4ced29c840 in /lib/x86_64-linux-gnu/libc.so.6)
frame #9: _start + 0x29 (0x456b89 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
So the tensor needs an emptiness check. However:
ind_mask.numel() // total element count, but a zero-dimensional tensor returns 1
ind_mask.sizes() // returns something list-like, e.g. [1, 100, 40, 2] or [1, 40, 2]
Following sizes() into the libtorch headers: it returns an IntList; tracing further, using IntList = ArrayRef<int64_t>;, and the ArrayRef class contains:
/// empty - Check if the array is empty.
constexpr bool empty() const {
return Length == 0;
}
So there is a member function for the emptiness check that can be called directly:
if(ind_mask.sizes().empty())
{
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
It gets worse!
I thought that had settled it:
if(ind_mask.sizes().empty())
{
torch::Tensor tmp;
return tmp;
}
When a tensor is judged empty I create a fresh tensor and return it, since the function's return type is torch::Tensor.
But calling sizes() on such a default-constructed tensor also throws!
Like this:
torch::Tensor tmp;
tmp.print(); //打印[UndefinedTensor]
if(tmp.sizes().empty())
{
}
[UndefinedTensor]
terminate called after throwing an instance of 'c10::Error'
what(): sizes() called on undefined Tensor (sizes at /data_1/leon_develop/pytorch/aten/src/ATen/core/UndefinedTensorImpl.cpp:12)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6a (0x7f35f1b21f5a in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libc10.so)
frame #1: at::UndefinedTensorImpl::sizes() const + 0x77 (0x7f360217d6b7 in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libcaffe2.so)
frame #2: at::Tensor::sizes() const + 0x27 (0x45e921 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #3: main + 0x55 (0x45bcaa in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #4: __libc_start_main + 0xf0 (0x7f35ee373840 in /lib/x86_64-linux-gnu/libc.so.6)
frame #5: _start + 0x29 (0x44f889 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
But at the same time:
torch::Tensor tmp;
tmp.print();
std::cout<<tmp.numel()<<std::endl; // prints 0
!!!!
So for a default-constructed (undefined) tensor, .numel() is 0. (An even more direct check for this case is tmp.defined(), which returns false exactly for an undefined tensor.)
52. pytorch code out = aim[ind_mask], written in libtorch
pytorch code:
out = aim[ind_mask]
with shapes:
aim [21, 40, 2]
ind_mask [21] # elements are 0 or 1; say 12 of them are 1
out comes out with shape [12,40,2]
#####################################
The pytorch line out = aim[ind_mask]
expressed in libtorch:
torch::Tensor a = torch::rand({5,3,2});
torch::Tensor idx = torch::zeros({5}).toType(torch::kLong);
idx[3] = 1;
idx[1] = 1;
torch::Tensor abc = torch::nonzero(idx);
torch::Tensor b = a.index_select(0,abc.squeeze());
std::cout<<a<<std::endl;
std::cout<<abc<<std::endl;
std::cout<<b<<std::endl;
Output:
(1,.,.) =
0.1767 0.8695
0.3779 0.3531
0.3413 0.3734
(2,.,.) =
0.9664 0.7723
0.8640 0.7289
0.8395 0.6344
(3,.,.) =
0.9043 0.2671
0.9901 0.2966
0.0347 0.1650
(4,.,.) =
0.1457 0.1169
0.7983 0.5157
0.6405 0.2213
(5,.,.) =
0.7977 0.4066
0.6691 0.7191
0.5897 0.7400
[ Variable[CPUFloatType]{5,3,2} ]
1
3
[ Variable[CPULongType]{2,1} ]
(1,.,.) =
0.9664 0.7723
0.8640 0.7289
0.8395 0.6344
(2,.,.) =
0.1457 0.1169
0.7983 0.5157
0.6405 0.2213
[ Variable[CPUFloatType]{2,3,2} ]
53. pytorch code a4 = arr[...,3,0] expressed in libtorch: a job for masked_select!
>>> import numpy as np
>>> arr = np.arange(40).reshape(1,5,4,2)
>>> arr
array([[[[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7]],
[[ 8, 9],
[10, 11],
[12, 13],
[14, 15]],
[[16, 17],
[18, 19],
[20, 21],
[22, 23]],
[[24, 25],
[26, 27],
[28, 29],
[30, 31]],
[[32, 33],
[34, 35],
[36, 37],
[38, 39]]]])
>>> a1 = arr[...,0,1]
>>> a2 = arr[...,1,0]
>>> a3 = arr[...,2,1]
>>> a4 = arr[...,3,0]
>>> print(a1)
[[ 1 9 17 25 33]]
>>> print(a2)
[[ 2 10 18 26 34]]
>>> print(a3)
[[ 5 13 21 29 37]]
>>> print(a4)
[[ 6 14 22 30 38]]
>>>
At first I struggled for a while without finding a good approach, so I did it with a for loop:
//ex shape[1,5,4,2] ex[..., 0, 1] -->>[1,5]
torch::Tensor index_tensor_3(const torch::Tensor &ex,const int &idx1,const int &idx2)
{
// ex.print();
int dim_ = ex.size(1);
torch::Tensor out = torch::empty({1,dim_}).to(ex.device());
int size_ = ex.size(1);
for(int i=0;i<size_;i++)
{
auto a = ex[0][i][idx1][idx2];
out[0][i] = a;
// std::cout<<a<<std::endl;
}
return out;
}
Then I optimized it using pure libtorch functions:
//ex shape[1,5,4,2] ex[..., 0, 1] -->>[1,5]
torch::Tensor index_tensor_3(const torch::Tensor &ex,const int &idx1,const int &idx2)
{
const int dim0 = ex.size(0);
const int dim1 = ex.size(1);
const int dim2 = ex.size(2);
const int dim3 = ex.size(3);
std::vector<int> v_index(ex.numel());// initialized to ex.numel() zeros
int offset = dim2 * dim3;
for(int i=0;i<dim1;i++)
{
int index_ = idx1 * dim3 + idx2;
v_index[i * offset + index_] = 1;
}
torch::Tensor index = torch::tensor(v_index).to(ex.device());
index = index.reshape(ex.sizes()).toType(torch::kByte);// kByte type is required here
// std::cout<<index<<std::endl;
torch::Tensor selete = ex.masked_select(index).unsqueeze(0);
return selete;
}
In my code this function is called about 10 times in total; the first version takes about 15 ms, the second about 5 ms.
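The mask trick works because, in a row-major [1,dim1,dim2,dim3] buffer, ex[..., idx1, idx2] is exactly the set of elements at flat offset i*dim2*dim3 + idx1*dim3 + idx2 for each i. A plain C++ sketch of that selection (the function name is my own):

```cpp
#include <cassert>
#include <vector>

// For a row-major [1,dim1,dim2,dim3] buffer, ex[..., idx1, idx2] picks,
// for every i in [0, dim1), the element at flat offset
//   i * dim2 * dim3 + idx1 * dim3 + idx2
// which is exactly where the v_index mask above places its 1s.
std::vector<int> select_last_two(const std::vector<int>& ex,
                                 int dim1, int dim2, int dim3,
                                 int idx1, int idx2) {
    std::vector<int> out(dim1);
    for (int i = 0; i < dim1; ++i)
        out[i] = ex[i * dim2 * dim3 + idx1 * dim3 + idx2];
    return out;
}
```

Run against the arange(40).reshape(1,5,4,2) data from the numpy session above, select_last_two(arr, 5, 4, 2, 3, 0) reproduces a4 = [6, 14, 22, 30, 38].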
54. Once more: types matter!! Sometimes you must force the conversion: kernel = kernel.toType(torch::kByte);
Today's task was to run a pt model exported from libtorch 1.0 under libtorch 1.8. With a few syntax tweaks the old code compiled and ran on the new version, but the results were wrong, which is the painful kind of bug because nothing hints at where the problem is. The first suspect was an incompatibility, so the check was: run inference on both versions with identical input and compare the model outputs. That was itself a chore, because the new pytorch had to reproduce the old run and plenty had to change. This is psenet, which runs on cuda8 / python2.7, so it was not just print statements but all sorts of issues, mostly data-processing code pulling in various libraries. In the end I deleted all of it, because inference boils down to
out = model(img)
so I only needed to prepare an identical img. The very long test.py condensed down to:
#encoding=utf-8
import os
import cv2
import sys
import time
import collections
import torch
import argparse
import numpy as np
import models
#import util
def test(args):
# Setup Model
if args.arch == "resnet50":
model = models.resnet50(pretrained=True, num_classes=7, scale=args.scale)
elif args.arch == "resnet101":
model = models.resnet101(pretrained=True, num_classes=7, scale=args.scale)
elif args.arch == "resnet152":
model = models.resnet152(pretrained=True, num_classes=7, scale=args.scale)
for param in model.parameters():
param.requires_grad = False
model = model.cuda()
if args.resume is not None:
if os.path.isfile(args.resume):
print("Loading model and optimizer from checkpoint '{}'".format(args.resume))
checkpoint = torch.load(args.resume)
# model.load_state_dict(checkpoint['state_dict'])
d = collections.OrderedDict()
for key, value in checkpoint['state_dict'].items():
tmp = key[7:]
d[tmp] = value
model.load_state_dict(d)
print("Loaded checkpoint '{}' (epoch {})"
.format(args.resume, checkpoint['epoch']))
sys.stdout.flush()
else:
print("No checkpoint found at '{}'".format(args.resume))
sys.stdout.flush()
model.eval()
img_tmp = torch.rand(1, 3, 963, 1280).cuda()
traced_script_module = torch.jit.trace(model, img_tmp)
traced_script_module.save("./myfile/22.pt")
init_seed = 1 # fix the seed so both runs see identical random input
torch.manual_seed(init_seed)
torch.cuda.manual_seed(init_seed)
img_tmp = torch.rand(1, 3, 64, 64).cuda()
out = model(img_tmp)
print(img_tmp)
print(out)
print("save pt ok!")
return 1
if __name__ == '__main__':
parser = argparse.ArgumentParser(description='Hyperparams')
parser.add_argument('--arch', nargs='?', type=str, default='resnet50')
parser.add_argument('--resume', nargs='?', type=str, default="./myfile/checkpoint.pth.tar",
help='Path to previous saved model to restart from')
parser.add_argument('--binary_th', nargs='?', type=float, default=1.0,
help='Path to previous saved model to restart from')
parser.add_argument('--kernel_num', nargs='?', type=int, default=3,
help='Path to previous saved model to restart from')
parser.add_argument('--scale', nargs='?', type=int, default=1,
help='Path to previous saved model to restart from')
parser.add_argument('--long_size', nargs='?', type=int, default=1280,
help='Path to previous saved model to restart from')
parser.add_argument('--min_kernel_area', nargs='?', type=float, default=10.0,
help='min kernel area')
parser.add_argument('--min_area', nargs='?', type=float, default=300.0,
help='min area')
parser.add_argument('--min_score', nargs='?', type=float, default=0.93,
help='min score')
args = parser.parse_args()
test(args)
This part matters:
init_seed = 1 # fix the seed so both runs see identical random input
torch.manual_seed(init_seed)
torch.cuda.manual_seed(init_seed)
Since the model had to be validated on both torch 1.0 and torch 1.8, the inputs had to be identical, so both the CPU and CUDA RNGs are seeded the same way; printing the inputs confirmed they matched.
The pytorch outputs then differed only from the third decimal place on, with the leading digits identical, so loading the old weights under the new version is fine. But the libtorch results differed wildly. Why? Time to read the libtorch code carefully!
Then, experimenting and printing aimlessly (printing really matters, by the way!),
the old-version libtorch printed, in part:
[ Variable[CPUByteType]{7,703,1280} ]
[Variable[CPUByteType] [7, 703, 1280]]
[Variable[CPUByteType] [3, 703, 1280]]
kernel_size=3
[Variable[CPUByteType] [3, 703, 1280]]
And the new version printed:
[CUDAFloatType [1, 7, 703, 1280]]
[CPUFloatType [7, 703, 1280]]
[CPUFloatType [3, 703, 1280]]
kernel_size=3
[CPUFloatType [3, 703, 1280]]
See it? The data types differ, so it is a type problem again.
Adding this line:
kernel = kernel.toType(torch::kByte);
fixed it completely.
Some operations default to CPUByteType on the old version but to CPUFloatType on the new one.
One innocent-looking line cost me half a day!
To sum up the debugging flow: keep narrowing the problem down and keep experimenting until it is located and solved.
Here is one more recent data-type problem, this time with OpenCV's Mat:
Mat convertTo3Channels_2(const Mat& binImg)
{
Mat three_channel = Mat::zeros(binImg.rows,binImg.cols,CV_8UC3);
vector<Mat> channels;
for (int i=0;i<3;i++)
{
channels.push_back(binImg);
}
merge(channels,three_channel);
three_channel.convertTo(three_channel,CV_8UC3); //重要,還要再寫一次!!
return three_channel;
}
Look at the code.
I declared three_channel as CV_8UC3 up front, Mat three_channel = Mat::zeros(binImg.rows,binImg.cols,CV_8UC3);, because the caller needs uint8 data.
Yet the trailing
three_channel.convertTo(three_channel,CV_8UC3); // important, it has to be written again!
is still required; without it the returned Mat is not that type. I don't know how to print a Mat's type directly, but my debugger (gdb imagewatch) shows the type below the image.
I reproduced it: with a breakpoint right after merge(channels,three_channel);, imagewatch shows the image as float, even though three_channel was initialized as CV_8UC3. cv::merge reallocates its output to match the type of the input channels, so it silently changed the type,
which made the later operations on the returned Mat behave strangely with no obvious cause.
A forced conversion afterwards fixes it:
three_channel.convertTo(three_channel,CV_8UC3); // important, write it again!
Summary:
Types matter.
Types matter.
Types matter.
Important things get said three times.