Preface
During model deployment, different hardware may support different model frameworks. This article walks through converting a PyTorch model file to an ONNX model file, based on the Pytorch_Unet implementation: the trained model is converted to ONNX, and the ONNX model's output is then tested.
Steps
1. Convert the trained .pth file to an ONNX model;
2. Check and validate the ONNX model;
3. Test the ONNX model on input data;
Implementation
1. Convert the trained .pth file to an ONNX model
The model is built on the UNet architecture and trained on the Carvana dataset.
import io
import torch
import torch.onnx
from unet import UNet
import onnx
import onnxruntime
import numpy as np
from PIL import Image
import torchvision.transforms as transforms
from utils.dataset import BasicDataset
The conversion:
def test():
    model = UNet(n_channels=3, n_classes=1)
    batch_size = 1
    input_shape = (3, 640, 959)

    # Initialize model with the pretrained weights
    map_location = lambda storage, loc: storage
    if torch.cuda.is_available():
        map_location = None
    loaded_model = torch.load(pthfile, map_location=map_location)
    model.load_state_dict(loaded_model)

    # set the model to inference mode
    model.eval()

    # data layout is NCHW
    x = torch.rand(batch_size, *input_shape)
    input_names = ['input']
    output_names = ['output']

    # Export the model
    torch.onnx.export(model,                     # model being run
                      x,                         # model input (or a tuple for multiple inputs)
                      onnxpath,                  # where to save the model (can be a file or file-like object)
                      export_params=True,        # store the trained parameter weights inside the model file
                      opset_version=12,          # the ONNX version to export the model to
                      do_constant_folding=True,  # whether to execute constant folding for optimization
                      input_names=input_names,   # the model's input names
                      output_names=output_names, # the model's output names
                      dynamic_axes={'input': {0: 'batch_size'},    # variable-length axes
                                    'output': {0: 'batch_size'}})
Input data and paths:
pthfile = 'xxx/Pytorch-UNet/checkpoints/CP_epoch5.pth'
onnxpath = './unet.onnx'
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
This produces the ONNX model file.
Note that the model's declared input size must match the input data used for testing.
Note that it is important to call torch_model.eval() or torch_model.train(False) before exporting, to switch the model to inference mode. This is required because operators such as dropout and batchnorm behave differently in inference and training modes.
Note that unless an axis is declared dynamic, every input dimension in the exported ONNX graph is fixed to the sizes seen at export time.
In this example we export the model with batch_size=1, but then mark the first dimension as dynamic via the dynamic_axes argument of torch.onnx.export(). The exported model therefore accepts inputs of size [batch_size, 3, 640, 959], where batch_size can vary.
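As a quick sanity check of the dynamic axis, the exported model can be fed a different batch size. A minimal sketch (it assumes the unet.onnx exported above; the batch size of 2 is arbitrary):

# feed a batch of 2 to confirm the dynamic 'batch_size' axis is honored
import numpy as np
import onnxruntime

sess = onnxruntime.InferenceSession('./unet.onnx')
batch = np.random.rand(2, 3, 640, 959).astype(np.float32)
outs = sess.run(None, {sess.get_inputs()[0].name: batch})
print(outs[0].shape)  # expected to start with 2, e.g. (2, 1, 640, 959)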
2. Check and validate the ONNX model
Check the model:
onnx.checker.check_model(onnx_model) verifies the model's structure and confirms that the model has a valid schema.
The validity of the ONNX graph is verified by checking the model's version, the graph structure, and the nodes together with their inputs and outputs. If the model is valid, the call returns None.
# check model
onnx_model = onnx.load(onnxpath)
check = onnx.checker.check_model(onnx_model)
print('check: ', check)
Verify that the two models match:
Verify that ONNX Runtime and PyTorch compute the same values for the network.
# check that ONNX Runtime and PyTorch outputs match
ort_session = onnxruntime.InferenceSession(onnxpath)
# compute ONNX Runtime output prediction
ort_inputs = {ort_session.get_inputs()[0].name:to_numpy(x)}
ort_outs = ort_session.run(None, ort_inputs)
# compare ONNX Runtime and PyTorch results
torch_out = model(x)
print('torch_out: ', torch_out.shape)
np.testing.assert_allclose(to_numpy(torch_out), ort_outs[0], rtol=1e-03, atol=1e-05)
print("Exported model has been tested with ONNXRuntime, and the result looks good!")
The outputs of PyTorch and ONNX Runtime match numerically within the given tolerances (rtol/atol).
Note that the test data must have the same size as the model's declared input.
Question: why is the model's output ort_outs[0], i.e. apparently one level deeper than expected? The answer is the return type, not an extra tensor dimension: InferenceSession.run() returns a Python list with one NumPy array per model output, so a single-output model still comes back wrapped in a one-element list.
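A minimal sketch that makes the list structure visible (it assumes the ort_session and ort_inputs from above):

ort_outs = ort_session.run(None, ort_inputs)
print(type(ort_outs), len(ort_outs))  # <class 'list'> 1 -> one entry per model output
print(ort_outs[0].shape)              # the actual NCHW array, e.g. (1, 1, 640, 959)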
This verifies that the model outputs before and after conversion are consistent; take care that the pretrained weights are loaded correctly:
# Initialize model with the pretrained weights
map_location = lambda storage, loc: storage
if torch.cuda.is_available():
    map_location = None
loaded_model = torch.load(pthfile, map_location=map_location)
model.load_state_dict(loaded_model)

# set the model to inference mode
model.eval()
3. Test the ONNX model on input data
import io
import torch
import torch.onnx
from unet import UNet
import onnx
import onnxruntime
import numpy as np
from PIL import Image
import torchvision.transforms as transforms
from utils.dataset import BasicDataset

pthfile = 'xxx/Pytorch-UNet/checkpoints_carvana/CP_epoch5.pth'
onnxpath = './unet.onnx'
imgpath = 'xxx/Pytorch-UNet/output/0cdf5b5d0ce1_01.jpg'
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

def test_onnx():
    full_img = Image.open(imgpath)
    ort_session = onnxruntime.InferenceSession(onnxpath)

    # preprocess: scale and convert to a NCHW float tensor
    scale_factor = 0.5
    img = torch.from_numpy(BasicDataset.preprocess(full_img, scale_factor))
    img = img.to(device=device, dtype=torch.float32)
    img.unsqueeze_(0)

    # ONNX Runtime inference
    ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(img)}
    ort_outs = ort_session.run(None, ort_inputs)  # returns a list

    # post-process
    img_out = ort_outs[0]
    img_out = torch.from_numpy(img_out)
    # probs = torch.nn.functional.softmax(img_out, dim=1)
    probs = torch.sigmoid(img_out)
    probs = probs.squeeze(0)

    tf = transforms.Compose(
        [
            transforms.ToPILImage(),
            transforms.Resize(full_img.size[1]),
            transforms.ToTensor()
        ]
    )
    probs = tf(probs.cpu())
    full_mask = probs.squeeze().cpu().numpy()

    # threshold the probability map into a binary mask
    mask_thres = 0.5
    mask_out = (full_mask > mask_thres)

    # save image
    img_out = Image.fromarray((mask_out * 255).astype(np.uint8))
    img_out.save('./img/onnx_img.jpg')

if __name__ == '__main__':
    test_onnx()
Problems
1. ONNX version issue; see here;
File "xxx/miniconda3/envs/open_mmlab/lib/python3.8/site-packages/torch/onnx/symbolic_helper.py", line 80, in _parse_arg raise RuntimeError("Failed to export an ONNX attribute '" + v.node().kind() + RuntimeError: Failed to export an ONNX attribute 'onnx::Cast', since it's not constant, please try to make things (e.g., kernel size) static if possible
Notes
After checking the official PyTorch documentation, I found that upsample here only supports the nearest mode, while I had been using bilinear; after changing it, the results lined up.
Suggestion: check the official documentation first to see which operators are supported and which are not, and don't use functional-style Function ops; use the layers in torch.nn instead. A sketch of the change is shown below.
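For the upsample case, the fix amounts to something like the following (a sketch, assuming an nn.Upsample layer like the one in the bilinear variant of Pytorch-UNet):

import torch
import torch.nn as nn

x = torch.rand(1, 64, 16, 16)

# before: bilinear upsampling, whose exported ONNX output disagreed with PyTorch here
up_bilinear = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)

# after: nearest mode, for which PyTorch and ONNX Runtime agreed
up_nearest = nn.Upsample(scale_factor=2, mode='nearest')

print(up_nearest(x).shape)  # (1, 64, 32, 32)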
My guess is that some operations in the network are implemented differently in PyTorch and in ONNX Runtime. This is not fully resolved yet; how can the source of the mismatch be traced?
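One way to trace it is to compare intermediate activations layer by layer: capture the PyTorch side with forward hooks, expose the corresponding tensors as extra ONNX outputs (or export sub-models), and find the first layer where the two runtimes diverge. A rough sketch of the PyTorch side (hypothetical; it reuses the model and x from the export step):

# record the output of every leaf module; the first activation that disagrees
# with the matching ONNX tensor localizes the mismatch
acts = {}

def make_hook(name):
    def hook(module, inputs, output):
        acts[name] = output.detach().cpu().numpy()
    return hook

for name, module in model.named_modules():
    if len(list(module.children())) == 0:  # leaf modules only
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    model(x)

for name, act in acts.items():
    print(name, act.shape)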
References
2. Pytorch_ONNX_doc;
3. Carvana_dataset;
4. Unet;
5. github_onnx_q;
End.
