I previously wrote up one deployment approach, "TensorRT-based deployment of YOLO (V3/4/5) models, Approach 1", at the link below:
https://www.cnblogs.com/winslam/p/13816143.html
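Today I tried another open-source approach, in the same environment as before (honestly, I was too lazy to change the environment; there were a few twists and turns, but the test passed in the end).
First, download the project: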
https://github.com/wang-xinyu/tensorrtx
Open D:\tensorrtx-master\yolov5 and get ready to generate a VS2017 project with CMake. Modify CMakeLists.txt as follows:
cmake_minimum_required(VERSION 2.6)

project(yolov5) # 1
set(OpenCV_DIR "D:\\Program Files (x64)\\opencv412\\opencv\\build") #2 # previously pointed at OpenCV with contrib, which errored, so switched to the stock build
set(TRT_DIR "D:\\TensorRT-7.1.3.4") #3

add_definitions(-std=c++11)
option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Debug)

set(THREADS_PREFER_PTHREAD_FLAG ON)
find_package(Threads)

# setup CUDA
find_package(CUDA REQUIRED)
message(STATUS " libraries: ${CUDA_LIBRARIES}")
message(STATUS " include path: ${CUDA_INCLUDE_DIRS}")

include_directories(${CUDA_INCLUDE_DIRS})

# Note: FindCUDA reads CUDA_NVCC_FLAGS; with the "PLAGS" spelling this line has no effect
set(CUDA_NVCC_PLAGS ${CUDA_NVCC_PLAGS};-std=c++11; -g; -G;-gencode; arch=compute_75;code=sm_75)
####
enable_language(CUDA) # add this line, then no need to setup cuda path in vs
####
include_directories(${PROJECT_SOURCE_DIR}/include)
include_directories(${TRT_DIR}\\include)

# -D_MWAITXINTRIN_H_INCLUDED for solving error: identifier "__builtin_ia32_mwaitx" is undefined
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -Wall -Ofast -D_MWAITXINTRIN_H_INCLUDED")

# setup opencv
find_package(OpenCV QUIET
    NO_MODULE
    NO_DEFAULT_PATH
    NO_CMAKE_PATH
    NO_CMAKE_ENVIRONMENT_PATH
    NO_SYSTEM_ENVIRONMENT_PATH
    NO_CMAKE_PACKAGE_REGISTRY
    NO_CMAKE_BUILDS_PATH
    NO_CMAKE_SYSTEM_PATH
    NO_CMAKE_SYSTEM_PACKAGE_REGISTRY
)

message(STATUS "OpenCV library status:")
message(STATUS " version: ${OpenCV_VERSION}")
message(STATUS " libraries: ${OpenCV_LIBS}")
message(STATUS " include path: ${OpenCV_INCLUDE_DIRS}")

include_directories(${OpenCV_INCLUDE_DIRS})
link_directories(${TRT_DIR}\\lib)

#add_executable(yolov5 ${PROJECT_SOURCE_DIR}/yolov5.cpp ${PROJECT_SOURCE_DIR}/yololayer.cu ${PROJECT_SOURCE_DIR}/yololayer.h
# ${PROJECT_SOURCE_DIR}/hardswish.cu ${PROJECT_SOURCE_DIR}/hardswish.h) #4 # original target, removed and replaced by the add_executable below

add_executable(yolov5 ${PROJECT_SOURCE_DIR}/yolov5.cpp
    ${PROJECT_SOURCE_DIR}/yololayer.cu ${PROJECT_SOURCE_DIR}/yololayer.h) #4

target_link_libraries(yolov5 "nvinfer" "nvinfer_plugin") #5
target_link_libraries(yolov5 ${OpenCV_LIBS}) #6
target_link_libraries(yolov5 ${CUDA_LIBRARIES}) #7
target_link_libraries(yolov5 Threads::Threads) #8
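After Configure and Generate, open the solution in the IDE and load the CUDA and TensorRT property sheets.
In yolov5.cpp, specify the full path to the model weights file (this should be the .wts file converted from the original YOLOv5 model; the repo's gen_wts.py script produces it):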
std::map<std::string, Weights> weightMap = loadWeights("D:\\tensorrtx-master\\yolov5\\build\\yolov5s.wts");
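As far as I understand the tensorrtx weight format, a .wts file is plain text: the first line is the number of tensors, and each following line holds a name, an element count, and that many hex-encoded 32-bit floats. The sketch below is a simplified stand-in for the repo's loadWeights that avoids TensorRT types, so the file can be inspected on its own. The names `parseWts` and the returned map layout are my own assumptions, not the repo's API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parse .wts text into a map of tensor name -> float values.
// Assumed layout: "<count>" then, per tensor, "<name> <size> <hex> <hex> ...".
std::map<std::string, std::vector<float>> parseWts(std::istream& in) {
    int count = 0;
    if (!(in >> count) || count < 0) {
        throw std::runtime_error("bad .wts header");
    }
    std::map<std::string, std::vector<float>> tensors;
    for (int i = 0; i < count; ++i) {
        std::string name;
        std::uint32_t size = 0;
        // The element count is decimal; the values that follow are hex.
        if (!(in >> name >> std::dec >> size)) {
            throw std::runtime_error("bad tensor header");
        }
        std::vector<float> values(size);
        for (std::uint32_t k = 0; k < size; ++k) {
            std::uint32_t bits = 0;
            if (!(in >> std::hex >> bits)) {
                throw std::runtime_error("truncated tensor: " + name);
            }
            // Reinterpret the 32-bit pattern as an IEEE-754 float.
            static_assert(sizeof(float) == sizeof(bits), "float must be 32 bits");
            std::memcpy(&values[k], &bits, sizeof(float));
        }
        tensors[name] = std::move(values);
    }
    assert(tensors.size() <= static_cast<std::size_t>(count));
    return tensors;
}
```

To check a real file, open it with std::ifstream and pass the stream to `parseWts`; if the tensor count or a tensor's length looks wrong, the conversion step is the first thing to recheck.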
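The project then builds successfully in Release x64 mode.
(For build instructions, see https://github.com/wang-xinyu/tensorrtx/blob/master/tutorials/run_on_windows.md and remember that there is also an include header file to download.)
Next, work in PowerShell.
Generate the ".engine" file: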
PS D:\tensorrtx-master\yolov5\build\Release> .\yolov5.exe -s
Loading weights: D:\tensorrtx-master\yolov5\build\yolov5s.wts
Building engine, please wait for a while...
[10/15/2020-16:02:24] [W] [TRT] Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
Build engine successfully!
To test, put the images in a newly created samples folder beforehand and copy that folder to the source root:
PS D:\tensorrtx-master\yolov5\build\Release> .\yolov5.exe -d D:\tensorrtx-master\samples
181ms
11ms
11ms
11ms
The result images are written to the folder that contains the exe.
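In the log above the first image takes 181 ms and the rest take 11 ms. I have not profiled this, but a slow first call is commonly due to one-time initialization, so it makes sense to report the steady-state time separately. Below is a minimal sketch of how such timings could be collected with std::chrono; `timeRunsMs` and `steadyStateMeanMs` are hypothetical helpers, and the callable stands in for the real inference call.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: call `run` `repeats` times and return each duration in ms.
std::vector<double> timeRunsMs(const std::function<void()>& run, int repeats) {
    assert(repeats >= 0);
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(repeats));
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        run();
        const auto end = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(end - start).count());
    }
    return ms;
}

// Hypothetical helper: mean of all timings except the first (warm-up) one.
// Returns 0.0 when there are fewer than two samples.
double steadyStateMeanMs(const std::vector<double>& ms) {
    if (ms.size() < 2) {
        return 0.0;
    }
    const double sum = std::accumulate(ms.begin() + 1, ms.end(), 0.0);
    return sum / static_cast<double>(ms.size() - 1);
}
```

Applied to the four timings logged above, the steady-state mean is 11 ms; the 181 ms first call is left out of the average.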
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
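Finally, the project author provides another way to test the environment, which also walks through the whole model deployment workflow.
Link: https://github.com/wang-xinyu/pytorchx
Download the project and put it under D:\tensorrtx-master,
then go into the test project: D:\tensorrtx-master\pytorchx-master
Go into the lenet folder.
Run the following in the Python environment: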
python lenet5.py
python inference.py
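The printed output (shown as a screenshot in the original post) is the expected result.
Next, test in the C++ environment (in theory the output should match the Python output above).
Go into D:\tensorrtx-master\lenet and create a new project from the C++ source there; I just created it in place (lazy).
As in the previous post, load the CUDA and TensorRT property sheets.
Change line 216 from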
std::ofstream p("lenet5.engine");
to:
std::ofstream p("lenet5.engine", std::ios::binary);
(Reference: https://github.com/wang-xinyu/tensorrtx/issues/25)
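The reason the flag matters: a serialized engine is raw bytes, and on Windows a text-mode stream rewrites every 0x0A byte as 0x0D 0x0A on output, so the file on disk no longer matches the buffer. Here is a small round-trip sketch of binary-mode writing and reading; `writeBlob`, `readBlob`, and the file name are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: write raw bytes without newline translation.
bool writeBlob(const std::string& path, const std::vector<char>& data) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return out.good();
}

// Hypothetical helper: read a whole file back as raw bytes.
// Returns an empty vector if the file cannot be opened.
std::vector<char> readBlob(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return {};
    }
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

With std::ios::binary on both sides, a buffer containing 0x0A and 0x0D bytes comes back byte for byte. Dropping the flag on the writer is what produces a larger, unreadable file on Windows.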
Modify line 76 to specify the full path:
std::map<std::string, Weights> weightMap = loadWeights("D:\\tensorrtx-master\\pytorchx-master\\lenet\\lenet5.wts");
Run the following command:
PS D:\tensorrtx-master\yolov5\build\x64\Release> .\lenet.exe -d
Output:
0.0949623, 0.0998472, 0.110072, 0.0975036, 0.0965564, 0.109736, 0.0947979, 0.105618, 0.099228, 0.0916792,
The output here is close to the Python output above, which is as expected: the model conversion succeeded.
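One more quick check that does not need the Python side: the ten values printed above are all between 0 and 1 and add up to roughly 1, which is what one would expect if the network ends in a softmax over ten classes. A minimal sketch of that check follows; `looksLikeProbabilities` and its tolerance are my own assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: true if every value lies in [0, 1] and the values
// sum to 1 within `tol`.
bool looksLikeProbabilities(const std::vector<double>& p, double tol = 1e-4) {
    assert(tol > 0.0);
    double sum = 0.0;
    for (double v : p) {
        if (v < 0.0 || v > 1.0) {
            return false;
        }
        sum += v;
    }
    return std::fabs(sum - 1.0) <= tol;
}
```

Passing the ten numbers from the lenet.exe output to this function returns true. A corrupted engine or a wrong weight file would more likely give values that fail the check, although passing it does not prove the weights are right.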
Reference: https://github.com/wang-xinyu/tensorrtx/blob/master/tutorials/run_on_windows.md