1. Install CUDA
https://developer.nvidia.com/zh-cn/cuda-toolkit
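Select a version (this walkthrough targets 11.2; the command below shows the 11.5.0 sbsa installer, so substitute the URL for the version you chose), then download and install it:
wget https://developer.download.nvidia.com/compute/cuda/11.5.0/local_installers/cuda_11.5.0_495.29.05_linux_sbsa.run
sudo sh cuda_11.5.0_495.29.05_linux_sbsa.run
Then configure the environment variables: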
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export CUDA_HOME=/usr/local/cuda
Without these settings, building with cmake fails with an error that CMAKE_CUDA_COMPILER cannot be found.
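The exports above only affect the current shell, and running them repeatedly adds duplicate entries. A minimal sketch of an idempotent variant, which could go in a startup file such as ~/.bashrc, is shown here. It assumes /usr/local/cuda is the install prefix, as in the commands above; prepend_path is a hypothetical helper introduced for illustration and is not part of CUDA.

```shell
# prepend_path DIR LIST: print LIST with DIR placed in front,
# unless DIR is already one of its colon-separated entries.
# (Hypothetical helper, not part of CUDA.)
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# Assumed install prefix, matching the exports above.
export CUDA_HOME=/usr/local/cuda
PATH=$(prepend_path "$CUDA_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_path "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

Sourcing this more than once leaves PATH and LD_LIBRARY_PATH unchanged after the first time.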
Verify that the installation succeeded:
nvcc -V
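To use this check in a script, the release number can be pulled out of the nvcc output. The sketch below assumes nvcc -V prints a line containing "release X.Y"; cuda_release is a hypothetical helper name used only for this example.

```shell
# cuda_release: read `nvcc -V` output on stdin and print the release number.
# (Hypothetical helper for illustration.)
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA release: $(nvcc -V | cuda_release)"
else
    # Report the problem without aborting the calling shell.
    echo "nvcc not found; check that /usr/local/cuda/bin is on PATH" >&2
fi
```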
2. Download cuDNN
Download page: https://developer.nvidia.com/rdp/cudnn-download
This guide uses version 8.1.1:
tar zxvf cudnn-11.2-linux-aarch64sbsa-v8.1.1.33.tgz
sudo cp cuda/include/* /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
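When finished, check the installed version with the following command. In cuDNN 8.x the version macros are defined in cudnn_version.h rather than cudnn.h:
grep CUDNN_MAJOR -A 2 /usr/local/cuda/include/cudnn_version.h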
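The version macros can also be combined into a single version string. This is a rough sketch that assumes the macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in /usr/local/cuda/include/cudnn_version.h, which is the case for cuDNN 8.x; for older releases, point it at cudnn.h instead. cudnn_version is a hypothetical helper name.

```shell
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL taken from the
# #define lines in HEADER; prints nothing if CUDNN_MAJOR is absent.
# (Hypothetical helper for illustration.)
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }' "$1"
}

# Assumed header location, matching the copy commands above.
header=/usr/local/cuda/include/cudnn_version.h
if [ -r "$header" ]; then
    echo "cuDNN version: $(cudnn_version "$header")"
else
    echo "$header not found; were the cuDNN headers copied?" >&2
fi
```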
3. Download NCCL
https://developer.nvidia.com/nccl
For example, download nccl-<version>.txz, then extract it and copy it into /usr/local:
tar xvf nccl-<version>.txz
sudo cp -r nccl-<version> /usr/local/
sudo cp nccl-<version>/include/* /usr/local/include
Alternatively, install from the RPM repository package:
sudo rpm -i nccl-repo-<version>.rpm
sudo yum update
sudo yum install libnccl libnccl-devel libnccl-static
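With the tarball route, the cp commands put the headers in /usr/local/include but leave the shared library inside the copied directory, so the dynamic linker may not find it. One possible way to expose it is sketched below. NCCL_HOME and its default /usr/local/nccl are placeholders chosen for this example; set NCCL_HOME to the actual directory copied under /usr/local. nccl_lib_dir is a hypothetical helper.

```shell
# nccl_lib_dir PREFIX: print PREFIX/lib if it contains a libnccl file,
# otherwise print nothing. (Hypothetical helper for illustration.)
nccl_lib_dir() {
    if ls "$1"/lib/libnccl* >/dev/null 2>&1; then
        printf '%s\n' "$1/lib"
    fi
}

# Placeholder default: replace with the directory copied under /usr/local.
NCCL_HOME="${NCCL_HOME:-/usr/local/nccl}"
libdir=$(nccl_lib_dir "$NCCL_HOME")
if [ -n "$libdir" ]; then
    export LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
else
    echo "no libnccl under $NCCL_HOME/lib; set NCCL_HOME to the extracted directory" >&2
fi
```

The RPM route installs the library into the system library path, so this step should not be needed there.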