Installing CUDA and cuDNN on Docker with Ubuntu 18.04
In practice this means installing nvidia-docker2 and then pulling an Ubuntu 18.04 image that already has CUDA and cuDNN installed.
Environment: Ubuntu 18.04, NVIDIA driver, build-essential
-
Installing the NVIDIA driver on the host (see the official documentation)
-
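Installing from the executable (.run) file
Choose the driver version that suits your GPU (450.80.02 is used here), then download the installer:
BASE_URL=https://us.download.nvidia.com/tesla
DRIVER_VERSION=450.80.02
curl -fSsL -O $BASE_URL/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run
Wait for the download to finish, then run the installer:
sudo sh NVIDIA-Linux-x86_64-$DRIVER_VERSION.run
-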
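The download URL above is just the base URL, the driver version, and a file name that repeats the version. A minimal sketch of that composition follows; `driver_runfile_url` is a hypothetical helper name introduced here for illustration, not an NVIDIA tool, and the sketch only prints the URL without downloading anything.

```shell
#!/bin/sh
# Base URL taken from the commands in this post.
BASE_URL=https://us.download.nvidia.com/tesla

# driver_runfile_url (hypothetical helper): print the runfile URL
# for the driver version given as the first argument.
driver_runfile_url() {
    version=$1
    printf '%s/%s/NVIDIA-Linux-x86_64-%s.run\n' "$BASE_URL" "$version" "$version"
}

# Print the URL for the version used in this post; nothing is downloaded.
driver_runfile_url 450.80.02
```

Printing the URL first makes it easy to paste it into a browser and confirm that the chosen version actually exists before running curl.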
Installing with apt
Install the Linux kernel headers:
sudo apt-get install linux-headers-$(uname -r)
Make sure that packages from the CUDA network repository take priority over those from the Canonical repository:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g')
sudo wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-$distribution.pin
sudo mv cuda-$distribution.pin /etc/apt/preferences.d/cuda-repository-pin-600
Install the public GPG key:
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/7fa2af80.pub
Add the CUDA network repository:
echo "deb http://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64 /" | sudo tee /etc/apt/sources.list.d/cuda.list
Update the package index and install the driver:
sudo apt-get update
sudo apt-get -y install cuda-drivers
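The `distribution` variable used in the apt steps is the OS ID joined to the version number with the dots removed. The sketch below reproduces that transformation on fixed sample values instead of reading /etc/os-release, so it can be tried on any machine; `cuda_distribution` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# cuda_distribution (hypothetical helper): join ID and VERSION_ID and strip
# the dots, mirroring the sed expression used in the apt steps.
cuda_distribution() {
    printf '%s%s\n' "$1" "$2" | sed -e 's/\.//g'
}

# Sample values matching Ubuntu 18.04 (ID=ubuntu, VERSION_ID=18.04).
distribution=$(cuda_distribution ubuntu 18.04)
echo "$distribution"

# The repository path that the apt steps would then use.
echo "https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64"
```

If the printed path does not open in a browser, the distribution string is most likely wrong for your system, and the later apt-get update will fail for the same reason.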
-
-
Installing CUDA and cuDNN in Docker
-
Setting up NVIDIA Container Toolkit
Install nvidia-docker2, which takes over from the stock Docker setup; Docker itself does not need to be uninstalled.
Nearly every problem at this stage is a network problem. Remove the -s flag from the curl commands to see the cause. In most cases either DNS cannot resolve the host or the connection to NVIDIA is refused. A proxy usually avoids both; without one, try changing your DNS server. If a browser can open https://nvidia.github.io/nvidia-docker, the commands should work as well.
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
To get access to experimental features such as CUDA on WSL or the new MIG capability on A100, you may want to add the experimental branch to the repository listing:
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
Install the nvidia-docker2 package (and dependencies) after updating the package listing:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
When prompted, press y to switch the configuration, so that the old Docker setup is replaced by nvidia-docker2, which can use the GPU.
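For the network problems mentioned earlier in this section, curl's exit code already distinguishes a DNS failure from a refused connection. The sketch below maps a few common exit codes to a short explanation; `explain_curl_exit` is a hypothetical helper written for this post, and it covers only a handful of the codes listed in the curl manual.

```shell
#!/bin/sh
# explain_curl_exit (hypothetical helper): translate a curl exit code
# into a short hint. Only a few common codes are covered.
explain_curl_exit() {
    case $1 in
        0)  echo "ok" ;;
        6)  echo "could not resolve host (DNS problem)" ;;
        7)  echo "failed to connect to host" ;;
        22) echo "HTTP error returned by the server (reported with -f)" ;;
        28) echo "operation timed out" ;;
        35) echo "SSL/TLS handshake failed" ;;
        *)  echo "other curl error, exit code $1" ;;
    esac
}

# Intended use, commented out so the sketch makes no network request:
#   curl -fL -o /dev/null https://nvidia.github.io/nvidia-docker/gpgkey
#   explain_curl_exit $?

# Demonstration with the exit code curl uses for a DNS failure.
explain_curl_exit 6
```

A result of 6 points toward changing the DNS server, while 7 or 28 suggest that a proxy is more likely to help.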
Restart the Docker daemon to complete the installation after setting the default runtime:
sudo systemctl restart docker
Go to Docker Hub and pull the image with the CUDA and cuDNN versions you need. For example, my environment requires CUDA 10.0, cuDNN 7, and Ubuntu 18.04:
docker pull nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
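The image tag in the pull command follows a visible pattern: CUDA version, cuDNN major version, image flavor, and Ubuntu release. A minimal sketch of that pattern follows; `cuda_image_tag` is a hypothetical helper, and not every combination it can produce is published, so the result should still be checked against the tag list on Docker Hub.

```shell
#!/bin/sh
# cuda_image_tag (hypothetical helper): assemble an nvidia/cuda image name
# from CUDA version, cuDNN major version, flavor, and Ubuntu release.
cuda_image_tag() {
    cuda=$1
    cudnn=$2
    flavor=$3     # e.g. runtime or devel
    ubuntu=$4
    printf 'nvidia/cuda:%s-cudnn%s-%s-ubuntu%s\n' "$cuda" "$cudnn" "$flavor" "$ubuntu"
}

# Reproduces the tag used in this post; the image is only named, not pulled.
cuda_image_tag 10.0 7 devel 18.04
```

The printed name can be passed straight to docker pull once you have confirmed that the tag exists.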
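Once inside the container, run nvidia-smi; if the output resembles the table shown further down, the setup works.
If you do not know which version you need and only want to test the setup, simply run the command below.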
At this point, a working setup can be tested by running a base CUDA container:
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
This should result in a console output shown below:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
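To compare the reported versions with what an image expects, the header line of the nvidia-smi output can be parsed with sed. The sketch below does this on a sample line copied from the output above, so it runs without a GPU; `smi_field` is a hypothetical helper name introduced for illustration.

```shell
#!/bin/sh
# smi_field (hypothetical helper): read nvidia-smi output on stdin and print
# the version number that follows the given label, e.g. "CUDA Version".
smi_field() {
    sed -n "s/.*$1: *\([0-9.][0-9.]*\).*/\1/p" | head -n 1
}

# Sample header line taken from the output shown in this post.
sample='| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |'

printf '%s\n' "$sample" | smi_field "Driver Version"   # prints 450.51.06
printf '%s\n' "$sample" | smi_field "CUDA Version"     # prints 11.0
```

On a machine with a working driver, piping the real command into the helper (nvidia-smi | smi_field "CUDA Version") should print the highest CUDA version that the installed driver supports.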
-
