nvidia-smi Failed to initialize NVML: Driver/library version mismatch
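Cause: the version of the NVIDIA kernel module currently loaded does not match the version of the user-space driver libraries installed on the system.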
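One way to see the mismatch is to compare the version of the module loaded in the kernel with the version of the module file installed on disk. This is a minimal sketch: compare_versions is a hypothetical helper, and it assumes the on-disk module was installed together with the user-space libraries, so its version stands in for theirs.

```shell
# Report whether two driver version strings agree.
# compare_versions is a hypothetical helper, not part of any NVIDIA tool.
compare_versions() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "unknown"
    elif [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Version of the nvidia module currently loaded in the kernel
# (empty if the module is not loaded).
loaded=$(cat /sys/module/nvidia/version 2>/dev/null || true)

# Version of the nvidia module file installed on disk (assumption: it was
# installed together with the user-space libraries).
installed=$(modinfo -F version nvidia 2>/dev/null || true)

echo "loaded=${loaded:-none} installed=${installed:-none}: $(compare_versions "$loaded" "$installed")"
```

If this reports a mismatch, the old module is still resident in the kernel and has to be unloaded (or the machine rebooted) before the new one can take over.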
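The fix is to unload the stale module so that the correct one can load, but removing nvidia directly fails:
# sudo rmmod nvidia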
rmmod: ERROR: Module nvidia is in use by: nvidia_modeset nvidia_uvm
First, work out the current dependencies between the kernel modules. The error message already shows that nvidia_modeset and nvidia_uvm depend on nvidia, so they have to be unloaded first.
# lsmod | grep nvidia
nvidia_uvm 769582 0
nvidia_drm 43547 2
nvidia_modeset 1053327 1 nvidia_drm
nvidia 15764359 2 nvidia_modeset,nvidia_uvm
drm_kms_helper 186531 1 nvidia_drm
drm 456166 5 drm_kms_helper,nvidia_drm
ipmi_msghandler 56728 4 ipmi_ssif,ipmi_devintf,nvidia,ipmi_si
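The fourth column of the lsmod output lists the modules that use each module. As a rough sketch, the holders of a given module can be pulled out with awk; holders_of is a hypothetical helper name, and it assumes the usual lsmod layout of module, size, use count, and a comma-separated "used by" list.

```shell
# Print the modules that depend on the given module, one per line,
# reading lsmod-style text from stdin.
# holders_of is a hypothetical helper for illustration.
holders_of() {
    awk -v mod="$1" '
        $1 == mod && NF >= 4 {
            n = split($4, h, ",")          # "a,b" -> h[1]="a", h[2]="b"
            for (i = 1; i <= n; i++) print h[i]
        }'
}

# Usage (not run here):
#   lsmod | holders_of nvidia
```

With the output shown above, this would list nvidia_modeset and nvidia_uvm for nvidia, and nvidia_drm for nvidia_modeset, which is why nvidia_drm has to go first.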
sudo lsof -n -w /dev/nvidia*
Take note of the processes this lists; if unloading fails later, remember to stop them.
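lsof prints one line per open file, so a process that holds several NVIDIA device nodes shows up several times. The sketch below condenses that output to one line per process. summarize_lsof is a hypothetical helper, and it assumes lsof's default column layout, where COMMAND is the first column and PID the second.

```shell
# Reduce default lsof output on stdin to unique "PID COMMAND" lines.
# summarize_lsof is a hypothetical helper for illustration.
summarize_lsof() {
    # NR > 1 skips the header line; $2 is the PID, $1 the command name.
    awk 'NR > 1 { print $2, $1 }' | sort -u
}

# Usage (not run here):
#   sudo lsof -n -w /dev/nvidia* | summarize_lsof
```

The resulting PIDs are the ones to stop if rmmod later reports that a module is in use.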
If a module is still reported as in use, run sudo lsof /dev/nvidia* again, stop the listed processes, and then unload the modules:
sudo rmmod nvidia_drm
rmmod: ERROR: Module nvidia_drm is in use
sudo rmmod nvidia_modeset
rmmod: ERROR: Module nvidia_modeset is in use by: nvidia_drm
sudo rmmod nvidia_uvm
sudo rmmod nvidia
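The unload sequence can be scripted so that only modules that are actually loaded are removed, dependents first and nvidia itself last. This is a sketch under the assumption that the four module names seen in this note are the only NVIDIA modules present; nvidia_unload_order is a hypothetical helper.

```shell
# Print, in a safe unload order, the NVIDIA modules that appear in the
# lsmod-style text on stdin. Dependents come before the modules they use.
# nvidia_unload_order is a hypothetical helper for illustration.
nvidia_unload_order() {
    loaded=$(awk '{ print $1 }')            # module names from stdin
    for m in nvidia_drm nvidia_modeset nvidia_uvm nvidia; do
        # -x matches the whole line, so "nvidia" does not match "nvidia_uvm".
        if printf '%s\n' "$loaded" | grep -qx "$m"; then
            printf '%s\n' "$m"
        fi
    done
}

# Usage (not run here; stops at the first module that refuses to unload):
#   lsmod | nvidia_unload_order | while read -r m; do
#       sudo rmmod "$m" || break
#   done
```

Stopping at the first failure keeps the loop from trying to remove nvidia while one of its dependents is still loaded.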
Confirm that the modules were unloaded:
lsmod | grep nvidia
This should print nothing. Then run nvidia-smi to confirm that the correct driver loads.