NVIDIA SMI error after Ubuntu 20.04 restart [How to Solve]


Contents

Problem description
Problem analysis and solution
  1 Analysis
  2 Solution


Problem description

After rebooting Ubuntu 20.04, the screen resolution was wrong, and querying the graphics card with nvidia-smi produced the following error:

> nvidia-smi
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver

Before the restart, nothing had been done to the graphics driver; the only change was a system update with sudo apt-get upgrade.


Problem analysis and solution

1 Analysis

First, confirm that the graphics driver has not been touched and that the graphics card and related hardware have not been replaced. The only operation before the restart was updating the installed software with sudo apt-get upgrade. The likely cause is therefore that the update installed a new kernel, and the new kernel may not match the installed graphics driver. Use the following commands to view the running kernel and all kernel versions present on the system:

> uname -r
5.4.0-77-generic
> grep menuentry /boot/grub/grub.cfg
if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
  menuentry_id_option=""
export menuentry_id_option
menuentry 'Ubuntu' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-fb3b617b-7620-428a-83de-08de34328e80' {
submenu 'Advanced options for Ubuntu' $menuentry_id_option 'gnulinux-advanced-fb3b617b-7620-428a-83de-08de34328e80' {
	menuentry 'Ubuntu, with Linux 5.4.0-77-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.4.0-77-generic-advanced-fb3b617b-7620-428a-83de-08de34328e80' {
	menuentry 'Ubuntu, with Linux 5.4.0-77-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.4.0-77-generic-recovery-fb3b617b-7620-428a-83de-08de34328e80' {
	menuentry 'Ubuntu, with Linux 5.4.0-74-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.4.0-74-generic-advanced-fb3b617b-7620-428a-83de-08de34328e80' {
	menuentry 'Ubuntu, with Linux 5.4.0-74-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.4.0-74-generic-recovery-fb3b617b-7620-428a-83de-08de34328e80' {
menuentry 'Windows Boot Manager (on /dev/sda3)' --class windows --class os $menuentry_id_option 'osprober-efi-14CC-72D3' {
menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
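
The raw grep output is noisy, so it can help to reduce it to the kernel versions alone. The following is a minimal sketch, not part of the original procedure: list_grub_kernels is a hypothetical helper name, and it assumes the menu entry titles follow the "with Linux <version>" pattern seen above.

```shell
# Print the distinct kernel versions named in GRUB menu entries.
# Usage: list_grub_kernels [path-to-grub.cfg]
# (hypothetical helper; assumes titles of the form "... with Linux <version>")
list_grub_kernels() {
    cfg="${1:-/boot/grub/grub.cfg}"
    # Keep only the "with Linux <version>" part of each title, print the
    # version field, then de-duplicate: the normal and recovery entries
    # repeat the same version. sort -V orders versions numerically (GNU sort).
    grep -oE "with Linux [^' ]+" "$cfg" | awk '{print $3}' | sort -V -u
}
```

Calling list_grub_kernels with no argument reads /boot/grub/grub.cfg. For the configuration shown above it should print only 5.4.0-74-generic and 5.4.0-77-generic.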

The menu entries show that, besides the running kernel, an older kernel is also installed: 5.4.0-74-generic.
To verify that the problem is related to the kernel update, restart the system, select "Advanced options for Ubuntu" in the GRUB menu, and boot the older kernel. After that restart, the screen resolution was back to normal and nvidia-smi reported the graphics card and driver information again.
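
The same suspicion can be checked without rebooting by looking for an NVIDIA module file under each kernel's module directory. This is a rough sketch under stated assumptions: has_nvidia_module is a hypothetical helper, and it assumes the driver's module is a file named nvidia.ko (possibly with a compression suffix) somewhere under /lib/modules/<version>.

```shell
# Succeed if an NVIDIA kernel module file exists for the given kernel version.
# Usage: has_nvidia_module <kernel-version> [modules-root]
# (hypothetical helper; assumes the module file is named nvidia.ko*)
has_nvidia_module() {
    kver="$1"
    root="${2:-/lib/modules}"
    # Search the whole per-kernel tree, since the module's subdirectory
    # depends on how the driver was installed. grep -q . succeeds only
    # if find printed at least one path.
    find "$root/$kver" -type f -name 'nvidia.ko*' 2>/dev/null | grep -q .
}
```

For example, `has_nvidia_module "$(uname -r)" || echo "no NVIDIA module for the running kernel"` prints the message when the running kernel has no such file. A missing file for 5.4.0-77-generic together with a present one for 5.4.0-74-generic would be consistent with the analysis above.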

2 Solution

With the cause identified, there are two main solutions:

- Uninstall the old graphics driver and reinstall it against the new kernel.
- Keep using kernel 5.4.0-74-generic, the version in use when the graphics driver was installed, and uninstall the newly installed kernel 5.4.0-77-generic. Then disable kernel updates so that the current kernel is kept.

> # sudo vim /etc/default/grub
# View the currently installed kernels
> dpkg --get-selections | grep linux-image
linux-image-5.4.0-74-generic install
linux-image-5.4.0-77-generic install
linux-image-generic install
# Uninstall the newer kernel (do this while booted into 5.4.0-74-generic)
> sudo apt-get remove linux-image-5.4.0-77-generic
# Hold the current kernel packages so that they are not upgraded
> sudo apt-mark hold linux-image-5.4.0-74-generic
> sudo apt-mark hold linux-headers-5.4.0-74-generic
> sudo apt-mark hold linux-modules-extra-5.4.0-74-generic
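
The commands above cover the second option. For the first option, one possible approach is to rebuild the driver module for the new kernel instead of doing a full uninstall and reinstall. This differs from the article's wording and only applies if the driver was installed as a DKMS package; a driver installed another way (for example with NVIDIA's .run installer) would need to be reinstalled instead. The helper name print_rebuild_plan is hypothetical, and it only prints the commands so they can be reviewed before running them by hand.

```shell
# Print (do not run) the commands that rebuild DKMS-managed modules for a kernel.
# Usage: print_rebuild_plan [kernel-version]   (defaults to the running kernel)
# (hypothetical helper; assumes the NVIDIA driver is registered with DKMS)
print_rebuild_plan() {
    kver="${1:-$(uname -r)}"
    # Kernel headers are required to compile a module for that kernel.
    echo "sudo apt-get install linux-headers-$kver"
    # Build and install every DKMS-registered module for that kernel.
    echo "sudo dkms autoinstall --kernelver $kver"
    # Reboot so that the newly built module gets loaded.
    echo "sudo reboot"
}
```

Running `print_rebuild_plan 5.4.0-77-generic` lists the three commands for the new kernel. Checking `dkms status` first shows whether an nvidia module is registered with DKMS at all.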
