PyTorch fails to run on the specified GPU: solution

Recently I was running PyTorch training code on an 8-GPU server without any problems. However, after CUDA was reinstalled, it became impossible to specify which GPU the code should run on: the program always used the GPUs in order starting from GPU 0. After looking into it, the problem was solved as described below.

1. To specify which GPUs a Python program should run on, the usual approach is:

import os
import torch

os.environ["CUDA_VISIBLE_DEVICES"] = "4,5,6,7"

Or set the variable directly from the command line before launching the program (not recommended):

export CUDA_VISIBLE_DEVICES=4,5,6,7
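
Once the variable takes effect, the process only sees the listed GPUs, and PyTorch renumbers them starting from 0: cuda:0 inside the program corresponds to physical GPU 4, cuda:1 to GPU 5, and so on. A minimal sketch of using the remapped index (the tensor shape here is just for illustration):

import torch

device = torch.device("cuda:0")  # with CUDA_VISIBLE_DEVICES=4,5,6,7, this is physical GPU 4
x = torch.randn(2, 3, device=device)  # tensor is allocated on physical GPU 4
print(x.device)  # prints cuda:0, i.e. the renumbered index inside the process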

2. Written as above, the setting suddenly stopped working: no matter how the visible GPU numbers were changed, the program still used the GPUs in order starting from GPU 0. The problem turned out to be the position of the line that selects the GPUs. The os.environ["CUDA_VISIBLE_DEVICES"] = "4,5,6,7" statement has to come before import torch (and any other imports that may initialize CUDA), i.e. right after import os, like this:

import os

os.environ["CUDA_VISIBLE_DEVICES"] = "4,5,6,7"

import torch
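
A likely explanation is that the CUDA runtime only reads CUDA_VISIBLE_DEVICES when it is initialized, and in some environments that initialization is triggered as a side effect of import torch, so setting the variable afterwards comes too late. A quick sanity check (assuming the same four-GPU mask as above):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "4,5,6,7"

import torch

print(torch.cuda.device_count())  # should print 4 if the mask took effect
model = torch.nn.Linear(10, 10).to("cuda:0")  # ends up on physical GPU 4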

3. Finally, a few common commands for checking GPU information, kept here for future reference:

import torch

torch.cuda.is_available()  # Check if cuda is available

torch.cuda.device_count() # Returns the number of GPUs

torch.cuda.get_device_name(0) # Returns the name of the GPU at index 0 (device indices start from 0)

torch.cuda.current_device() # Returns the current device index
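
These calls can be combined into a small helper that lists every GPU visible to the current process; the output format below is just an example:

import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)  # name, total memory, etc.
        print(f"cuda:{i}: {props.name}, {props.total_memory / 1024**3:.1f} GB")
else:
    print("CUDA is not available")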
