When we upload our code to the server to run, we encounter the following problems:
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=50 error=100 : no CUDA-capable device is detected
Traceback (most recent call last):
File "HyperAttentionDTI_main.py", line 185, in <module>
model = AttentionDTI(hp).cuda()
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 304, in cuda
return self._apply(lambda t: t.cuda(device))
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 201, in _apply
module._apply(fn)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 223, in _apply
param_applied = fn(param)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 304, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/usr/local/lib/python3.7/site-packages/torch/cuda/__init__.py", line 197, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:50
It’s because our graphics card settings are wrong, because we know whether we have a graphics card or not, otherwise we won’t report the wrong graphics card problem
Solution:
Let’s look at the program code we executed and check the CUDA part. I set my graphics card to 6. We don’t have so many graphics cards to use. I checked the location of my graphics card, which is No. 0. Therefore, we set the first line of the following code to 0!
os.environ["CUDA_VISIBLE_DEVICES"] = "6"
if __name__ == "__main__":
"""select seed"""
SEED = 1234
random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)
# torch.backends.cudnn.deterministic = True