pg = ProcessGroupNCCL(prefix_store, rank, world_size, pg_options)
RuntimeError: ProcessGroupNCCL is only supported with GPUs, no GPUs found!
At first, this error made me wonder whether the GPU itself was broken, but my labmates were sure the GPU was fine. So I started troubleshooting.
Checking from the Python command line finally gave the problem away: the issue appears to be with PyTorch, not with the hardware.
>>> import torch
>>> print(torch.cuda.is_available())
/home/xutianjiao/anaconda3/envs/py36/lib/python3.6/site-packages/torch/cuda/__init__.py:80: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 9020). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:112.)
  return torch._C._cuda_getDeviceCount() > 0
False
>>> print(torch.cuda.get_device_name(0))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xutianjiao/anaconda3/envs/py36/lib/python3.6/site-packages/torch/cuda/__init__.py", line 326, in get_device_name
    return get_device_properties(device).name
  File "/home/xutianjiao/anaconda3/envs/py36/lib/python3.6/site-packages/torch/cuda/__init__.py", line 356, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/home/xutianjiao/anaconda3/envs/py36/lib/python3.6/site-packages/torch/cuda/__init__.py", line 214, in _lazy_init
    torch._C._cuda_init()
RuntimeError: The NVIDIA driver on your system is too old (found version 9020). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
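Looking up this error shows that the NVIDIA driver and the installed PyTorch build do not match: the driver (reported as version 9020, i.e. CUDA 9.2) is older than the CUDA version that this PyTorch build was compiled against.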
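To make the number in the message easier to read, here is a minimal sketch of how the driver version code can be decoded and compared with the CUDA version a PyTorch build targets. It assumes the usual CUDA encoding of 1000 * major + 10 * minor. The helper names `decode_cuda_version` and `driver_supports` are hypothetical, the "10.2" build version is only an example value, and the simple "driver must be at least as new as the build" rule is a simplification.

```python
def decode_cuda_version(code):
    """Decode a CUDA version code such as 9020 into (major, minor).

    Assumes the usual CUDA encoding: 1000 * major + 10 * minor.
    """
    major = code // 1000
    minor = (code % 1000) // 10
    return major, minor


def driver_supports(driver_code, build_cuda):
    """Rough check: is the driver's CUDA version at least the build's?

    build_cuda is a string such as "10.2" (what torch.version.cuda reports).
    This is a simplification and ignores NVIDIA's finer compatibility rules.
    """
    build = tuple(int(part) for part in build_cuda.split(".")[:2])
    return decode_cuda_version(driver_code) >= build


# 9020 is the value from the error message above.
print(decode_cuda_version(9020))      # (9, 2)
# "10.2" is an assumed example build version, not taken from the log.
print(driver_supports(9020, "10.2"))  # False
```

On a machine where torch imports, the real build version can be read from `torch.version.cuda` and passed in as the second argument.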
Checking the installed PyTorch version shows 1.10+. So the fix is to try installing an older version of torch:
pip install torch==1.7.0
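After reinstalling, it is worth repeating the earlier check to confirm that the GPU is now visible. The following is a small sketch of such a check. The function name `cuda_status` is hypothetical, and it falls back to placeholder values when torch is not installed in the current environment.

```python
import importlib.util


def cuda_status():
    """Return a small dict describing the installed PyTorch build, if any."""
    if importlib.util.find_spec("torch") is None:
        # torch is not installed in this environment
        return {"torch": None, "built_for_cuda": None, "cuda_available": False}
    import torch
    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(cuda_status())
```

If `cuda_available` is still False, the driver is probably still older than the CUDA version this build expects, and either a different build or a driver update would be the next thing to try.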