Solution to RuntimeError ("Distributed package doesn't have NCCL built in") on Windows

The following error is raised during training:
  File "C:\Users\urser\anaconda3\lib\site-packages\torch\distributed\distributed_c10d.py", line 597, in _new_process_group_helper
    raise RuntimeError("Distributed package doesn't have NCCL "
RuntimeError: Distributed package doesn't have NCCL built in
The message is self-explanatory: this PyTorch build does not include NCCL, and Windows does not support the NCCL backend.
The official documentation confirms this:
"As of PyTorch v1.8, Windows supports all collective communications backend but NCCL." The same passage goes on to describe the schema required when the init_method argument of init_process_group() points to a file, which is unrelated to this error.
The fix is equally simple: do not use the NCCL backend.
Changing one line is enough: pass a backend that Windows supports, such as "gloo", instead of "nccl" as the backend argument of init_process_group().
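A minimal sketch of that change, assuming the training script calls torch.distributed.init_process_group() itself. The helper names choose_backend and init_distributed are hypothetical, and the init_method address, rank and world size are placeholder values for a single-process run; adjust them to match your own launcher.

```python
import sys


def choose_backend(platform: str = sys.platform, nccl_available: bool = False) -> str:
    # Hypothetical helper: NCCL is not supported on Windows, so fall back to gloo there
    # (and anywhere else NCCL is missing from the PyTorch build).
    if platform.startswith("win") or not nccl_available:
        return "gloo"
    return "nccl"


def init_distributed(rank: int = 0, world_size: int = 1) -> str:
    # Imported here so choose_backend() can be used without PyTorch installed.
    import torch.distributed as dist

    backend = choose_backend(nccl_available=dist.is_nccl_available())
    # The actual fix: on Windows the backend argument is "gloo" instead of "nccl".
    dist.init_process_group(
        backend=backend,
        init_method="tcp://127.0.0.1:29500",  # placeholder address and port
        rank=rank,
        world_size=world_size,
    )
    return backend
```

If the script hard-codes the backend, the equivalent one-line change is replacing backend="nccl" with backend="gloo" in the existing init_process_group() call. If the backend comes from a third-party training framework or a config file, the same substitution applies wherever "nccl" is set.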

