The following error is reported when testing PyTorch multi-GPU training:
store = TCPStore(master_addr, master_port, world_size, start_daemon, timeout)
RuntimeError: Address already in use
After investigation, it turned out that another task using DDP was already running and occupying the same master port.
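Solution:
Manually specify an idle port. It must be a valid port number (no greater than 65535); 29501 and your_script.py below are only examples:
python -m torch.distributed.launch --master_port 29501 your_script.py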
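If you would rather not guess an idle port, one option is to let the operating system pick one. The sketch below is only an illustration using the Python standard library. The helper name find_free_port is hypothetical and is not part of PyTorch.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a TCP port that is unused right now (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Binding to port 0 tells the OS to choose any idle port.
        s.bind(("", 0))
        # getsockname() returns (host, port); the port is the one the OS chose.
        return s.getsockname()[1]


if __name__ == "__main__":
    # Print the port so it can be passed to --master_port.
    print(find_free_port())
```

The printed number can then be passed to --master_port. Note that the port is released as soon as the helper returns, so another process could in principle grab it before the launcher starts; on a busy machine, retry with a new port if the error reappears.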
To view port occupancy, run the following in a terminal:
netstat -nultp
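The same check can be approximated from Python before launching a job. This is a minimal sketch, assuming that a failed bind means the port is taken. The function name port_in_use is hypothetical. Binding a port below 1024 without privileges also fails, so such ports are reported as in use.

```python
import socket


def port_in_use(port: int, host: str = "") -> bool:
    """Return True if binding to (host, port) fails (hypothetical helper).

    An empty host means "all interfaces".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "Address already in use"; also raised for
            # privileged ports when running without permission.
            return True
        return False


if __name__ == "__main__":
    # 29500 is the default master port of torch.distributed.launch.
    print(port_in_use(29500))
```

If the function returns True for the port you planned to use, choose a different value for --master_port.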