Question
Training on a single GPU is very fast. During evaluation, however, the run hangs after one batch and no error is reported.
Attempted fixes (the run still hung):
1. Set pin_memory of valid_loader to False. When it is True, the DataLoader copies each batch into pinned (page-locked) host memory, which speeds up the transfer to the GPU.
2. Set num_workers to 1. Some people say that too many workers can lead to a deadlock between the worker processes, so reducing the count may or may not help.
Final Solution:
valid_loader:
pin_memory=True # this is very important. Advice on the Internet says that switching to False might solve the problem. In my experiment that did not work, and the run went back to normal after changing it back to True.
num_workers=4
batch_size=8
train_loader:
pin_memory=True
num_workers=4
batch_size=8
These parameters are the same as for valid_loader.
In summary, keep pin_memory of valid_loader set to True. The reason is straightforward: batches are placed in pinned memory automatically, which speeds up the transfer to the GPU and therefore the inference process. Then reduce num_workers and batch_size for both valid_loader and train_loader. pin_memory of train_loader is also kept True.
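As a rough sketch, the settings above could be passed to both loaders like this. The TensorDataset placeholders, the build_loaders helper, and the shuffle flags are assumptions for illustration only; the post does not show the actual datasets or any other DataLoader arguments.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Settings reported in the post; the same values are used for both loaders.
LOADER_KWARGS = dict(batch_size=8, num_workers=4, pin_memory=True)


def build_loaders(train_dataset, valid_dataset):
    """Hypothetical helper: build both loaders with the shared settings."""
    # shuffle is an assumption; the post does not mention it.
    train_loader = DataLoader(train_dataset, shuffle=True, **LOADER_KWARGS)
    valid_loader = DataLoader(valid_dataset, shuffle=False, **LOADER_KWARGS)
    return train_loader, valid_loader


# Placeholder data (hypothetical): random features with binary labels.
train_dataset = TensorDataset(torch.randn(32, 3), torch.randint(0, 2, (32,)))
valid_dataset = TensorDataset(torch.randn(16, 3), torch.randint(0, 2, (16,)))
train_loader, valid_loader = build_loaders(train_dataset, valid_dataset)
```

Building the loaders does not start any worker processes; those are launched only when a loader is iterated. On platforms that start workers with spawn (Windows, for example), the iteration code should sit under an `if __name__ == "__main__":` guard.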