The error is as follows:
Traceback (most recent call last):
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/tqdm/std.py", line 1178, in __iter__
for obj in iterable:
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 75, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 75, in <dictcomp>
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 65, in default_collate
return default_collate([torch.as_tensor(b) for b in batch])
File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 8 and 16 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689
The __getitem__ function does return the data, so the problem lies in torch.utils.data.DataLoader.
Analysis:
In fact, the traceback contains two errors. The first one is:
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 8 and 16 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689
This message reports that the tensor sizes are inconsistent. It points to File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate: return torch.stack(batch, 0, out=out)
The source code at that location:
if isinstance(elem, torch.Tensor):
    out = None
    if torch.utils.data.get_worker_info() is not None:
        # If we're in a background process, concatenate directly into a
        # shared memory tensor to avoid an extra copy
        numel = sum([x.numel() for x in batch])
        storage = elem.storage()._new_shared(numel)
        out = elem.new(storage)
    return torch.stack(batch, 0, out=out)
As the source shows, the DataLoader merges the samples at the end with torch.stack. When a batch size is set, this is the step that combines the individual samples into one batch. If the samples do not all have the same shape, torch.stack raises an error.
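The merge step can be reproduced outside the DataLoader. The following is a minimal sketch, assuming two made-up 1-D tensors of length 8 and 16 (not the data from the original project); it only shows that torch.stack rejects tensors whose shapes differ and accepts tensors whose shapes match.

```python
import torch

# Two hypothetical samples whose lengths differ (8 vs. 16).
a = torch.zeros(8)
b = torch.zeros(16)

try:
    # This is the same call that default_collate makes on a batch.
    torch.stack([a, b], 0)
    stack_failed = False
except RuntimeError as err:
    stack_failed = True
    print("stack failed:", err)

# Samples with identical shapes stack into one batch tensor.
same = torch.stack([torch.zeros(8), torch.zeros(8)], 0)
print(same.shape)
```

The exact wording of the error message depends on the PyTorch version, so it may not match the traceback above word for word.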
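The other error appears because multi-process loading is enabled (num_workers != 0), and it reports which worker ran into the problem. DataLoader workers are separate processes, not threads. Because the samples in the batch have different sizes, the merge fails inside the first worker (worker process 0), and the DataLoader re-raises the exception in the main process as "RuntimeError: Caught RuntimeError in DataLoader worker process 0."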
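To confirm that the worker message is only a wrapper around the real error, one option is to load the data in the main process. The sketch below assumes a hypothetical MismatchedDataset that returns tensors of length 8 and 16; with num_workers=0 the same size error is raised directly, without the "worker process" prefix.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class MismatchedDataset(Dataset):
    """Hypothetical dataset whose two samples have different lengths."""

    def __len__(self):
        return 2

    def __getitem__(self, index):
        # Sample 0 has length 8, sample 1 has length 16.
        return torch.zeros(8 * (index + 1))


# num_workers=0 keeps loading in the main process.
loader = DataLoader(MismatchedDataset(), batch_size=2, num_workers=0)

try:
    next(iter(loader))
    message = None
except RuntimeError as err:
    message = str(err)
    print(message)
```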
Solution:
Since the sizes are inconsistent, the fix is to make every sample the same size. You can allocate a sufficiently large array or tensor in advance, fill in the real data, and mark the unfilled part. When you read the data later, use the mark to determine which entries are valid.
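One way this could look is sketched below. It is an illustration under assumptions, not the original project's code: MAX_LEN, PaddedDataset, and the "data" and "mask" keys are hypothetical names, and the samples are made-up 1-D tensors. Each sample is copied into a fixed-size tensor in __getitem__, and a boolean mask marks the filled positions, so the default collate function can stack the batch.

```python
import torch
from torch.utils.data import DataLoader, Dataset

MAX_LEN = 16  # assumed upper bound on the length of any sample


class PaddedDataset(Dataset):
    """Hypothetical dataset that pads each sample to MAX_LEN."""

    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        sample = self.samples[index]
        length = sample.size(0)
        # Pre-allocate a fixed-size tensor and copy the real data in.
        padded = torch.zeros(MAX_LEN, dtype=sample.dtype)
        padded[:length] = sample
        # The mask is True where the data is valid, False where it is padding.
        mask = torch.zeros(MAX_LEN, dtype=torch.bool)
        mask[:length] = True
        return {"data": padded, "mask": mask}


samples = [torch.ones(8), torch.ones(16)]
loader = DataLoader(PaddedDataset(samples), batch_size=2)
batch = next(iter(loader))

# Recover only the valid entries of the first sample using its mask.
valid_first = batch["data"][0][batch["mask"][0]]
print(batch["data"].shape, valid_first.shape)
```

If no fixed upper bound is known in advance, a custom collate_fn that pads each batch to its longest sample is an alternative with the same idea.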