[Solved] PyTorch: Caught RuntimeError in DataLoader worker process 0 and invalid argument 0: Sizes of tensors mus

The error is as follows:

Traceback (most recent call last):
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 75, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 75, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 65, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 8 and 16 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689

The __getitem__ function does return the data, so the problem lies in torch.utils.data.DataLoader.

Analysis:

In fact, the traceback contains two errors.

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 8 and 16 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689

This message reports that the data dimensions are inconsistent. Jump to File "/home/jiang/miniconda3/envs/Net/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate, which runs return torch.stack(batch, 0, out=out). The source at that location is:

if isinstance(elem, torch.Tensor):
    out = None
    if torch.utils.data.get_worker_info() is not None:
        # If we're in a background process, concatenate directly into a
        # shared memory tensor to avoid an extra copy
        numel = sum([x.numel() for x in batch])
        storage = elem.storage()._new_shared(numel)
        out = elem.new(storage)
    return torch.stack(batch, 0, out=out)

This shows that the DataLoader stacks the individual samples into one tensor at the end. When batch_size is set, this is the batch collation step. If the samples do not all have the same shape, torch.stack raises an error.
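The stacking failure can be reproduced without a DataLoader. The following is a minimal sketch, assuming two one-dimensional samples of length 8 and 16 (the sizes named in the error message); the actual sample shapes in the original dataset are not shown in the post.

```python
import torch

# Two samples whose sizes differ (8 vs. 16), chosen to mirror the error message.
a = torch.zeros(8)
b = torch.zeros(16)

try:
    torch.stack([a, b], 0)
    stack_failed = False
except RuntimeError as err:
    # The exact wording of the message depends on the PyTorch version.
    stack_failed = True
    print("stack failed:", err)

# Samples with identical shapes stack into a single batch tensor.
same_shape_batch = torch.stack([torch.zeros(8), torch.zeros(8)], 0)
print(same_shape_batch.shape)
```

This is the same call that default_collate makes on the list of samples returned by __getitem__.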

The other error appears because multi-process loading is enabled (num_workers != 0), and it reports which worker process failed. Because the batch cannot be collated when the shapes differ, the first worker (worker process 0) raises the exception, and the main process re-raises it as "RuntimeError: Caught RuntimeError in DataLoader worker process 0."
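One way to see the underlying error without the worker wrapper is to load a batch with num_workers=0, so collation runs in the main process. This is a sketch only: MismatchedDataset is a hypothetical stand-in for the original dataset, and the dict key "data" is an assumption.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class MismatchedDataset(Dataset):
    """Hypothetical dataset whose samples have different lengths."""

    def __init__(self):
        self.samples = [torch.zeros(8), torch.zeros(16)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Returns a dict, like the dataset in the traceback.
        return {"data": self.samples[index]}


# num_workers=0 keeps loading and collation in the main process.
loader = DataLoader(MismatchedDataset(), batch_size=2, num_workers=0)

try:
    next(iter(loader))
    message = ""
except RuntimeError as err:
    message = str(err)
    print(message)
```

With num_workers=0 only the collation error is raised, which makes it easier to confirm that the shapes are the real cause.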

Solution:

Since the shapes are not uniform, the fix is to make them uniform. You can allocate a sufficiently large array or tensor in advance and mark the unfilled part. When you read the data back, use the mark to determine which entries are valid.
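A minimal sketch of this padding approach follows. It assumes one-dimensional samples and a known upper bound on their length; MAX_LEN, PaddedDataset, and the "data" and "mask" keys are hypothetical names chosen for illustration, not part of the original code.

```python
import torch
from torch.utils.data import Dataset, DataLoader

MAX_LEN = 16  # assumed upper bound; every sample must fit within it


class PaddedDataset(Dataset):
    """Hypothetical dataset returning fixed-size samples plus a validity mask."""

    def __init__(self, sequences):
        self.sequences = sequences

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, index):
        seq = self.sequences[index]
        n = seq.shape[0]
        data = torch.zeros(MAX_LEN, dtype=seq.dtype)   # preallocated buffer
        mask = torch.zeros(MAX_LEN, dtype=torch.bool)  # True marks valid entries
        data[:n] = seq
        mask[:n] = True
        return {"data": data, "mask": mask}


sequences = [
    torch.arange(1, 9, dtype=torch.float32),   # length 8
    torch.arange(1, 17, dtype=torch.float32),  # length 16
]
loader = DataLoader(PaddedDataset(sequences), batch_size=2, num_workers=0)
batch = next(iter(loader))
print(batch["data"].shape)

# Recover only the valid entries of the first sample using its mask.
valid_first = batch["data"][0][batch["mask"][0]]
```

Every sample now has the same shape, so default_collate can stack them, and the mask lets downstream code ignore the padded entries. If the maximum length is not known in advance, padding inside a custom collate_fn is another option.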

