While debugging Python (PyTorch) code, I ran into the following error, or one very similar to it:
CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling cublasSgemm(...)
Searching online turns up all kinds of answers: changing the driver version, pinning the CUDA device number, and so on. Although each of them reportedly worked for someone, they feel unreliable.
This error message looks like a memory access error.
The fix is to check the code carefully and make sure all the data lives on the same device, either all on the CPU or all on the GPU.
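As a rough illustration of what unifying the devices can look like, the sketch below compares the `.device` of several tensors and raises an ordinary Python error before any GPU kernel is launched. It is not from the original post: the helper name `assert_same_device` and the tensors `a` and `b` are hypothetical, and the code falls back to the CPU when no GPU is present.

```python
import torch


def assert_same_device(**tensors: torch.Tensor) -> torch.device:
    """Raise if the named tensors do not all live on one device.

    Hypothetical helper for illustration; returns the shared device.
    """
    devices = {name: t.device for name, t in tensors.items()}
    if len(set(devices.values())) > 1:
        raise RuntimeError(f"tensors on different devices: {devices}")
    return next(iter(devices.values()))


# Pick the GPU when there is one, otherwise stay on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

a = torch.zeros(4, 10, 10, device=device)  # created directly on the device
b = torch.ones(10, 10).to(device)          # created on the CPU, then moved

shared = assert_same_device(a=a, b=b)
print("all tensors on", shared)
```

A check like this fails with a readable message that names the offending tensor, rather than an opaque cuBLAS error later on.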
The inspection process is tedious, so to make it easier I wrote a small function, printTensor.
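def printTensor(t, tag: str):
    # Print the size, device and first few values of tensor t.
    sz = t.size()
    p = t
    # Take element 0 of each leading dimension, leaving a 1-D slice.
    # (The index [0] was lost in the garbled listing and is restored here.)
    for i in range(len(sz) - 1):
        p = p[0]
    # Show at most three values.
    if len(p) > 3:
        p = p[:3]
    # Printing p.data copies the values to the host, which forces pending GPU work to finish.
    print('\t%s.size' % tag, t.size(), ' dev :', t.device, ": ", p.data)
    return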
Calling printTensor(context, 'context') gives output similar to:
context.size torch.Size([4, 10, 10]) dev : cuda:0 : tensor([0, 0, 0], device='cuda:0')
This function does two main things:
- it outputs the device
- it outputs the data
The second point is particularly important. Printing only the device does not necessarily trigger the error. Only when you print the data does PyTorch actually have to run the computation through to that point, and that is when the real error appears.
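Printing the data matters because CUDA work is queued asynchronously, so a failure may only be reported by a later call that waits on the GPU. One possible alternative to printing values is to call `torch.cuda.synchronize()` between suspicious steps, which blocks until the queued work has finished. The sketch below assumes a hypothetical helper, `sync_checkpoint`, which is not part of the original post; on a machine without a GPU it does nothing.

```python
import torch


def sync_checkpoint(tag: str) -> bool:
    """Wait for queued CUDA work so a pending error is raised at this call.

    Hypothetical helper for illustration. Returns True if a synchronization
    was performed and False on a CPU-only machine.
    """
    if not torch.cuda.is_available():
        return False
    # Blocks until all kernels queued on the current device have finished;
    # an error from an earlier asynchronous kernel is raised here.
    torch.cuda.synchronize()
    print(f"checkpoint {tag}: ok")
    return True


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
context = torch.zeros(4, 10, 10, device=device)
weights = torch.ones(10, 10, device=device)

out = torch.matmul(context, weights)  # on a GPU this is only queued
did_sync = sync_checkpoint("after matmul")
```

Running the script with the environment variable `CUDA_LAUNCH_BLOCKING=1` is another documented way to make each CUDA call report its own errors, at the cost of speed.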
In the end, I found that the nn.* network had not been moved with an explicit call to .to(device). The custom models do inherit from nn.Module, however, so this needs further checking.
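To help with that follow-up check, here is a minimal sketch of a helper that lists the devices of everything registered on a module. The names `devices_in_use` and `TinyNet` are hypothetical. The example also shows one known way a layer can be left behind even though the model inherits from `nn.Module`: layers kept in a plain Python list are not registered as submodules, so `.to(device)` does not move them. Whether this was the cause in the original code is an assumption, not something the post confirms.

```python
import torch
from torch import nn


def devices_in_use(module: nn.Module) -> set:
    """Collect the device of every registered parameter and buffer."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {str(t.device) for t in tensors}


class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 4)      # registered: moved by .to(device)
        self.extra = [nn.Linear(4, 2)]  # plain list: NOT registered, not moved


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = TinyNet().to(device)

# Only fc.weight and fc.bias are visible to the module machinery.
print("registered parameters:", len(list(model.parameters())))
print("devices of registered tensors:", devices_in_use(model))
# The layer hidden in the list keeps its original device (the CPU).
print("unregistered layer device:", next(model.extra[0].parameters()).device)
```

Storing such layers in `nn.ModuleList` instead of a plain list registers them, so a single `.to(device)` on the model covers them too.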