Problem
After training for a certain number of iterations, the following error is raised:
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
Possible causes
- Tensor shapes do not match (for example, the inner dimensions of a matrix multiplication)
- Tensors or the model are not on the same device
- The installed PyTorch and CUDA versions are incompatible
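The three causes above can be checked quickly before digging deeper. The following is only a rough sketch: the helper names `check_matmul_operands` and `describe_versions` are made up for illustration, and the shape check assumes both operands have at least two dimensions.

```python
def check_matmul_operands(a, b):
    """Return a list of problems that would break `a @ b`.

    Hypothetical helper: works on anything with `.shape` and `.device`,
    such as torch tensors. Assumes both operands are at least 2-D.
    """
    problems = []
    # For a @ b, the last dim of `a` must equal the second-to-last dim of `b`.
    if a.shape[-1] != b.shape[-2]:
        problems.append(f"shape mismatch: {tuple(a.shape)} @ {tuple(b.shape)}")
    # Both operands must live on the same device (e.g. both on cuda:0).
    if a.device != b.device:
        problems.append(f"device mismatch: {a.device} vs {b.device}")
    return problems


def describe_versions():
    """Report the PyTorch version and the CUDA version it was built against."""
    import torch  # imported here so the shape/device check works without torch

    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
```

Calling `check_matmul_operands(x, weight)` just before the failing layer, and printing `describe_versions()` once at startup, should narrow down which of the three causes applies.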
Solution
Add os.environ['CUDA_VISIBLE_DEVICES'] = '0' at the beginning of the train.py file, and set device='cuda'.
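As a minimal sketch, the top of train.py might look like the following. The environment variable only takes effect if it is set before CUDA is initialized, so the safest place is before importing torch. The `try/except` around the import and the CPU fallback are additions for illustration, not part of the original fix.

```python
import os

# Set this before CUDA is initialized; the safest place is before `import torch`.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

try:
    import torch
except ImportError:  # lets this sketch run on a machine without PyTorch
    torch = None

# With a single visible GPU, the plain "cuda" device refers to that GPU.
# Falling back to "cpu" is an assumption added here for illustration.
if torch is not None and torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"

print(f"CUDA_VISIBLE_DEVICES={os.environ['CUDA_VISIBLE_DEVICES']}, device={device}")
```

The model and every input batch would then be moved with `.to(device)`, so that all tensors end up on the same device.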
One strange observation: if you do not set the visible GPU and instead specify device='cuda:0', the same error is still reported.