Problem: Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
Troubleshooting: When I ran deep learning training on my computer, the program would quit on its own after a few epochs and could not continue training. The code itself is unlikely to be the cause, because the same code runs without problems on Ubuntu. Some people attribute this error to insufficient GPU memory, but the whole network sometimes trained to completion, and when I checked the GPU during training its memory was not fully used.
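One way to repeat the memory check described above is to poll `nvidia-smi` while training runs. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names `parse_memory_line` and `query_gpu_memory` are made up for illustration.

```python
import shutil
import subprocess


def parse_memory_line(line):
    """Parse one 'used, total' CSV line (values in MiB) from nvidia-smi."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def query_gpu_memory():
    """Return [(used_mib, total_mib), ...] per GPU, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
    except (OSError, subprocess.SubprocessError):
        return []
    readings = []
    for line in out.splitlines():
        try:
            readings.append(parse_memory_line(line))
        except ValueError:
            # Skip blank lines or fields the GPU does not report.
            continue
    return readings


if __name__ == "__main__":
    for index, (used, total) in enumerate(query_gpu_memory()):
        print(f"GPU {index}: {used} MiB used of {total} MiB")
```

If the reported usage stays well below the total while the crash still occurs, insufficient GPU memory is a less likely explanation.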
Solution: After looking into it, I found that the cause may be the graphics driver version. My computer's graphics driver was previously version 457, and the problem stopped occurring after I upgraded it to version 471.
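To check which driver version is installed before and after upgrading, the version can be read from `nvidia-smi`. This is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names are hypothetical, and the threshold 471 is only the version that worked in this case, not an official minimum.

```python
import shutil
import subprocess

# Assumption: 471 is the driver version that fixed the problem in this post,
# not a documented requirement.
KNOWN_GOOD_MAJOR = 471


def parse_driver_major(version):
    """Return the major number of a driver version string such as '471.41'."""
    return int(version.strip().split(".")[0])


def query_driver_version():
    """Return the driver version reported for the first GPU, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
    except (OSError, subprocess.SubprocessError):
        return None
    lines = [line.strip() for line in out.splitlines() if line.strip()]
    return lines[0] if lines else None


if __name__ == "__main__":
    version = query_driver_version()
    if version is None:
        print("Could not read the driver version from nvidia-smi.")
    else:
        try:
            older = parse_driver_major(version) < KNOWN_GOOD_MAJOR
        except ValueError:
            older = False
        note = "older than 471, consider upgrading" if older else "no upgrade suggested"
        print(f"Driver version {version}: {note}")
```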