Deep learning model error +1: CUDA error: device-side assert triggered

Scenario:
Some time ago I ran a Fast RCNN model in Google Colab without any problems. Later, when I rented a server through Featurize to run the model, the same code kept failing with “CUDA error: device-side assert triggered”.
This drove me crazy for two days. There are many blog posts about this error online. Most of them say a label is out of bounds, and some point to a problem in the loss function calculation.
In the end I could only debug step by step and track down my own problem.
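CUDA reports errors asynchronously, so the Python traceback often points at a line that has nothing to do with the real failure. The sketch below shows two common first checks: forcing synchronous kernel launches with `CUDA_LAUNCH_BLOCKING`, and testing the "label out of bounds" cause that most posts mention. The helper `check_labels` and the parameter `num_classes` are hypothetical names used for illustration; they are not part of my original code.

```python
import os

# Must be set before CUDA is initialised (safest: before importing torch),
# so kernels launch synchronously and the traceback points at the failing op.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

import torch


def check_labels(labels: torch.Tensor, num_classes: int) -> None:
    """Raise a readable error if any label falls outside [0, num_classes)."""
    if labels.numel() == 0:
        return
    lo, hi = int(labels.min()), int(labels.max())
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"labels span [{lo}, {hi}] but the model expects [0, {num_classes - 1}]"
        )


check_labels(torch.tensor([0, 2, 1]), num_classes=3)  # passes silently
```

Running one batch on the CPU is another option, since the CPU usually raises an ordinary Python exception with a usable message instead of a device-side assert.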

# On the GPU, these lines raise "CUDA error: device-side assert triggered"
perm1 = torch.randperm(positive.numel(), device=positive.device)[:num_pos]
perm2 = torch.randperm(negative.numel(), device=negative.device)[:num_neg]

# After the fix: generate the permutations on the CPU instead
perm1 = torch.randperm(positive.numel(), device="cpu")[:num_pos]
perm2 = torch.randperm(negative.numel(), device="cpu")[:num_neg]
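For context, here is a minimal self-contained sketch of how the CPU permutations might be used to sample from the two index tensors. The wrapper function `sample_indices` and the example values are assumptions for illustration; only `positive`, `negative`, `num_pos` and `num_neg` come from the code above. Moving the permutation to the indexed tensor's device before indexing is my own choice here, to keep both tensors on the same device.

```python
import torch


def sample_indices(positive, negative, num_pos, num_neg):
    """Randomly pick up to num_pos / num_neg entries from two 1-D index tensors."""
    # Generate the permutations on the CPU, as in the modified code above.
    perm1 = torch.randperm(positive.numel(), device="cpu")[:num_pos]
    perm2 = torch.randperm(negative.numel(), device="cpu")[:num_neg]
    # Move each permutation to the device of the tensor it indexes.
    pos_idx = positive[perm1.to(positive.device)]
    neg_idx = negative[perm2.to(negative.device)]
    return pos_idx, neg_idx


# Example values (hypothetical): indices of positive and negative samples.
positive = torch.tensor([3, 7, 9, 12])
negative = torch.tensor([0, 1, 2, 4, 5, 6])
pos_idx, neg_idx = sample_indices(positive, negative, num_pos=2, num_neg=3)
```

If fewer candidates exist than requested, the slice simply returns all of them, so the function never asks for more samples than are available.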

I am recording this here in the hope that it helps others in the same situation.

