Tag Archives: Deep learning

Pytorch CUDA Error: UserWarning: CUDA initialization: CUDA unknown error…

After installing CUDA, PyTorch reports the following warning:

UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start.

Solution: after installing CUDA and PyTorch, add the following to ~/.bashrc:

export PATH=/usr/local/cuda-11.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-11.4

If there is still a problem, install nvidia-modprobe with sudo apt-get install nvidia-modprobe. After the installation, CUDA should be usable.

Checking whether CUDA is available

import torch
flag = torch.cuda.is_available()
print(flag)

If the output is True, CUDA is working normally.
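If the check prints False, it can help to confirm that the exports from .bashrc are actually visible to the Python process. The following is only a minimal sketch: the helper name cuda_env_report is made up for illustration, and the default path assumes the cuda-11.4 install location used above.

```python
import os

def cuda_env_report(env, cuda_root="/usr/local/cuda-11.4"):
    """Report which of the three variables point at cuda_root.

    cuda_root is an assumption taken from the exports above;
    change it if CUDA is installed somewhere else.
    """
    path_dirs = env.get("PATH", "").split(":")
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(":")
    return {
        "PATH": f"{cuda_root}/bin" in path_dirs,
        "LD_LIBRARY_PATH": f"{cuda_root}/lib64" in lib_dirs,
        "CUDA_HOME": env.get("CUDA_HOME", "").rstrip("/") == cuda_root,
    }

if __name__ == "__main__":
    for name, ok in cuda_env_report(os.environ).items():
        print(f"{name}: {'ok' if ok else 'missing or wrong'}")
```

A missing colon before $LD_LIBRARY_PATH shows up here as a False entry, because the CUDA directory gets glued to the next path instead of standing on its own.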

Error when training YOLOX: failure importing apex

This error is reported because apex was not installed successfully. Note: apex is not installed with pip install apex.

ERROR: Command errored out with exit status 1: /usr/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-6o4wusvf/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-6o4wusvf/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record /tmp/pip-record-07hjl8r1/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.7/apex Check the logs for full command output.
Exception information:
Traceback (most recent call last):

Solution: Step 1: use this command to check the version of the CUDA toolkit installed on your machine:

nvcc --version

Step 2: use the following command to view the installed torch version and the CUDA version it was built for (for example, a +cu110 suffix).

pip list

Note: when installing apex, these two versions must be consistent. That is, if the CUDA version on the machine is 11.0, install the torch build made for CUDA 11.0.
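The comparison between the two versions can be sketched in a few lines of Python. The function names here are hypothetical, and the sketch assumes the torch version string carries a +cuXXX suffix (as in 1.7.1+cu110). Some builds do not have that suffix, in which case torch.version.cuda is the place to look instead.

```python
import re

def cuda_tag_from_nvcc(nvcc_output):
    """Extract e.g. 'cu110' from the 'release 11.0' part of `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"cu{match.group(1)}{match.group(2)}" if match else None

def cuda_tag_from_torch(torch_version):
    """Extract e.g. 'cu110' from a version string such as '1.7.1+cu110'."""
    _, _, local = torch_version.partition("+")
    return local if local.startswith("cu") else None

def versions_match(nvcc_output, torch_version):
    """True only when both tags could be read and they are identical."""
    nvcc_tag = cuda_tag_from_nvcc(nvcc_output)
    return nvcc_tag is not None and nvcc_tag == cuda_tag_from_torch(torch_version)

# Illustrative inputs, not output captured from a real machine
sample_nvcc = "Cuda compilation tools, release 11.0, V11.0.194"
print(versions_match(sample_nvcc, "1.7.1+cu110"))
```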

First, use the following command to uninstall the torch currently installed on your machine:

!pip uninstall -y torch torchvision torchaudio

Then install the matching build. For example, the machine here has CUDA version 11.0:

!pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

Next, install apex:

!git clone https://github.com/NVIDIA/apex
%cd apex
!pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

The installation is now complete.

[Solved] RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cubla…

Resolving RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cubla…

I ran into this problem while running an experiment. Some people say it is caused by mismatched dimensions, but after checking, that was not the case here.

Another suggested solution is to add this line of code:

torch.backends.cudnn.enabled = False

I have not tried it, though, because I found that the CUDA device settings in main.py and in the other files were different (there is not much data, so I did not use nn.DataParallel). Once the settings were made consistent, the problem went away.

Therefore, if you encounter this problem, you can check whether each variable and model are on the same CUDA device.
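One way to do that check is to collect the devices of the model's parameters and of the input tensors and see whether more than one shows up. This is only a sketch: devices_in_use is a made-up helper name, and nn.Linear with a random batch stands in for the real model and data.

```python
import torch
import torch.nn as nn

def devices_in_use(model, *tensors):
    """Collect the devices of a model's parameters and buffers and of the given tensors."""
    devices = {p.device for p in model.parameters()}
    devices |= {b.device for b in model.buffers()}
    devices |= {t.device for t in tensors}
    return devices

model = nn.Linear(4, 2)      # stand-in for the real model
batch = torch.randn(3, 4)    # stand-in for the real input
found = devices_in_use(model, batch)
if len(found) > 1:
    print("Mismatch, devices found:", found)
else:
    print("Everything is on one device:", found)
```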

[Solved] RuntimeError: CUDA error: invalid device ordinal

Error Message:
RuntimeError: CUDA error: invalid device ordinal


args.device = torch.device('cuda:' + str(args.gpu_id))

The gpu_id used here exceeds the number of GPU cards available.

My code is written as shown above; other code may do this differently. In the end, the fix is to change the number after cuda: to the index of a GPU that actually exists.
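A small guard can catch a bad index before the device is used. The sketch below uses a hypothetical helper, resolve_device, and falls back to the CPU, which is just one possible choice. Note that when CUDA_VISIBLE_DEVICES is set, the visible GPUs are renumbered starting from 0, so the valid range is based on the visible count.

```python
import torch

def resolve_device(gpu_id):
    """Return cuda:<gpu_id> if that ordinal exists, otherwise fall back to the CPU."""
    count = torch.cuda.device_count()  # 0 when no usable GPU is present
    if 0 <= gpu_id < count:
        return torch.device(f"cuda:{gpu_id}")
    print(f"gpu_id {gpu_id} is not valid for {count} visible GPU(s); using CPU")
    return torch.device("cpu")

device = resolve_device(0)
print(device)
```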

RuntimeError: CUDA error: device-side assert triggered


The error occurs because, when the loss function is calculated in PyTorch, the label tensor has shape (batch, height, width) and every value must be a valid class index. If there are 10 classes, the values must be 0 to 9, that is:
0 <= value <= C-1, where C is the number of channels or classes

Solution

My data has 10 classes but the label values are 1 to 10, so it is enough to subtract 1, as shown below.

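import torch.nn as nn

c_loss = nn.CrossEntropyLoss()
labels_v = labels_v - 1  # shift labels from 1..10 to 0..9
loss0 = c_loss(d0, labels_v.long())  # d0: model output, labels_v: label tensor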


This typically happens because the labels in your training data exceed the number of classes set in the configuration file, or because the validation set contains more classes than the training set.
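Since the device-side assert gives little detail, it can be easier to validate the labels on the CPU before computing the loss. The following is a sketch with a made-up helper, check_labels. It assumes no ignore_index value is used in the labels, since such values would be reported as out of range here.

```python
import torch

def check_labels(labels, num_classes):
    """Raise a readable error if any label is outside [0, num_classes - 1]."""
    lo, hi = int(labels.min()), int(labels.max())
    if lo < 0 or hi > num_classes - 1:
        raise ValueError(
            f"labels span [{lo}, {hi}], expected [0, {num_classes - 1}]"
        )
    return labels

labels_v = torch.tensor([[1, 10], [5, 3]])   # 1-based labels, as in the case above
check_labels(labels_v - 1, num_classes=10)   # passes after shifting to 0..9
```

Calling check_labels(labels_v, num_classes=10) without the shift raises a ValueError that names the offending range.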

[Solved] d2lzh_pytorch import error: ImportError: DLL load failed while importing

Importing d2lzh_pytorch reports an error: ImportError: DLL load failed while importing _torchtext: The specified program could not be found.

Imports:

import torch
import torch.nn as nn
import numpy as np
import pandas as pd
import sys
import d2lzh_pytorch as d2l

The error is as follows:

ImportError: DLL load failed while importing _torchtext: The specified program could not be found.

The solution is as follows:

# Check the torch version
import torch
print(torch.__version__)  # 1.7.0

# Then, from the command line, install the torchtext release that corresponds to your torch version
pip install torchtext==0.6

This resolves the problem.
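To see which versions are installed without importing torchtext (the import itself is what fails), the package metadata can be read instead. This is a minimal sketch using the standard library's importlib.metadata (Python 3.8 or later); installed_versions is a name chosen for illustration.

```python
from importlib import metadata

def installed_versions(packages=("torch", "torchtext")):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

print(installed_versions())
```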

PyTorch error when creating a tensor directly on the GPU [How to Solve]

Creating a tensor directly on the GPU in PyTorch reports an error: legacy constructor expects device type: cpu but device type: cuda was passed

General tensor creation method:


However, by default the tensor is placed on the CPU (in main memory). If we want to train the model on the GPU, the tensor must then be copied to the GPU, which obviously costs extra time.
I had seen other articles saying that tensors can be created directly on the GPU, so I gave it a try:

x = torch.LongTensor(x, device=MyDevice)

An error is reported when running the program:

legacy constructor expects device type: cpu but device type: cuda was passed

Judging from the error message, the problem is that 'cuda' cannot be passed as the device argument. After checking, I found the explanation: torch.LongTensor is a legacy typed constructor tied to the CPU, so it does not accept a CUDA device. Building the tensor with torch.tensor() works without problems.
⭐ Therefore, the code is changed as follows:

MyDevice = torch.device('cuda:0')
x = torch.tensor(x, device=MyDevice)
x = x.long()

Now, there will be no more errors.
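As a variation, torch.tensor() also accepts a dtype argument, so the conversion to long can be folded into the same call. This is a sketch only. The fallback to the CPU and the sample list data are additions for illustration, so that it also runs on a machine without a GPU.

```python
import torch

# Use the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

data = [1, 2, 3]  # illustrative input
x = torch.tensor(data, dtype=torch.long, device=device)
print(x.dtype, x.device)
```

If the data is already a tensor, moving it with x.to(device) is the usual route rather than wrapping it in torch.tensor() again.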

[ONNXRuntimeError] : 10 : INVALID_GRAPH error when loading a model

Project scenario:

The PyTorch model is converted to an ONNX model and exports successfully, but the following error occurs when loading the model with onnxruntime:

InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from T.onnx failed:This is an invalid model. Type Error: Type 'tensor(bool)' of input parameter (8) of operator (ScatterND) in node (ScatterND_15) is invalid.

Problem Description:

import torch
import torch.nn as nn
import onnxruntime
from torch.onnx import export

class Preprocess(nn.Module):
    def __init__(self):
        super().__init__()  # nn.Module must be initialized before use
        self.max = 1000
        self.min = -44

    def forward(self, inputs):
        inputs[inputs>self.max] = self.max
        inputs[inputs<self.min] = self.min
        return inputs
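x = torch.randint(-1024,3071,(1,1,28,28))
model = Preprocess()
# The export step itself succeeds
export(model, x, "test.onnx")

# Loading the exported model is where the error is raised
session = onnxruntime.InferenceSession("test.onnx")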

Cause analysis:

The same problem is reported in PyTorch GitHub issue #34054.


The fix is as follows: first generate a mask, then apply torch.masked_fill(), instead of assigning to the input tensor directly by index.

class MaskHigh(nn.Module):
    def __init__(self, val):
        super().__init__()  # nn.Module must be initialized before use
        self.val = val

    def forward(self, inputs):
        x = inputs.clone()
        mask = x > self.val
        output = torch.masked_fill(inputs, mask, self.val)
        return output

class MaskLow(nn.Module):
    def __init__(self, val):
        super().__init__()  # nn.Module must be initialized before use
        self.val = val

    def forward(self, inputs):
        x = inputs.clone()
        mask = x < self.val
        output = torch.masked_fill(inputs, mask, self.val)
        return output

class Clip(nn.Module):
    def __init__(self):
        super().__init__()  # required before assigning submodules
        self.high = MaskHigh(1300)
        self.low = MaskLow(-44)

    def forward(self, inputs):
        output = self.high(inputs)
        output = self.low(output)
        return output
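Before exporting again, it is worth confirming that the masked_fill version gives the same values as the original index assignment. The sketch below does that with two plain functions whose names are made up for illustration. It only compares results in PyTorch and does not touch the ONNX export itself.

```python
import torch

def clip_by_index(inputs, low, high):
    """Clip with index assignment, as in the original Preprocess module."""
    out = inputs.clone()  # clone so the caller's tensor is left untouched
    out[out > high] = high
    out[out < low] = low
    return out

def clip_by_masked_fill(inputs, low, high):
    """Clip with masked_fill, as in the rewritten modules."""
    out = inputs.masked_fill(inputs > high, high)
    return out.masked_fill(out < low, low)

x = torch.randint(-1024, 3071, (1, 1, 28, 28))
same = torch.equal(clip_by_index(x, -44, 1300), clip_by_masked_fill(x, -44, 1300))
print("Both approaches agree:", same)
```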

Netron can be used to visualize the computation graphs generated by the two approaches, before and after the change.

[Figure: computation graph generated with index assignment]

[Solved] YOLOv4 Error: Layer before convolutional layer must output image.: No error


Recently, while learning YOLOv4 and training it on my own dataset, I ran into this error during training: Layer before convolutional layer must output image.: No error.


1. Solution

Check the customized cfg file, where the size of the input image (width and height) is set.

If both height and width are set to a multiple of 32, this problem will not occur. I set them here to 416 and 416.
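The multiple-of-32 rule is easy to check or apply in a couple of lines. This is just a sketch of the arithmetic with hypothetical helper names. It does not read or modify the cfg file.

```python
def is_valid_network_size(width, height):
    """True when both sides are multiples of 32."""
    return width % 32 == 0 and height % 32 == 0

def nearest_multiple_of_32(size):
    """Round a side length to the nearest multiple of 32, with 32 as the minimum."""
    return max(32, (size + 16) // 32 * 32)

print(is_valid_network_size(416, 416))
print(nearest_multiple_of_32(400))
```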

2. Follow up questions

Take care to set the size to match the pictures in your dataset; otherwise the pictures may fail to open. The error is as follows:

Can’t load image xxxxxxxxxxxxxxxxxx