Author Archives: Robins

[Solved] Memory analysis tool Start Error: An internal error occurred during: “Parsing heap dump from

I. Error when opening the file

II. Reason
If the dump file is larger than the maximum heap size configured for MAT (1024m by default), the above error is reported.

III. Solution
1. Open the MemoryAnalyzer.ini file in the MAT installation directory.

2. The default maximum heap is -Xmx1024m; increase it to a value large enough to open the dump file.

3. After saving the change, restart MAT and reopen the dump file.
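
The edit in step 2 can also be scripted. The following is only a minimal sketch, assuming the ini file contains a line that starts with -Xmx; the helper name set_xmx and the sample ini content are made up for illustration.

```python
import re


def set_xmx(ini_text: str, new_size: str) -> str:
    """Return ini_text with the value of its -Xmx line replaced by new_size."""
    # Match a whole "-Xmx..." token at the start of a line, e.g. "-Xmx1024m".
    pattern = re.compile(r"^-Xmx\S+", flags=re.MULTILINE)
    if not pattern.search(ini_text):
        raise ValueError("no -Xmx line found in the ini text")
    return pattern.sub(f"-Xmx{new_size}", ini_text)


# Hypothetical ini content, used only to show the replacement.
sample = "-startup\nplugins/launcher.jar\n-vmargs\n-Xmx1024m\n"
print(set_xmx(sample, "4g"))
```

To apply it for real, read MemoryAnalyzer.ini, pass its text through set_xmx, and write the result back before restarting MAT.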

[Solved] AttributeError: module 'PIL.Image' has no attribute 'open'

The error AttributeError: module 'PIL.Image' has no attribute 'open' means that the PIL.Image module has no open function. I tried many solutions found online, but none of them worked. Finally, I happened to notice the path of Image.py (c:\users\lenovo\pycharmprojects\kk\venv\lib\site-packages\pil\image.py), and that led me to the cause of the error.

from PIL import Image
import os
import csv
import time

Reason: the Image.py file in the PIL package had been accidentally emptied, so Image.open() no longer exists.

temp_img_now = Image.open(temp_file)

Solution: uninstall Pillow and pillow-PIL, then reinstall them.
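
Before reinstalling, you can confirm that the module file really is empty. This is a minimal sketch using only the standard library; the helper name module_source_size is made up here, and "json" is just a stand-in module so the example runs anywhere.

```python
import importlib.util
import os


def module_source_size(module_name: str) -> int:
    """Return the size in bytes of the file a module would be loaded from."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None or not os.path.isfile(spec.origin):
        raise ModuleNotFoundError(f"no source file found for {module_name!r}")
    return os.path.getsize(spec.origin)


# "json" is only a stand-in; pass "PIL.Image" to check the file from this post.
print(module_source_size("json") > 0)
```

A result of 0 bytes for PIL.Image would match the cause described above.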

[Solved] selenium.common.exceptions.WebDriverException: Message: unknown error: DevToolsActivePort file doesn

1. Symptom: when Jenkins builds the Selenium project it cannot launch the browser, and the build fails with: selenium.common.exceptions.WebDriverException: Message: unknown error: DevToolsActivePort file doesn’t exist

2. Troubleshooting: running the project code directly from the Jenkins working directory launches the browser normally, and the test cases run to completion.

3. Reason: by default Jenkins runs the automation cases on the master node, and processes started on the master are background processes, so no browser window can be displayed.

4. Solution:

① run the browser in headless mode (no graphical interface)
② add a slave node to Jenkins and assign the project's build to that node
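
Option ① could look like the sketch below. It assumes Chrome with the Selenium Python bindings; the helper names are made up, and the two extra switches besides --headless are common CI settings, not something the original post prescribes.

```python
def headless_chrome_arguments():
    """Chrome switches commonly used on build machines without a display."""
    return [
        "--headless",               # run Chrome without a visible window
        "--no-sandbox",             # often needed when Chrome runs as root in CI
        "--disable-dev-shm-usage",  # avoid crashes when /dev/shm is small
    ]


def create_headless_driver():
    """Create a Chrome driver that needs no graphical session."""
    # Imported here so the helper above works even without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in headless_chrome_arguments():
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

In the test code, create_headless_driver() would replace the existing webdriver.Chrome() call.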

[Solved] yolov5-6.0 ERROR: AttributeError: 'Upsample' object has no attribute 'recompute_scale_factor'

Preface: when detecting a few images with yolov5-6.0, the error in the title occurred. It points to a problem in the upsampling module. The solution is recorded here.

Version: yolov5-6.0, python3.8, pytorch1.11.0

1. Reproducing the problem

2. Official website solution

This problem first appeared in yolov5 and is related to PyTorch 1.11.0.

In other words, it can occur in both train and detect. The suggested solution is to downgrade PyTorch to 1.10 or lower.

The maintainer then released a fix for PyTorch 1.11.0.

But that did not seem to solve it for me: my torch version is 1.11.0 and the problem still occurred.

Another solution, shared by other users, is to modify the upsampling function.

Just comment out this part below.

That really solved it.
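
If you would rather not edit the installed PyTorch source, a different workaround is to give each loaded Upsample module the missing attribute. This is not the method described above, only a hedged sketch: the helper name is made up, and it assumes that setting recompute_scale_factor to None is acceptable for your model.

```python
def patch_upsample_modules(modules):
    """Add recompute_scale_factor=None to Upsample modules that lack it.

    Returns the number of modules that were patched.
    """
    patched = 0
    for module in modules:
        # Compare by class name so this helper does not need to import torch.
        is_upsample = type(module).__name__ == "Upsample"
        if is_upsample and not hasattr(module, "recompute_scale_factor"):
            module.recompute_scale_factor = None
            patched += 1
    return patched
```

With a real model this would be called once after loading the weights, for example patch_upsample_modules(model.modules()).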

[Solved] error while loading shared libraries: libjson.so.0: cannot open shared object file: No such file or

error while loading shared libraries: libjson.so.0: cannot open shared object file: No such file or

Solution:

sudo vim  /etc/ld.so.conf

Add /usr/local/lib, the directory where libjson.so.0 is located, to the file:

include /etc/ld.so.conf.d/*.conf 
/usr/local/lib

Finally, run sudo ldconfig in the terminal.
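
The file edit can be made repeatable, so that running it twice does not add the directory twice. This is a minimal sketch working on the file's text only; the helper name ensure_lib_dir is made up, and writing the result back to /etc/ld.so.conf still needs root, followed by sudo ldconfig.

```python
def ensure_lib_dir(conf_text: str, lib_dir: str = "/usr/local/lib") -> str:
    """Return conf_text with lib_dir on its own line, adding it if absent."""
    existing = [line.strip() for line in conf_text.splitlines()]
    if lib_dir in existing:
        return conf_text  # already configured, nothing to change
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + lib_dir + "\n"


print(ensure_lib_dir("include /etc/ld.so.conf.d/*.conf\n"))
```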

[Solved] with ERRTYPE = cudaError CUDA failure 999 unknown error

Project scenario: [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24

An old program needed to be upgraded. It previously ran on CUDA 10.2.


Problem Description:

Environment:

CUDA 11.2 (previously 10.2)

onnxruntime-gpu 1.10

python 3.9.7

The following error is reported when starting the program:

Traceback (most recent call last):
  File "/home/aiuser/cover/liheng-foggun/app.py", line 15, in <module>
    model = DetectMultiBackend(weights=config.paddle.model_file)
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/aiuser/cover/liheng-foggun/models/yolo.py", line 37, in __init__
    self.session = onnxruntime.InferenceSession(weights, providers=['CUDAExecutionProvider'])
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/aiuser/miniconda3/envs/cover/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 379, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:122 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 999: unknown error ; GPU=24 ; hostname=aiserver-sl-01 ; expr=cudaSetDevice(info_.device_id);

Cause analysis:

1. At first I suspected the onnxruntime-gpu version, but after upgrading to 1.12 the error was still reported.

2. Some reports attribute the error to incompatible versions.

3. I then tried reinstalling the driver. After uninstalling 11.2, nvidia-smi showed that the previous 10.2 driver was still present.

4. So the cause is that the previous driver had not been uninstalled completely.


Solution:

1. Uninstall 10.2

sudo /usr/local/cuda-10.2/bin/cuda-uninstaller

2. Install the new driver

#install 515.57 offline
sudo ./NVIDIA-Linux-x86_64-515.57.run -no-x-check -no-nouveau-check

[Solved] Error: error:0308010C:digital envelope routines::unsupported

Running npm run serve fails with error:0308010C:

Error: error:0308010C:digital envelope routines::unsupported
    at new Hash (node:internal/crypto/hash:71:19)
    at Object.createHash (node:crypto:133:10)
    at module.exports (D:\Item\springbootVue\springboot\vue\node_modules\webpack\lib\util\createHash.js:135:53)
    at NormalModule._initBuildHash (D:\Item\springbootVue\springboot\vue\node_modules\webpack\lib\NormalModule.js:417:16)
    at handleParseError (D:\Item\springbootVue\springboot\vue\node_modules\webpack\lib\NormalModule.js:471:10)
    at D:\Item\springbootVue\springboot\vue\node_modules\webpack\lib\NormalModule.js:503:5

Solution:
set NODE_OPTIONS=--openssl-legacy-provider

D:\Item\springboot\vue>set NODE_OPTIONS=--openssl-legacy-provider

D:\Item\springboot\vue>npm run serve

> [email protected] serve
> vue-cli-service serve

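The project now runs successfully.

The main reason is a version incompatibility: Node.js 17 and later use OpenSSL 3, which by default rejects the legacy hash algorithm that older webpack versions rely on.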
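
If the dev server is started from a script instead of an interactive prompt, the same variable can be set for the child process only. This is a minimal sketch; the helper name with_legacy_openssl is made up, and the commented npm call is an assumption about how the project is launched.

```python
import os


def with_legacy_openssl(env=None):
    """Return a copy of env with --openssl-legacy-provider in NODE_OPTIONS."""
    env = dict(os.environ if env is None else env)
    flag = "--openssl-legacy-provider"
    existing = env.get("NODE_OPTIONS", "")
    if flag not in existing.split():
        # Keep any options that were already set and append the flag.
        env["NODE_OPTIONS"] = f"{existing} {flag}".strip()
    return env


# Possible usage (on Windows, npm is a .cmd file, so shell=True may be needed):
# subprocess.run(["npm", "run", "serve"], env=with_legacy_openssl())
print(with_legacy_openssl({})["NODE_OPTIONS"])
```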

[Solved] adb shell error: error: device unauthorized

Tested successfully on 2022/7/29 with an OPPO R11s running Android 8.

After connecting an Android device to a Windows computer, trying to open a shell on the device from the command line reports an error.

Error message:

C:\Users> adb shell

error: device unauthorized.
This adb server's $ADB_VENDOR_KEYS is not set
Try 'adb kill-server' if that seems wrong.
Otherwise check for a confirmation dialog on your device.

Things to check:

1. Whether the data cable is plugged in firmly

2. Whether developer options are turned on

3. Whether USB debugging is turned on

If all of the above are fine, the following method can solve the problem:

On the command line, enter adb kill-server to stop the ADB service, then enter adb devices, which automatically starts the service again. Confirm the authorization dialog on the device if one appears. After the device is listed, enter adb shell again and it should succeed.

If that unfortunately fails, enter adb start-server to restart the service that was just shut down.

adb kill-server
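
The restart sequence can be scripted as in the sketch below. It assumes adb is on the PATH; the names ADB_RECOVERY_STEPS and run_steps are made up, and the interactive adb shell step is left to be run by hand afterwards.

```python
import subprocess

# Commands from the steps above: stop the server, then list devices
# (listing devices starts the server again automatically).
ADB_RECOVERY_STEPS = [
    ["adb", "kill-server"],
    ["adb", "devices"],
]


def run_steps(steps=ADB_RECOVERY_STEPS, runner=subprocess.run):
    """Run each command in order and return the list of results."""
    results = []
    for command in steps:
        results.append(runner(command, capture_output=True, text=True))
    return results
```

Calling run_steps() and printing the stdout of the last result shows whether the device is now listed as authorized.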

[Solved] RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Background:

Reproducing r2c on Ubuntu 18.04 with a GeForce RTX 3090 graphics card.


Problem:

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Cause analysis:

The GeForce RTX 3090 only supports CUDA 11 and above, so a PyTorch build made for an older CUDA version fails on it.


Solution:

Update pytorch and CUDA versions:

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
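
After reinstalling, it is worth checking which CUDA version the installed PyTorch build was compiled against. This is a minimal sketch; the helper name cuda_version_ok is made up, and the (11, 0) minimum reflects the cause given above.

```python
def cuda_version_ok(cuda_version, minimum=(11, 0)):
    """Return True if a version string such as '11.3' meets the minimum."""
    if cuda_version is None:  # CPU-only builds report no CUDA version
        return False
    major_minor = tuple(int(part) for part in cuda_version.split(".")[:2])
    return major_minor >= minimum


# With PyTorch installed, the build's CUDA version is torch.version.cuda:
#   import torch
#   print(cuda_version_ok(torch.version.cuda))
print(cuda_version_ok("11.3"))
```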

[Solved] Operator Not Allowed In Graph Error & Attribute Error Tensor object has no attribute numpy

These errors occur with custom functions because, in TF 2.x, Keras model.compile runs in graph mode by default, where the concrete values of tensors are not available.

Problem

When wrapping a custom loss function for a Keras model, an error appears if the function computes metrics such as precision or recall, or needs the concrete values of the inputs y_true and y_pred (operations such as y_true.numpy()):

OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.

Or

 AttributeError: 'Tensor' object has no attribute 'numpy'

 

Solution:

Pass the following argument to the compile function:

run_eagerly=True

 

Reason:

TF 2.x enables eager mode (eager execution, that is, a dynamic computation graph) by default. Compared with the static computation graph of TF 1.x, eager mode is convenient for debugging: tensor values can be printed and results evaluated directly, and it interoperates well with NumPy, so converting between tensors and ndarrays is easy. The tradeoff is that it runs significantly slower. Once a static computation graph is defined, it is executed almost entirely by C++ code in the TensorFlow core, so it is more efficient and faster.

Even so, run_eagerly defaults to False in the model.compile method, which means the model's logic is wrapped in tf.function for better computational efficiency (the AutoGraph mechanism, through the @tf.function wrapper, converts the dynamic computation graph into a static one). But the @tf.function wrapper requires the function to use basic tf operations, not arbitrary Python operations or functions from other packages, so the first error occurs when calling functions such as sklearn.metrics' accuracy_score or imblearn.metrics' geometric_mean_score. The second error occurs when using the y_true.numpy() method. The fundamental reason is that, inside the static computation graph produced by the @tf.function wrapper, model.compile does not support these operations, even though TF 2.x uses dynamic computation graphs by default.

After passing run_eagerly=True to the model.compile method, the model runs with the dynamic computation graph and the above operations work normally. The drawback is lower execution efficiency.
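
As an illustration, here is a minimal sketch of a custom loss that reads concrete values and a model compiled with run_eagerly=True. The model architecture, the loss, and the function name are made up for the example; only the run_eagerly=True argument comes from the solution above.

```python
def build_eager_model():
    """Build a tiny Keras model whose custom loss may call .numpy()."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    import tensorflow as tf

    def debug_mse(y_true, y_pred):
        # .numpy() is available here only because run_eagerly=True is set below.
        print("batch targets:", y_true.numpy().ravel()[:3])
        return tf.reduce_mean(tf.square(y_true - y_pred))

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss=debug_mse, run_eagerly=True)
    return model
```

Because eager execution is slower, it can be worth switching run_eagerly back to False once the loss no longer needs concrete values.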

[Solved] ProxyError: Conda cannot proceed due to an error in your proxy configuration.

0. The error

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

The following error is reported when installing PyTorch:

ProxyError: Conda cannot proceed due to an error in your proxy configuration.
Check for typos and other configuration errors in any '.netrc' file in your home directory,
any environment variables ending in '_PROXY', and any other system-wide proxy
configuration settings.

The problem lies in the proxy configuration.

1. Solution

(1) View the proxy variables set in the current terminal

env | grep -i "_PROXY"

(2) Unset the proxy variables one by one

unset HTTP_PROXY
unset HTTPS_PROXY
unset https_proxy
unset http_proxy
unset no_proxy
unset NO_PROXY

(3) Verify that it worked

Enter the command again:

env | grep -i "_PROXY"

If it prints nothing, the proxy variables have been cleared.
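
When conda is launched from a script, the same cleanup can be applied to the child process's environment without touching the shell. This is a minimal sketch; the helper name without_proxy_vars is made up, and the commented conda call is only an example of how it might be used.

```python
import os


def without_proxy_vars(env=None):
    """Return a copy of env without variables whose names end in _PROXY.

    The comparison ignores case, so both HTTP_PROXY and http_proxy are removed.
    """
    env = dict(os.environ if env is None else env)
    return {
        name: value
        for name, value in env.items()
        if not name.upper().endswith("_PROXY")
    }


# Possible usage:
# subprocess.run(["conda", "install", "pytorch", "-c", "pytorch"],
#                env=without_proxy_vars())
print(without_proxy_vars({"http_proxy": "http://example.invalid:8080", "HOME": "/root"}))
```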