Tag Archives: python

Vscode Tensorboard Error: We failed to start a TensorBoard session due to the following error: Command fa

When VS Code opens TensorBoard, it reports the following error:

We failed to start a TensorBoard session due to the following error: Command failed: conda activate python && echo 'e8b39361-0157-4923-80e1-22d70d46dee6' && python /home/zhangyulan/.vscode-server/extensions/ms-python.python-2022.14.0/pythonFiles/printEnvVariables.py

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

The main cause appears to be a recent version update of the VS Code Python extensions.

Solution:

1. In the .vscode-server/bin directory, delete the lock file (named like xxxx-lock xxx) if one exists. If there is no such file, skip this step.

2. Roll the Python and Pylance extensions of VS Code back to 2022.14.0 and 2022.9.10, respectively, which are the versions from one month earlier. If VS Code does not let you go back to the version from one month ago, roll back to an older one, such as the version from one year ago.

[Solved] ssl_client_socket_impl.cc handshake failed (Same Codes in Different Environments)

First of all, the same script (same code, same plug-in versions) runs without problems in my local Windows 11 environment.

However, on a freshly installed Windows 10 machine it reports the error ssl_client_socket_impl.cc handshake failed:

[19852:2032:0912/202419:ERROR:ssl_client_socket_impl.cc(983)] handshake failed;
 returned -1, SSL error code 1, net_error -100

I added the following two options, but the loop still reports the error and the script stops immediately:

options.add_argument('--ignore-certificate-errors')
options.add_argument('--ignore-ssl-errors')

The chromedriver version matches the Chrome version.

I have not found any other cause.

[Solved] Pycharm Failed to Upload: Upload to *** failed. Could not list the contents of folder “sftp

Problem description:

I use PyCharm to connect to a remote server. It had been working well, but uploads suddenly started failing with the following message:

Upload to *** failed. Could not list the contents of folder “sftp:***”. (Timeout expired)

Analysis and Solution:

First, check the SFTP connection. It tests as normal.

The usual causes are missing permissions on the remote server or a wrong path mapping. Strangely, checking both turned up no problem, so I suspected the path settings themselves.

Previously, Root path was / and Deployment path was the project path. After changing Root path to the project path and Deployment path to /, uploads work normally again.
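
To see why the two settings can point at the same remote folder, here is a minimal sketch. It assumes the upload target is formed by appending the deployment path to the root path; the function name and the example project path are hypothetical and not taken from PyCharm itself.

```python
import posixpath


def remote_target(root_path, deployment_path):
    """Join an SFTP root path with a deployment path relative to it."""
    # Strip the leading slash so the deployment path is treated as relative to the root.
    relative = deployment_path.lstrip('/')
    return posixpath.normpath(posixpath.join(root_path, relative))


# Hypothetical project location on the server.
project = '/home/user/project'

before = remote_target('/', project)  # Root path "/", deployment path = project path
after = remote_target(project, '/')   # Root path = project path, deployment path "/"
print(before, after)
```

Under this assumption both settings resolve to the same folder. With the narrower root path the client does not need to list directories outside the project, which may be why the timeout disappears.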

[Solved] PyTorch Error: TypeError: exceptions must derive from BaseException

Project scenario:

PyTorch reports an error: TypeError: exceptions must derive from BaseException


Problem description

In base_options.py, the --netG parameter is restricted to the following choices:

self.parser.add_argument('--netG', type=str, default='p2hed', choices=['p2hed', 'refineD', 'p2hed_att'], help='selects model to use for netG')

However, when selecting netG, the code is written as follows:

def define_G(input_nc, output_nc, ngf, netG, n_downsample_global=3, n_blocks_global=9, n_local_enhancers=1, 
             n_blocks_local=3, norm='instance', gpu_ids=[]):    
    norm_layer = get_norm_layer(norm_type=norm)     
    if netG == 'p2hed':    
        netG = DDNet_p2hED(input_nc, output_nc, ngf, n_downsample_global, n_blocks_global, norm_layer)
    elif netG == 'refineDepth':
        netG = DDNet_RefineDepth(input_nc, output_nc, ngf, n_downsample_global, n_blocks_global, n_local_enhancers, n_blocks_local, norm_layer)
    elif netG == 'p2h_noatt':        
        netG = DDNet_p2hed_noatt(input_nc, output_nc, ngf, n_downsample_global, n_blocks_global, n_local_enhancers, n_blocks_local, norm_layer)
    else:
        raise('generator not implemented!')
    #print(netG)
    if len(gpu_ids) > 0:
        assert(torch.cuda.is_available())   
        netG.cuda(gpu_ids[0])
    netG.apply(weights_init)
    return netG

Cause analysis:

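Note that define_G has no branch for 'refineD' (it checks 'refineDepth' instead), so a valid --netG choice falls through to the else branch. That branch executes raise('generator not implemented!'), which raises a plain string rather than an exception instance, so Python reports TypeError: exceptions must derive from BaseException instead of the intended message. The same kind of mismatch exists between the 'p2hed_att' choice and the 'p2h_noatt' branch.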


Solution:

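Change "elif netG == 'refineDepth':" to "elif netG == 'refineD':" and the network is selected correctly. It also helps to replace raise('generator not implemented!') with raise NotImplementedError('generator not implemented!'), so that an unknown name produces a readable error.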
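
The following minimal sketch reproduces the misleading TypeError and shows the corrected pattern. The function names and the string labels are stand-ins for illustration; the real define_G builds network objects instead.

```python
def raise_plain_string():
    # Same pattern as the original else-branch: raising a str, not an exception.
    raise('generator not implemented!')


def select_generator(name):
    """Return a label for the requested generator (stand-in for building the network)."""
    if name == 'p2hed':
        return 'DDNet_p2hED'
    elif name == 'refineD':
        return 'DDNet_RefineDepth'
    else:
        # Raise a real exception class so the message reaches the user.
        raise NotImplementedError(f'generator {name!r} not implemented!')


try:
    raise_plain_string()
except TypeError as err:
    string_raise_message = str(err)

try:
    select_generator('unknown')
except NotImplementedError as err:
    fixed_message = str(err)

print(string_raise_message)
print(fixed_message)
```

The first message is the confusing one from the original report; the second names the unsupported generator directly.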

torchvision.dataset Failed to Download CIFAR10 Error [How to Solve]

An error occurred while using torchvision.datasets to download the dataset:

urllib.error.URLError: urlopen error unknown url type: https

Assuming the cause was that ssl had not been imported, I added the following lines:

import ssl
ssl._create_default_https_context = ssl._create_unverified_context

On the next run, import ssl itself fails with a DLL load failed error.

Solution:

First, configure the environment variables: find the Python installation directory of the current environment and add the following three paths to the system PATH variable:

E:\Anaconda3\envs\pytorch;            # directory containing python.exe
E:\Anaconda3\envs\pytorch\Scripts;
E:\Anaconda3\envs\pytorch\Library\bin

Then find the files libcrypto-1_1.dll and libssl-1_1.dll in the bin folder and copy them into the DLLs folder of the same environment.

This solves the download problem.
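
Before and after copying the DLLs, a quick check like the sketch below can confirm whether the ssl module loads in the current interpreter. The helper name ssl_status is made up for this example.

```python
import importlib


def ssl_status():
    """Return (ok, detail): whether ssl imports, plus the OpenSSL version or the error text."""
    try:
        ssl = importlib.import_module('ssl')
    except ImportError as err:
        # On Windows, a missing libssl/libcrypto DLL surfaces here as "DLL load failed".
        return False, str(err)
    return True, ssl.OPENSSL_VERSION


ok, detail = ssl_status()
print('ssl import ok:', ok, '|', detail)
```

If ok is False, the https handler of urllib is unavailable, which matches the "unknown url type: https" message above.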

Jetson MONAILabel(arm) Failed to Run distributed Module [How to Solve]

Solution:

  • Follow the PyTorch for Jetson thread: https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048
  • Switch the PyTorch version to v1.11.0
  • Then install the development packages needed by the distributed module:
  • sudo apt-get install python3-pip libopenblas-base libopenmpi-dev libomp-dev

Cause analysis:

The distributed module can be called in v1.11.0. There may be something wrong with how the official whl for the newer version was compiled.
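
One way to check whether the installed wheel was built with distributed support is sketched below. It relies on torch.distributed.is_available() and returns None when torch is not installed at all; the wrapper name is only illustrative.

```python
import importlib.util


def distributed_available():
    """Return True/False from torch.distributed.is_available(), or None if torch is absent."""
    if importlib.util.find_spec('torch') is None:
        return None
    import torch.distributed as dist
    return dist.is_available()


status = distributed_available()
print('torch.distributed available:', status)
```

A result of False would suggest the wheel was compiled without the distributed backend, in which case switching to the v1.11.0 build as described above is worth trying.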

[Solved] PySide2 Failed to Create a Graphical Python Program

Error display

This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: direct2d, minimal, offscreen, windows.

Solution:

1. Install pyside2

pip install -U pyside2

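2. Find Lib\site-packages\PySide2\__init__.py under the Python installation directory

3. Add the following code:

import os  # needed for the path handling below (harmless if already imported)

# Point Qt at the platform plugins shipped inside the PySide2 package.
dirname = os.path.dirname(__file__)
plugin_path = os.path.join(dirname, 'plugins', 'platforms')
os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = plugin_path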
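
As an alternative to editing the installed package, the same environment variable can be set from your own script before PySide2 is imported. This is only a sketch under that assumption; the helper names are hypothetical.

```python
import importlib.util
import os


def platform_plugin_dir(package_dir):
    """Return the platforms plugin folder inside a PySide2 package directory."""
    return os.path.join(package_dir, 'plugins', 'platforms')


def configure_qt_plugin_path(package_name='PySide2'):
    """Set QT_QPA_PLATFORM_PLUGIN_PATH if the package is installed; return the path or None."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None  # package not installed, nothing to configure
    plugin_dir = platform_plugin_dir(os.path.dirname(spec.origin))
    os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = plugin_dir
    return plugin_dir


configured = configure_qt_plugin_path()
print('Qt platform plugin path:', configured)
```

Call configure_qt_plugin_path() at the top of the program, before any PySide2 import, so that Qt sees the variable when it initializes.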

Jupyter notebook Failed to Switch to the Virtual Environment: DLL load failed python.exe could not find the entry

I. Error reporting

I originally installed Anaconda3 with Python 3.7 and Jupyter Notebook, and set up a tf2.0 environment.

Later I created a new virtual environment, tf_1, based on the tf2.0 environment and installed tf1.14 in it, so that the tf1.x and tf2.0 versions can be switched freely in Jupyter Notebook.

Starting the notebook directly from cmd and creating a new Python 3 notebook works, which means the default tf2.0 environment is OK.

However, creating a new notebook with the tf_1 kernel reports the following error, and opening a tf2.0 notebook and changing its kernel to tf_1 reports the same error:

ImportError: DLL load failed: The specified module could not be found

II. Solution

Fixing the Jupyter Notebook startup and runtime errors

1. ImportError: DLL load failed: The specified module could not be found

Solution: open the Windows console (cmd) and run conda activate followed by the name of the virtual environment.

For example, the name of the virtual environment here is tf_1.

If you do not remember the name, you can look it up in the envs folder of the Anaconda installation directory:

D:\software\Anaconda_candy\envs\tf_1

2. python.exe entry point not found (the procedure entry point could not be located)
If an error still pops up after activating the virtual environment, proceed as follows.

This error pops up when I start Jupyter Notebook, but after dismissing the dialog I can still use the notebook and debug code normally. I first suspected a problem with a DLL file. After reading some solutions online, this is what worked:

Solution: pythoncom37.dll is a pywin32 file located in Anaconda3\envs\<your virtual environment>\Lib\site-packages\pywin32_system32.

A file with the same name, pythoncom37.dll, also exists in D:\python\Anaconda3\envs\tf_1\Library\bin.

After deleting this duplicate, the pop-up error no longer appears.

If the error is still reported after deleting the pythoncom37.dll file at the path shown in the pop-up box:

(Prerequisite: the virtual environment is active, i.e. conda activate tf_1.)

Follow the file path given in the new pop-up box, find pythoncom37.dll there, and delete it as well. That resolves the problem.
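
To locate every copy of the DLL before deleting anything, a small search like the sketch below may help. The function name is made up for this example, and the environment path is the one quoted in the text; adjust it to your installation.

```python
import os


def find_dll_copies(env_dir, dll_name='pythoncom37.dll'):
    """Return the sorted paths of every file named dll_name under env_dir."""
    matches = []
    for folder, _subdirs, files in os.walk(env_dir):
        for name in files:
            # Windows file names are case-insensitive, so compare in lower case.
            if name.lower() == dll_name.lower():
                matches.append(os.path.join(folder, name))
    return sorted(matches)


if __name__ == '__main__':
    # Example environment path taken from the text above.
    for path in find_dll_copies(r'D:\python\Anaconda3\envs\tf_1'):
        print(path)
```

The copy under Lib\site-packages\pywin32_system32 is the one that belongs to pywin32; the others are the candidates for removal described above.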

With the steps above, the problem is solved.

Switching the kernel inside a notebook no longer reports errors either.

You can pick tf2.0 or the tf_1 virtual environment from the drop-down when creating a new notebook, or switch environments within the current notebook.

[Solved] PyTorch Lightning Error: KeyError: ‘hidden_states‘

Problem description: PyTorch Lightning reports KeyError: 'hidden_states' when the model is loaded like this:

model = BertModel.from_pretrained('bert-base-uncased')

Solution: pass an extra config argument, config=BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True), when loading the model:

model = BertModel.from_pretrained('bert-base-uncased', config=BertConfig.from_pretrained('bert-base-uncased',output_hidden_states=True))

[Solved] RuntimeError: NCCL error in: XXX, unhandled system error, NCCL version 2.7.8

Project scenario:

This problem occurred during distributed training.


Problem description

The parallel processes were probably not launched.


Solution:

(1) First, check the server's GPU information. In the PyTorch environment, start the Python interpreter and enter:

python
import torch
torch.cuda.is_available()      # check whether CUDA is available
torch.cuda.device_count()      # number of GPUs
torch.cuda.get_device_name(0)  # name of the GPU; device indices start at 0
torch.cuda.current_device()    # index of the current device

Exit the interpreter with exit() or Ctrl+D (Ctrl+Z followed by Enter on Windows).
(2) cd into the parent folder of the script to be run and launch the parallel processes:

 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python -m torch.distributed.launch --nproc_per_node=6  # launch parallel processes

followed by the script to run and its arguments:

 CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python -m torch.distributed.launch --nproc_per_node=6 src_nq/create_examples.py \
   --vocab_file ./bert-base-uncased-vocab.txt \
   --input_pattern "./natural_questions/v1.0/train/nq-train-*.jsonl.gz" \
   --output_dir ./natural_questions/nq_0.03/ \
   --do_lower_case \
   --num_threads 24 --include_unknowns 0.03 --max_seq_length 512 --doc_stride 128

Problem solved!
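
The GPU list and the process count in the launch command have to agree. The sketch below derives both from one list of GPU ids; build_launch is a hypothetical helper, not part of PyTorch.

```python
def build_launch(script, gpu_ids, script_args=()):
    """Return (env, argv) for a torch.distributed.launch run with one process per GPU."""
    gpu_ids = list(gpu_ids)
    # The visible devices and the process count come from the same list.
    env = {'CUDA_VISIBLE_DEVICES': ','.join(str(i) for i in gpu_ids)}
    argv = ['python', '-m', 'torch.distributed.launch',
            f'--nproc_per_node={len(gpu_ids)}', script, *script_args]
    return env, argv


env, argv = build_launch('src_nq/create_examples.py', range(6), ['--do_lower_case'])
print(env['CUDA_VISIBLE_DEVICES'], ' '.join(argv))
```

The result can be handed to subprocess.run(argv, env={**os.environ, **env}) if you prefer to start the job from Python rather than from the shell.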

[Solved] RuntimeError: Error(s) in loading state dict for YOLOX:

After training the model, an error occurs when running the demo.py inference script in YOLOX. The command that triggers the error is:

python tools/demo.py image -f exps/example/yolox_voc/yolox_voc_s.py -c YOLO_outputs/yolox_voc_s_1/best_ckpt.pth  --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [cpu/gpu]

Note:

 -f exps/example/yolox_voc/yolox_voc_s.py

The experiment file passed with -f must be the one you configured for training, not the default yolox_s.py used for testing before training. Otherwise the state dict loading error keeps occurring.

If that argument is correct and the error still occurs, the cause is a class mismatch in the demo.

In my case I use a VOC-format dataset, but demo.py defaults to COCO_CLASSES, so it is bound to fail until demo.py is changed.

First, open yolox/data/datasets/__init__.py and add the following line:

from .voc_classes import VOC_CLASSES

Then open tools/demo.py.

Around line 15, change

from yolox.data.datasets import COCO_CLASSES

to

from yolox.data.datasets import VOC_CLASSES

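Around line 100, change the default value of cls_names in Predictor from COCO_CLASSES to VOC_CLASSES.

Around line 300, in the function that creates the Predictor, likewise replace COCO_CLASSES with VOC_CLASSES.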

After these changes, the demo runs without errors.
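
A size mismatch when loading a state dict usually means the model was built for a different number of classes than the checkpoint. The sketch below compares two name-to-shape mappings; the parameter names and shapes are illustrative values only, not taken from YOLOX. With PyTorch, such a mapping could be built as {name: tuple(t.shape) for name, t in state_dict.items()}.

```python
def shape_mismatches(model_shapes, checkpoint_shapes):
    """Return {name: (model_shape, checkpoint_shape)} for parameters whose shapes differ."""
    return {
        name: (shape, checkpoint_shapes[name])
        for name, shape in model_shapes.items()
        if name in checkpoint_shapes and checkpoint_shapes[name] != shape
    }


# Illustrative values: 80 classes (COCO) in the model vs 20 classes (VOC) in the checkpoint.
model_shapes = {
    'head.cls_pred.weight': (80, 128, 1, 1),
    'backbone.stem.weight': (32, 12, 3, 3),
}
checkpoint_shapes = {
    'head.cls_pred.weight': (20, 128, 1, 1),
    'backbone.stem.weight': (32, 12, 3, 3),
}

mismatched = shape_mismatches(model_shapes, checkpoint_shapes)
print(mismatched)
```

If only the classification layers show up in the result, the experiment file or class list is the likely culprit rather than a corrupted checkpoint.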