Tag Archives: Deep learning

Record a problem of No module named 'tensorflow.examples' and 'tensorflow.examples.tutorials' in TensorFlow 2.x

1: No module named 'tensorflow.examples'
I downloaded TensorFlow directly from the Internet; the version is 2.5. The path where the examples folder should be added is C:/ProgramData/anaconda3/envs/tensorflow/Lib/site-packages/tensorflow. Guides online refer to a tensorflow_core folder instead, but no such folder exists in version 2.5, so all of the following operations are performed directly in site-packages/tensorflow.

First, go to the TensorFlow repository (https://github.com/tensorflow/tensorflow/tree/master/tensorflow), download the examples folder, and copy it into the site-packages/tensorflow folder mentioned above. If you then run your code again, you will hit the next problem: No module named 'tensorflow.examples.tutorials'.

2: No module named 'tensorflow.examples.tutorials'
In the site-packages/tensorflow folder, open the examples folder you just copied in and put a tutorials folder into it (many tutorials folders have been shared online; copy one in directly). The code then runs.

Note: if you have not found a tutorials folder to download, you can go to the TensorFlow repository, switch the branch to a version earlier than 2.4.0, and you will find the tutorials folder inside the examples folder (this method has not been tested; if it works for you, please leave a message in the comments, thank you).
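Alternatively, if all you need from the old tutorials module is the MNIST data, TensorFlow 2 can load it directly through tf.keras, with no examples folder at all. A minimal sketch (the flattening and scaling are illustrative assumptions, not part of the original fix):

# a minimal sketch: load MNIST via tf.keras instead of the removed
# tensorflow.examples.tutorials.mnist.input_data
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0  # flatten and scale to [0, 1]
print(x_train.shape, y_train.shape)  # (60000, 784) (60000,)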

File C:\Users\admin\Documents\WindowsPowerShell\profile.ps1 cannot be loaded because running scripts is disabled on this system

conda activate error

File C:\Users\admin\Documents\WindowsPowerShell\profile.ps1 cannot be loaded because running scripts is disabled on this system.

Solution

Following an answer on Stack Overflow, the steps are as follows:

    Open PowerShell in administrator mode. If you don't know how to open it as an administrator: press Win + R, type powershell, and then press Ctrl + Shift + Enter. Then type:

    Set-ExecutionPolicy RemoteSigned
    

This should allow your system to run conda scripts. If you want to set the policy back later, type the following in an administrator PowerShell:

Set-ExecutionPolicy Restricted

[Solved] RuntimeError: cuda runtime error: device-side assert triggered

When running Faster R-CNN, the original model's 21 categories must be changed to our own number of categories. After the first modification the program ran without errors; after the second modification it reported:

block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed.
RuntimeError: cuda runtime error (59): device-side assert triggered

The main explanation found online is as follows:

The cause is that the training data contains labels that exceed the number of categories. For example, if I configure 8 classes in total but a label 9 appears in the training data, this error is reported. But here is the trap: if a label 0 appears in the training data, the same error is also reported. This seems odd, since we usually count from 0, but in this framework label 0 is reserved (typically for the background class), so a category label of 0 also triggers the error. If your category labels start from 0, add 1 to all of them.
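A quick way to catch this before the CUDA kernel fires is a sanity check over the labels. A minimal sketch, assuming labels is a tensor of class indices from one batch and n_classes is the count the model was built with (both names are placeholders):

# a minimal sketch: verify every label lies in [0, n_classes) before training
def check_labels(labels, n_classes):
    assert labels.min().item() >= 0, 'negative label found'
    assert labels.max().item() < n_classes, (
        'label %d >= n_classes %d' % (labels.max().item(), n_classes))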

Solution:
The first time I ran the program there were 16 categories (4 of them should have been deleted, but I had not noticed). After that run I found the four extra categories and deleted them. But when I ran the program again, it reported the error above. The reason: every time you
run the program, you must delete the cache generated by the previous run. Because I had not deleted it, the program still believed there were 16 categories while only 12 were provided. So if you hit this error, delete the cache and run again.

RuntimeError: each element in list of batch should be of equal size

After defining a custom dataset class and returning the required data, the following error appeared:

RuntimeError: each element in list of batch should be of equal size

Searching online, the most direct fix is to change the value of batch_size to 1, which makes the error go away. But I am training a model, not just silencing errors, and training with batch_size set to 1 is impractical, so I decided to investigate the error.

Original Traceback (most recent call last):
  File "/home/cv/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/cv/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/cv/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/cv/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/cv/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/cv/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 83, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/cv/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 81, in default_collate
    raise RuntimeError('each element in list of batch should be of equal size')

From the traceback you can locate the source of the error: it occurs in the default_collate() function in collate.py. default_collate() is the DataLoader's default batching method; if you do not pass a function via the collate_fn parameter when constructing the DataLoader, this method is called by default. If you see the above error, it comes from the last four lines of this function:

def default_collate(batch):
    r"""Puts each data field into a tensor with outer dimension batch size"""

    elem = batch[0]
    elem_type = type(elem)
    if isinstance(elem, torch.Tensor):
        out = None
        if torch.utils.data.get_worker_info() is not None:
            # If we're in a background process, concatenate directly into a
            # shared memory tensor to avoid an extra copy
            numel = sum([x.numel() for x in batch])
            storage = elem.storage()._new_shared(numel)
            out = elem.new(storage)
        return torch.stack(batch, 0, out=out)
    elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
            and elem_type.__name__ != 'string_':
        if elem_type.__name__ == 'ndarray' or elem_type.__name__ == 'memmap':
            # array of string classes and object
            if np_str_obj_array_pattern.search(elem.dtype.str) is not None:
                raise TypeError(default_collate_err_msg_format.format(elem.dtype))

            return default_collate([torch.as_tensor(b) for b in batch])
        elif elem.shape == ():  # scalars
            return torch.as_tensor(batch)
    elif isinstance(elem, float):
        return torch.tensor(batch, dtype=torch.float64)
    elif isinstance(elem, int_classes):
        return torch.tensor(batch)
    elif isinstance(elem, string_classes):
        return batch
    elif isinstance(elem, container_abcs.Mapping):
        return {key: default_collate([d[key] for d in batch]) for key in elem}
    elif isinstance(elem, tuple) and hasattr(elem, '_fields'):  # namedtuple
        return elem_type(*(default_collate(samples) for samples in zip(*batch)))
    elif isinstance(elem, container_abcs.Sequence):
        # check to make sure that the elements in batch have consistent size
        it = iter(batch)
        elem_size = len(next(it))
        if not all(len(elem) == elem_size for elem in it):
            raise RuntimeError('each element in list of batch should be of equal size')
        transposed = zip(*batch)
        return [default_collate(samples) for samples in transposed]

    raise TypeError(default_collate_err_msg_format.format(elem_type))

This function receives a batch: a tuple whose elements are whatever your dataset class's __getitem__() method returns, with length equal to your batch_size. The DataLoader's output splices together the corresponding fields of the batch_size samples. When this method is called, the first pass reaches the second-to-last line, return [default_collate(samples) for samples in transposed]: zip(*batch) turns the batch tuple into an iterator grouping the same field across samples, each group is recursively passed back into default_collate(), and the first element of each group is used to decide whether its data type is one of those listed above, in which case the dataset contents are returned correctly.
If the batch data is processed in this order, the error does not occur. If after the second recursion an element's data is still not one of the listed types, it enters a third recursion; at that point, even if the data can be returned, it no longer matches what we want, and the error generally appears after the third recursion. So to fix this error, check carefully the data types of the fields returned by your dataset class. You can also print the batch contents before and after processing inside default_collate() to follow the function's flow and find which returned field has the wrong type.
tips: don’t change defaule in the source code file_ The collate () method can copy this code and define its own collate_ Fn() function and specify your own collet when instantiating the dataloader class_ FN function
I hope you can solve the bug as soon as possible and run through the model!

Copying a param with shape torch.Size([262, 2048]): parameter size does not match

A parameter with shape torch.Size([262]) is copied from the checkpoint, while the shape in the current model is torch.Size([290]).

The size of the fc.weight parameter does not match; the parameter just needs to be corrected.

Open the corresponding file in Vim and modify the parameter.

This solves the problem.
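If editing the file is not an option, another common workaround is to drop the mismatched entries while loading the checkpoint. A hedged sketch (the checkpoint path and model variable are placeholders):

import torch

# a sketch: load only the checkpoint entries whose shapes match the current model
state = torch.load('checkpoint.pth', map_location='cpu')  # placeholder path
model_state = model.state_dict()
filtered = {k: v for k, v in state.items()
            if k in model_state and v.shape == model_state[k].shape}
model_state.update(filtered)
model.load_state_dict(model_state)  # the mismatched fc layer keeps its fresh initialization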


How to Solve RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Case 1: the versions of CUDA, cuDNN, PyTorch, and Python do not match. The PyTorch version, in particular, is easy to overlook.

Case 2 (my case): with case 1 ruled out, check whether the graphics card your program uses is already full; if it is, switch to an idle card ('cuda:3' -> 'cuda:0').

Case 3: add the following line to the code:
torch.backends.cudnn.enabled = False
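A hedged sketch combining cases 2 and 3 (whether either helps depends on your setup):

import torch

# case 2: pick an idle GPU; fall back to CPU if none is available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# case 3: disable cuDNN as a last resort; slower, but avoids the failing kernels
torch.backends.cudnn.enabled = False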

[Solved] RuntimeError: CUDA runtime error (30)

RuntimeError: cuda runtime error (30)

I was puzzled by this inexplicable error for a long time. Just as I was about to reinstall the driver and CUDA, I suddenly realized its root cause.
Here's the story: whenever I left a program running and went to rest, I would come back the next day to find, to my disappointment, that it had reported this error shortly after starting. Worse, rerunning the program after the error did not work; only a crude reboot fixed things temporarily. But I noticed that if I kept using the computer during the day, the program would run to completion without this error.
So: it really is CUDA and the GPU, but not because they have bugs. My computer was set to lock the screen and suspend automatically after a fixed idle time, so while I was away the GPU stopped serving the program, and the error followed.
After I set the machine to never suspend, the error never occurred again.
This is the first time I've written up such an experience; I hope it helps you!

Python IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

This is caused by empty tags in the dataset's XML files.
First, what is an empty tag?
<zhoz></zhoz> — this form, i.e. a tag with no value inside.
Normal would be <zhoz>56</zhoz> — this form.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
 
# Move the xml with empty tags and move the corresponding images synchronously
import os
import xml.etree.ElementTree as ET
import shutil
 
origin_ann_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/Annotations/'  # original annotation path
new_ann_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/xml-save/'  # destination for xml files without objects
origin_pic_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/JPEGImages/'
new_pic_dir = '/home/data_1/project/big-obj/RefineDet.PyTorch/data/VOCdevkit/VOC2007/pic-save/'
k = 0  # files scanned
p = 0  # files containing <object> tags
q = 0  # empty annotation files moved
for dirpaths, dirnames, filenames in os.walk(origin_ann_dir):  
  for filename in filenames:
    print("process...")
    k=k+1
    print(k)
    if os.path.isfile(r'%s%s' %(origin_ann_dir, filename)):   # isfile() checks it is a file (isdir would check for a directory)
      origin_ann_path = os.path.join(r'%s%s' %(origin_ann_dir, filename)) # absolute path to the original xml file
      new_ann_path = os.path.join(r'%s%s' %(new_ann_dir, filename))
      tree = ET.parse(origin_ann_path)  
      root = tree.getroot()   
      if len(root.findall('object')):
        p=p+1
      else:
        print(filename)
        old_xml = origin_ann_dir + filename
        new_xml = new_ann_dir + filename
        old_pic = origin_pic_dir + filename.replace("xml","jpg")
        new_pic = new_pic_dir + filename.replace("xml","jpg")
        q=q+1
        shutil.move(old_pic, new_pic)
        shutil.move(old_xml, new_xml)
print("ok, ",p)
print("empty, ",q)

This finds the XML files with empty annotations (no <object> element). The contents of one such file are as follows:

<annotation>
	<folder>obj1344</folder>
	<filename>obj1344_frame0000172.jpg</filename>
	<path>D:\Research\valid\obj1344\obj1344_frame0000172.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>480</width>
		<height>270</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
</annotation>

After such an XML file is found, the image needs to be annotated again. After re-annotating the data in labelImg, the XML file's content looks like this:

<annotation>
	<folder>JPEGImages</folder>
	<filename>obj1344_frame0000172.jpg</filename>
	<path>D:\new\SSD-master\datasets\VOC2007\JPEGImages\obj1344_frame0000172.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>480</width>
		<height>270</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>plastic</name>
		<pose>Unspecified</pose>
		<truncated>1</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>181</xmin>
			<ymin>144</ymin>
			<xmax>355</xmax>
			<ymax>270</ymax>
		</bndbox>
	</object>
	<object>
		<name>timestamp</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>10</xmin>
			<ymin>5</ymin>
			<xmax>471</xmax>
			<ymax>43</ymax>
		</bndbox>
	</object>
	<object>
		<name>timestamp</name>
		<pose>Unspecified</pose>
		<truncated>1</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>14</xmin>
			<ymin>216</ymin>
			<xmax>480</xmax>
			<ymax>270</ymax>
		</bndbox>
	</object>
</annotation>

Replace the original XML file with the corrected one, and the program runs.

PyTorch RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 1, 5, 5]

1. Problem introduction

Today, while using PyTorch to train a model, the dataset was read and preprocessed with the functions PyTorch provides. The network is a custom CNN, and running it produced the small error shown in the title.

2. Error message

As follows:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 1, 5, 5], but got 2-dimensional input of size [32, 784] instead

3. Code

First of all, my own customized CNN network is as follows:

class MNIST_Model(nn.Module):
    def __init__(self, n_in):
        super(MNIST_Model, self).__init__()

        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channels=n_in,
                      out_channels=32,
                      kernel_size=(5, 5),
                      padding=2,
                      stride=1),
        )

        self.maxp1 = nn.MaxPool2d(
                       kernel_size=(2, 2))

        self.conv2 = nn.Sequential(
            nn.Conv2d(in_channels=32,
                      out_channels=64,
                      kernel_size=(5, 5),
                      padding=0,
                      stride=1),
        )

        self.maxp2 = nn.MaxPool2d(kernel_size=(2, 2))
        
        self.fc1 = nn.Sequential(
            nn.Linear(in_features=64 * 5 * 5, out_features=200)  # Mnist
        )

        self.fc2 = nn.Sequential(
            nn.Linear(in_features=200, out_features=10),
            nn.ReLU()
        )


    def forward(self, x):
        x = self.conv1(x)
        x = self.maxp1(x)
        x = self.conv2(x)
        x = self.maxp2(x)
        x = x.contiguous().view(x.size(0), -1)
        x = self.fc1(x)
        x = self.fc2(x)
        return x

Then the training code:

#Instantiate the network and move it to the selected device
model = model.MNIST_Model(1)
net = model.to(device)
# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
# momentum: momentum factor for SGD
optimizer = optim.SGD(model.parameters(),lr=lr,momentum=momentum)


#Start training First define the array that stores the loss function and accuracy
losses = []
acces = []
#For testing
eval_losses = []
eval_acces = []

for epoch in range(nums_epoches):
    #Clear each training first
    train_loss = 0
    train_acc = 0
    # Set the model to training mode
    model.train()
    #Dynamic learning rate
    if epoch%5 == 0:
        optimizer.param_groups[0]['lr'] *= 0.1
    for img,label in train_loader:
        #Forward propagation, passing the image data into the model
        # out outputs 10 dimensions, respectively the probability of each number, i.e. the score for each category
        out = model(img)
        # Note here that the parameter out is 64*10 and label is a one-dimensional 64
        loss = criterion(out,label)
        #backpropagation
        #optimizer.zero_grad() means to set the gradient to zero, that is, the derivative of loss with respect to weight becomes zero
        optimizer.zero_grad()
        loss.backward()
        #This method updates all the parameters, and once the gradient has been calculated by a function such as backward(), we can call this function
        optimizer.step()
        
        # Record the error 
        train_loss += loss.item()
        
        #Calculate the accuracy of the classification, find the subscript with the highest probability
        _,pred = out.max(1)
        num_correct = (pred == label).sum().item()#record the number of correct labels
        acc = num_correct/img.shape[0]
        train_acc += acc
    losses.append(train_loss/len(train_loader))
    acces.append(train_acc/len(train_loader))
    
    eval_loss = 0
    eval_acc = 0
    model.eval()
    for img,label in test_loader:
        img = img.view(img.size(0),-1)
        
        out = model(img)
        loss = criterion(out,label)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        eval_loss += loss.item()
        
        _,pred = out.max(1)
        num_correct = (pred == label).sum().item()
        acc = num_correct/img.shape[0]
        eval_acc += acc
    eval_losses.append(eval_loss/len(test_loader))
    eval_acces.append(eval_acc/len(test_loader))
    

    print('epoch:{},Train Loss:{:.4f},Train Acc:{:.4f},Test Loss:{:.4f},Test Acc:{:.4f}'
             .format(epoch,train_loss/len(train_loader),train_acc/len(train_loader),
                    eval_loss/len(test_loader),eval_acc/len(test_loader)))

4. Analyzing the cause

Locate the error from the traceback:

Traceback (most recent call last):
  File "train.py", line 73, in <module>
    out = model(img)
  File "/home/gzdx/anaconda3/envs/Torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/gzdx/wyf/PARAD/model.py", line 48, in forward
    x = self.conv1(x)
  File "/home/gzdx/anaconda3/envs/Torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/gzdx/anaconda3/envs/Torch/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/gzdx/anaconda3/envs/Torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/gzdx/anaconda3/envs/Torch/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/gzdx/anaconda3/envs/Torch/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 396, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 1, 5, 5], but got 2-dimensional input of size [32, 784] instead

As the traceback shows, the data passed into the CNN has the wrong number of dimensions: the first convolution expects a 4-dimensional input (batch, channels, height, width), but a 2-dimensional tensor of size [32, 784] was passed in. The offending line:

  File "train.py", line 73, in <module>
    out = model(img)

5. Solutions

There are many different solutions to this kind of problem online. I referred to some ideas given by others and made my own modifications, after which the error was resolved, as shown below:

for i,data in enumerate(train_loader):
        #Forward propagation, passing the image data into the model
        # out output 10 dimensions, respectively the probability of each number, i.e. the score of each category
        inputs, labels = data[0].to(device), data[1].to(device)
        # inputs torch.Size([32, 1, 28, 28])
        out = model(inputs)

The fix is simple: at the start of the training loop, read each batch this way (inputs keeps its 4-dimensional shape, torch.Size([32, 1, 28, 28])) and pass it into the model; the error no longer appears.
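If instead your loader yields already-flattened vectors, an alternative sketch is to restore the 4-dimensional shape before the forward pass; MNIST images are 1x28x28, which matches the [32, 784] in the error message:

# a sketch inside the training loop: restore the shape Conv2d expects
for img, label in train_loader:
    img = img.view(img.size(0), 1, 28, 28)  # [batch, 784] -> [batch, 1, 28, 28]
    out = model(img)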

6. Complete code

import numpy as np
import model
import torch

#Importing PyTorch's built-in mnist data
from torchvision.datasets import mnist

#Import pre-processing module
from torchvision import transforms
from torch.utils.data import DataLoader

#Importing neural network tools
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

#Define the hyperparameters to be used later
train_batch_size = 32
test_batch_size = 32

#Learning rate and number of training sessions
learning_rate = 0.01
nums_epoches = 50

#Parameters used when optimizer
lr = 0.1
momentum = 0.5

#Use compose to specify the preprocessor
transform = transforms.Compose([transforms.ToTensor(),transforms.Normalize([0.5],[0.5])])

#Download the data, create a new data folder in the project folder to store the downloaded data
train_dataset = mnist.MNIST('./data', train=True, transform=transform, target_transform=None, download=True)
test_dataset = mnist.MNIST('./data', train=False, transform=transform, target_transform=None, download=True)

#Data loaders, combined datasets and samplers, and single or multi-process iterators on datasets
train_loader = DataLoader(train_dataset, batch_size=train_batch_size, shuffle=True, num_workers=0)
test_loader = DataLoader(test_dataset, batch_size=test_batch_size, shuffle=False)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

#Instantiate the network and move it to the selected device
model = model.MNIST_Model(1)
net = model.to(device)
# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
# momentum: momentum factor for SGD
optimizer = optim.SGD(model.parameters(),lr=lr,momentum=momentum)




#Start the training by defining an array that stores the loss function and the accuracy
losses = []
acces = []
#For testing
eval_losses = []
eval_acces = []

for epoch in range(nums_epoches):
    #Clear each training first
    train_loss = 0
    train_acc = 0
    # Set the model to training mode
    model.train()

    #Dynamic learning rate
    if epoch%5 == 0:
        optimizer.param_groups[0]['lr'] *= 0.1
    for i,data in enumerate(train_loader):
        #Forward propagation, passing the image data into the model
        # out output 10 dimensions, respectively the probability of each number, i.e. the score of each category
        inputs, labels = data[0].to(device), data[1].to(device)
        out = model(inputs)
        #Note here that the parameter out is 64*10 and label is 64 in one dimension
        loss = criterion(out,labels)
        #backpropagation
        #optimizer.zero_grad() means to set the gradient to zero, that is, to make the derivative of loss with respect to weight zero
        optimizer.zero_grad()
        loss.backward()
        # This method updates all the parameters, and once the gradient has been calculated by a function like backward(), we can call this function
        optimizer.step()
        
        #Record the error 
        train_loss += loss.item()
        
        # Calculate the accuracy of the classification, find the subscript with the highest probability
        _,pred = out.max(1)
        num_correct = (pred == labels).sum().item() # Record the number of correct labels
        acc = num_correct/inputs.shape[0]
        train_acc += acc
    losses.append(train_loss/len(train_loader))
    acces.append(train_acc/len(train_loader))
    print('Finished Training') 

    # save
    PATH = './model/mnist_net.pth'
    torch.save(net.state_dict(), PATH)
    
    eval_loss = 0
    eval_acc = 0
    model.eval()
    for i,data in enumerate(test_loader):
        inputs, labels = data[0].to(device), data[1].to(device)
        out = model(inputs)
        loss = criterion(out,labels)
        
        # no zero_grad()/backward()/step() here: parameters must not be updated during evaluation
        
        eval_loss += loss.item()
        
        _,pred = out.max(1)
        num_correct = (pred == labels).sum().item()
        acc = num_correct/inputs.shape[0]
        eval_acc += acc
    eval_losses.append(eval_loss/len(test_loader))
    eval_acces.append(eval_acc/len(test_loader))
    

    print('epoch:{},Train Loss:{:.4f},Train Acc:{:.4f},Test Loss:{:.4f},Test Acc:{:.4f}'
             .format(epoch,train_loss/len(train_loader),train_acc/len(train_loader),
                    eval_loss/len(test_loader),eval_acc/len(test_loader)))

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded

When using visdom, the following problem occurs:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000027F8769B7F0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
[WinError 10061] No connection could be made because the target machine actively refused it.
Visdom python client failed to establish socket to get messages from the server. This feature is optional and can be disabled by initializing Visdom with `use_incoming_socket=False`, which will prevent waiting for this request to timeout.
Setting up a new session…

The problem above also occurs when the visdom server has not been started.
Solution: start the server first:

python -m visdom.server
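Once the server is up, a minimal sketch of connecting from Python (port 8097 is visdom's default):

import visdom

# connect to a visdom server started with `python -m visdom.server`
viz = visdom.Visdom(server='http://localhost', port=8097)
assert viz.check_connection(), 'visdom server is not reachable'
viz.text('hello, visdom')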

Facenet validate_on_lfw.py Error: AssertionError: The number of LFW images must be an integer multiple of the LFW batch size

When validating the model on collected Asian faces, run the following command:

python validate_on_lfw.py ../data/AsiaStar_160 ../20180402-114759 --lfw_pairs ../data/pairs_AsiaStar.txt

The error is as follows:

AssertionError: The number of LFW images must be an integer multiple of the LFW batch size

The number of images must be an integer multiple of the batch size. The batch size is 100, so the number of images must be a multiple of 100. When generating pairs.txt, make sure the sample pairs yield a multiple of 100 images.
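A hedged sketch of trimming the pairs so the image count divides evenly; the pairs list and the two-images-per-pair layout are assumptions about how pairs.txt is generated:

# keep only as many pairs as make the total image count (2 per pair)
# an exact multiple of the 100-image batch size
batch_size = 100
images_per_pair = 2
total_images = len(pairs) * images_per_pair
keep_pairs = (total_images // batch_size) * batch_size // images_per_pair
pairs = pairs[:keep_pairs]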

Python AttributeError: module ‘tensorflow‘ has no attribute ‘InteractiveSession‘

Error occurred while running tensorflow:

AttributeError: module 'tensorflow' has no attribute 'InteractiveSession'

This is not an error in the package itself: the Session module was removed in the new TensorFlow 2 release. Change the code:

sess = tf.InteractiveSession()

Replace with:

sess = tf.compat.v1.InteractiveSession()

Similarly, for other tf.** calls like this in the code, insert compat.v1 after tf. (i.e. tf.compat.v1.**).
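For code with many such v1 calls, a common alternative sketch is to import the compat shim once and disable v2 behavior, instead of editing every call:

# run TF1-style code under TensorFlow 2 via the compat.v1 shim
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

sess = tf.InteractiveSession()
print(sess.run(tf.constant('hello')))
sess.close()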

If you are not used to the new API, you can downgrade TensorFlow instead:

pip install tensorflow==1.14