Category Archives: Error

Tensorflow Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS

  Server environment:

    Ubuntu 16.04.4
    TensorFlow 1.13.1
    CUDA 10.0
    cuDNN 7.4.5

Recently, while running the PointASNL point cloud classification demo, the following errors appeared during training whenever batch_size was set relatively large:

2020-06-12 00:14:01.824110: E tensorflow/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2020-06-12 00:14:01.824142: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:273] Unexpected Event status: 1

At first I suspected a bug in the GPU code, but repeated checks turned up nothing wrong.

After searching online, I began to suspect a version mismatch in the environment.

After downgrading cuDNN from 7.4.5 to 7.3.1, the problem appears to be solved. Hopefully it stays that way.

[Solved] JSON parse error: Unexpected character (''' (code 39)): was expecting double-quote to star

Warning: [http-nio-8080-exec-6] org.springframework.web.servlet.mvc.support.DefaultHandlerExceptionResolver.handleHttpMessageNotReadable Failed to read HTTP message: org.springframework.http.converter.HttpMessageNotReadableException: JSON parse error: Unexpected character (''' (code 39)): was expecting double-quote to start field name; nested exception is com.fasterxml.jackson.core.JsonParseException: Unexpected character (''' (code 39)): was expecting double-quote to start field name
 at [Source: (PushbackInputStream); line: 1, column: 3]

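Cause: the JSON in the request body is malformed.

The field names were wrapped in single quotation marks ('), which is character code 39 in the message above, but JSON requires double quotation marks ("). Changing the single quotes to double quotes resolves the error.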

[Solved] Kubeadm Reset error: etcdserver: re-configuration failed due to not enough started members

Error information:

[root@bogon log]# kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed?[y/N]: y
[preflight] Running pre-flight checks
[reset] Removing info for node "bogon" from the ConfigMap "kubeadm-config" in the "kube-system" Namespace
{"level":"warn","ts":"2021-07-03T08:19:14.041-0400","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7295b53f-6c7d-4a5e-8795-ab4b33048049/192.168.28.128:2379","attempt":0,"error":"rpc error: code = Unknown desc = etcdserver: re-configuration failed due to not enough started members"}
{"level":"warn","ts":"2021-07-03T08:19:14.096-0400","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7295b53f-6c7d-4a5e-8795-ab4b33048049/192.168.28.128:2379","attempt":0,"error":"rpc error: code = Unknown desc = etcdserver: re-configuration failed due to not enough started members"}

Solution:

Remove the leftover Kubernetes configuration with the following two commands:

rm -rf /etc/kubernetes/*
rm -rf /root/.kube/

Then run the reset again:

kubeadm reset

Linux C++ Error: invalid use of incomplete type [How to Solve]

Reason: the compiler has seen only a declaration of the struct or class, not its complete definition.

Analysis: this usually happens in the following situation. Suppose a class some is defined in some.h and implemented in some.cpp, and other.cpp needs to call a method of some. If other.h merely forward-declares class some and the code then calls its methods, the error above appears, because a forward declaration introduces only the name and says nothing about the members.

Solution: include some.h in other.cpp, so that the compiler finds the full definition of class some through the header and the error goes away.

Common causes:

1. The type was never defined in a header file

2. The header file that defines the type is not included
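
The situation can be reproduced in a single file. The following is a minimal sketch, not code from the original post: the member value() and the functions readThroughPointer and demoIncompleteType are hypothetical names chosen for illustration, and the class definition that would normally live in some.h is inlined so the example stays self-contained.

```cpp
#include <cassert>

// Forward declaration: the compiler learns the name `some`, not its members.
class some;

// Declaring a function that takes a pointer works with an incomplete type.
int readThroughPointer(const some* s);

// Defining the body at this point would fail, because a member call needs
// the complete type:
//   int broken(const some* s) { return s->value(); }
//   // error: invalid use of incomplete type 'const class some'

// In a real project this definition sits in some.h and other.cpp pulls it in
// with #include "some.h"; it is inlined here to keep the sketch in one file.
class some {
public:
    explicit some(int v) : value_(v) {}
    int value() const { return value_; }  // hypothetical member for illustration

private:
    int value_;
};

// The full definition is visible now, so member access compiles.
int readThroughPointer(const some* s) {
    return s->value();
}

// Usage: build an object and read it back through the pointer-based function.
int demoIncompleteType() {
    some s(42);
    assert(readThroughPointer(&s) == 42);
    return readThroughPointer(&s);
}
```

A forward declaration is still useful in headers that only store or pass pointers and references, since it avoids an extra include; the full header is needed only where members are accessed or the object size must be known.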

AIX 11G Rac Startup Error: CLSU-00100,CLSU-00101,CLSU-00103,CLSU-00104,CRS-4000

During a test, an Oracle 11g RAC database on an AIX minicomputer was cloned to a new server, and the original disk groups were remounted with the same disk numbers and order. Starting HAS then reported the following errors:
CLSU-00100: Operating System function: opendir failed with error data: 1
CLSU-00101: Operating System error message: No such file or directory
CLSU-00103: error location: scrsearch1
CLSU-00104: additional error information: cant open scr home dir scls_scr_getval
CRS-4000: Command Start failed, or completed with errors.

Taken literally, the messages point to an operating-system-level problem. Check the IP address, the hosts file, and the host name for anything wrong.

Check the host name recorded by the GI installation (the host name is recorded when Grid is installed):
cd /etc/oracle/scls_scr/
[oracle@edbjr2p2 scls_scr]$ ls
Rac1
Current host name:
$ hostname
Rac2

This confirms that the host name has changed. Other posts online say that HAS and Grid need to be reconfigured, but that procedure looked rather cumbersome. Since the host was only a clone, I treated it as a long shot and simply asked colleagues in the host team to change the host name and IP address. After confirming that the hosts file was correct, I restarted the system and started the cluster and the database again, and everything came up fine.

gRPC Error: failed to unmarshal the received message proto: can't skip unknown wire type 7

Recently, I ran into a problem when using a gRPC stream in Go.

After receiving data for a while, the receiver suddenly printed: grpc: failed to unmarshal the received message proto: can't skip unknown wire type 7

And it never recovered.

Checking the source code showed that this error generally means the incoming message itself is corrupted, so parsing fails.

Another possible cause is that the generated protobuf (pb) code in use is out of date, so the two sides no longer agree on the message layout and parsing fails.

So I regenerated and updated all the pb code in use, but the problem remained.

That left the message itself as the suspect.

Looking at the code, it turned out that after the sender handed the data over for sending, the same data was being modified by other goroutines!

[Solved] Could not find a package configuration file provided by "moveit_core"

Error resolution

Could not find a package configuration file provided by "moveit_core" with
  any of the following names:

    moveit_coreConfig.cmake
    moveit_core-config.cmake

  Add the installation prefix of "moveit_core" to CMAKE_PREFIX_PATH or set
  "moveit_core_DIR" to a directory containing one of the above files.  If
  "moveit_core" provides a separate development package or SDK, be sure it
  has been installed.

Postgres uuid_generate_v1() does not exist [How to Solve]

 

1、 Symptom:

schema=# select uuid_generate_v1();
ERROR:  function uuid_generate_v1() does not exist
LINE 1: select uuid_generate_v1();
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
Time: 14.543 ms

2、 Reason:

By default, Postgres does not provide the uuid_generate_v1 function. To use it, you need to install the uuid-ossp extension and enable it.

3、 Solution:

1. Install the package that provides the uuid-ossp extension

You can either build it from source or install it with yum. My environment is PostgreSQL 10 installed through yum, so installing the extension package with yum is the natural choice.

yum install -y postgresql10-contrib

2. Enable the uuid-ossp extension

postgres=# create extension "uuid-ossp" ;

CREATE EXTENSION

3. Verification

postgres-# \dx
                     List of installed extensions
   Name    | Version |   Schema   |                   Description
-----------+---------+------------+-------------------------------------------------
 plpgsql   | 1.0     | pg_catalog | PL/pgSQL procedural language
 uuid-ossp | 1.1     | public     | generate universally unique identifiers (UUIDs)

More details

postgres=# \dx+
   Objects in extension "plpgsql"
            Object description
---------------------------------------
 function plpgsql_call_handler()
 function plpgsql_inline_handler(internal)
 function plpgsql_validator(oid)
 language plpgsql
(4 rows)

  Objects in extension "uuid-ossp"
         Object description
----------------------------------
 function uuid_generate_v1()
 function uuid_generate_v1mc()
 function uuid_generate_v3(uuid,text)
 function uuid_generate_v4()
 function uuid_generate_v5(uuid,text)
 function uuid_nil()
 function uuid_ns_dns()
 function uuid_ns_oid()
 function uuid_ns_url()
 function uuid_ns_x500()
(10 rows)

Run the query again:

postgres=# select uuid_generate_v1();
           uuid_generate_v1           
--------------------------------------
 9ffd8cc8-db44-11eb-8952-0050568a41b8
(1 row)

LeetCode error: AddressSanitizer: DEADLYSIGNAL, detailed analysis and solution


For a broader summary, see: Solving LeetCode problems in C: a summary of common pitfalls and bugs

Problem description


The error is AddressSanitizer: DEADLYSIGNAL, with the following details:

==42==ERROR: AddressSanitizer: SEGV on unknown address xx.
The signal is caused by a READ memory access.

Problem analysis


In general, the error comes from one of the following problems:

1. Out-of-bounds access: an array index goes past the lower or upper bound
2. Infinite recursion: the code never terminates and returns normally
3. Function input and output parameters: the output parameters or the return value are handled incorrectly

The example code below is analyzed along these lines to find the bug.

Case analysis


The example comes from LeetCode problem 46. See the problem analysis blog post for details. The first version of the code, which triggers the error above, is pasted here:

Error source code

Main entry function:

/**
 * Return an array of arrays of size *returnSize.
 * The sizes of the arrays are returned as *returnColumnSizes array.
 * Note: Both returned array and *columnSizes array must be malloced, assume caller calls free().
 */
int g_trackNum; // Used for temporary stacking during recursive calls
int g_rowPos;

// Subfunction declaration
int isContanin(int *nums, int len, int val);
void backtrack(int *nums, int numsSize, int **returnColumnSizes, int *track);

// Main call function
int** permute(int* nums, int numsSize, int* returnSize, int** returnColumnSizes)
{
    // Calculate all possible totals
    int row = 1, i;
    for (i = numsSize; i > 0; i--) {
        row *= i;
    }
    *returnSize = row;

    printf("row = %d\n", row);

    // Request a two-dimensional array of the corresponding size and allocate space
    returnColumnSizes = (int **)malloc((row + 10) * sizeof(int*));
    if (returnColumnSizes == NULL) {
        return NULL;
    }
    int *p;
    for (i = 0; i < row; i++) {
        p = (int*)malloc((numsSize + 10) * sizeof(int));
        if (p == NULL) {
            return NULL;
        }
        returnColumnSizes[i] = p;
    }
    p = (int*)malloc(numsSize * sizeof(int));
    if (p == NULL) {
        return NULL;
    }
    int *track = p;

    // Backtrack to exhaust all possible permutations
    g_trackNum = 0;
    g_rowPos = 0;
    backtrack(nums, numsSize, returnColumnSizes, track); // Put the result from line 0

    // Returns returnSize and a two-dimensional pointer
    return returnColumnSizes;
}

Recursive code for backtracking implementation:

void backtrack(int *nums, int numsSize, int **returnColumnSizes, int *track)
{
    // Reach the leaf node track join returnColumSizes, the recorded path has been equal to the length of the array stop
    int i;
    if (g_trackNum == numsSize) {
        // printf("back: g_rowPos = %d\n", g_rowPos);
        for (i = 0; i < numsSize; i++) {
            // printf("back: g_rowPos = %d\n", g_rowPos);
            returnColumnSizes[g_rowPos][i] = track[i];
        }
        g_rowPos++;
        return;
    }

    // Recursive traversal
    for (i = 0; i < numsSize; i++) {
        // check if the current value is in the track
        if (isContanin(track, g_trackNum, nums[i])) {
            continue;
        }

        // If not, join track
        // printf("back: g_trackNum = %d\n", g_trackNum);
        track[g_trackNum++] = nums[i];
        
        // Continue traversing backwards
        backtrack(nums, numsSize, returnColumnSizes, track);
        // After the node returns, retrieve the value in the track
        g_trackNum--;
    }

    return;
}

Sub function to determine whether the current value has been traversed

int isContanin(int *nums, int len, int val)
{
    int flag = 0;
    int i;
    for (i = 0; i < len; i++) {
        if (nums[i] == val) {
            flag = 1;
            break;
        }
    }
    return flag;
}

Source code analysis

Checking possible problem 1: out-of-bounds array access

There are two main approaches:
1. Print the index values just before each access that might go out of bounds, to see whether they overflow. AddressSanitizer aborts the program at run time, so printf output emitted before the abort shows the state right before the crash.
2. Allocate more space than needed, to see whether the error is caused by insufficient space.

Method 1

Before each access that might go out of bounds, print the index, so that the last values printed before the crash are recorded. Example print statements:
printf("row = %d\n", row);
printf("back: g_rowPos = %d\n", g_rowPos);
printf("back: g_trackNum = %d\n", g_trackNum);

Method 2

When allocating the arrays, deliberately over-allocate so that the space is certainly sufficient. If the error disappears after the allocation is enlarged, an array access was out of bounds.

After running the code, the printed indices were all normal and nothing suspicious turned up, so I moved on to possible problem 2.

Checking possible problem 2: infinite recursion, where the code never terminates and returns normally

Print a message inside the termination condition of the recursive function backtrack and observe whether the recursion proceeds as expected. If nothing is printed, the termination condition is never reached and the recursion is infinite.

Running the code showed that this was not the problem either.

Checking possible problem 3: incorrect handling of the function's output parameters and return value

After carefully reading the comment block at the top of the code template and comparing it with C solutions found online, I realized that I had misunderstood the output parameters.

The double pointer returnColumnSizes is an output parameter that stores the number of columns in each returned row. Although the number of columns is the same for every row in this problem, each entry still has to be filled in. I had initially taken it to be the pointer through which the two-dimensional result array is returned. In fact, the two-dimensional array goes back through the function's return value: returning the base address of the allocated array is all that is needed.
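
The contract described above can be illustrated in isolation. The following is a minimal sketch in C-style C++ so that it mirrors the C listing; makeRows, freeRows, and demoContract are hypothetical helpers that are not part of the original solution, and the caller shown here only assumes that the judge reads the results in this way.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper following the LeetCode-style contract:
//   - the 2D result is the RETURN VALUE,
//   - *returnSize receives the number of rows,
//   - *returnColumnSizes receives a new array holding one column count per row.
int** makeRows(int rows, int cols, int* returnSize, int** returnColumnSizes) {
    *returnSize = 0;
    *returnColumnSizes = nullptr;
    if (rows <= 0 || cols <= 0) {
        return nullptr;
    }

    int* sizes = static_cast<int*>(std::malloc(rows * sizeof(int)));
    int** res = static_cast<int**>(std::calloc(rows, sizeof(int*)));
    if (sizes == nullptr || res == nullptr) {
        std::free(sizes);
        std::free(res);
        return nullptr;
    }

    for (int r = 0; r < rows; r++) {
        res[r] = static_cast<int*>(std::calloc(cols, sizeof(int)));
        if (res[r] == nullptr) {
            // Roll back everything allocated so far.
            for (int k = 0; k < r; k++) {
                std::free(res[k]);
            }
            std::free(res);
            std::free(sizes);
            return nullptr;
        }
        sizes[r] = cols;
    }

    *returnSize = rows;
    *returnColumnSizes = sizes;  // write THROUGH the double pointer
    return res;                  // the 2D array goes back via the return value
}

// Releases what makeRows allocated, as a caller would.
void freeRows(int** res, int returnSize, int* returnColumnSizes) {
    for (int r = 0; r < returnSize; r++) {
        std::free(res[r]);
    }
    std::free(res);
    std::free(returnColumnSizes);
}

// Usage: the caller owns both outputs and reads them back after the call.
void demoContract() {
    int returnSize = 0;
    int* columnSizes = nullptr;
    int** rows = makeRows(6, 3, &returnSize, &columnSizes);
    assert(rows != nullptr);
    assert(returnSize == 6);
    for (int r = 0; r < returnSize; r++) {
        assert(columnSizes[r] == 3);
    }
    freeRows(rows, returnSize, columnSizes);
}
```

Overwriting the local copy of returnColumnSizes, as the first version did, leaves the caller's pointer untouched, so any later read of the column sizes dereferences memory that was never set up.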

After fixing this, the code produces the correct output and no error is reported.

Working code after the fix

// Determine if an element has been traversed
int isContain(int *nums, int len, int val)
{
    int flag = 0;
    int i;
    for (i = 0; i < len; i++) {
        if (nums[i] == val) {
            flag = 1;
            break;
        }
    }
    return flag;
}

// Note: only declare these globals here and initialize them right before calling backtrack,
// in case the LeetCode judge does not reset them between test cases, which would give wrong results
int g_trackNum; // Number of elements currently on the track stack during recursion
int g_rowPos; // Index of the next result row to fill

void backtrack(int *nums, int numsSize, int **returnColumnSizes, int *track)
{
    // Reach the leaf node track join returnColumSizes, the recorded path has been equal to the length of the array stop
    int i;
    if (g_trackNum == numsSize) {
        for (i = 0; i < numsSize; i++) {
            returnColumnSizes[g_rowPos][i] = track[i];
        }
        g_rowPos++;
        return;
    }

    // Recursive traversal
    for (i = 0; i < numsSize; i++) {
        // check if the current value is in the track
        if (isContain(track, g_trackNum, nums[i])) {
            continue;
        }

        // If not, add track
        track[g_trackNum++] = nums[i];
        // continue traversing backwards
        backtrack(nums, numsSize, returnColumnSizes, track);
        // After the node returns, retrieve the value in track
        g_trackNum--;
    }

    return;
}

int** permute(int* nums, int numsSize, int* returnSize, int** returnColumnSizes)
{
    // Calculate the total number of all possible n!
    int row = 1, i;
    for (i = numsSize; i > 0; i--) {
        row *= i;
    }
    *returnSize = row;

    // Allocate the per-row column counts; every permutation has numsSize columns
    *returnColumnSizes = (int *)malloc(sizeof(int) * (*returnSize));
    if (*returnColumnSizes == NULL) { // check the allocation itself, not the caller's pointer
        return NULL;
    }
    for (i = 0; i < row; i++) {
        (*returnColumnSizes)[i] = numsSize;
    }

    // Request a two-dimensional array of the corresponding size and allocate space
    int **res = (int **)malloc((row + 10) * sizeof(int*));
    if (res == NULL) {
        return NULL;
    }
    int *p;
    for (i = 0; i < row; i++) {
        p = (int*)malloc((numsSize + 10) * sizeof(int));
        if (p == NULL) {
            return NULL;
        }
        res[i] = p;
    }
    p = (int*)malloc(numsSize * sizeof(int));
    if (p == NULL) {
        return NULL;
    }
    int *track = p;

    // Backtrack to exhaust all possible permutations
    g_trackNum = 0;
    g_rowPos = 0;
    backtrack(nums, numsSize, res, track); // fill the results starting from row 0
    free(track); // the scratch buffer is no longer needed

    return res;
}

[Solved] Kafka2.3.0 Error: Timeout of 60000ms expired before the position for partition could be determined

A Flink job consuming from Kafka 2.3.0 reported an error, and partition assignment failed:

Kafka Client Timeout of 60000ms expired before the position
 for partition could be determined

I searched online for a while but did not find the cause. It later turned out to be Kafka's configuration file, server.properties, in which the host name was used for the broker address. The fix is to add the following line to server.properties:

host.name=192.168.0.30

Use the IP address of the current server here. Each Kafka node should be configured with its own IP address.

[Solved] Adb Shell Monkey Warning: can't create log.txt, Read-only file system

Problem description

When using monkey to test an Android app, the shell reported an error. First, I entered the device shell with:

adb shell

Then I ran the monkey test with the following command:

monkey -p com.baidu.BaiduMap -v -v -v --ignore-crashes --throttle 200 --pct-touch 50 5 1>D:\\info.txt 2>D:\error.txt

The shell reported:

/system/bin/sh: can't create D:\info.txt: Read-only file system

Analysis and correction

Reason: the file cannot be written because the redirection is interpreted by the device shell, which can only read and write the device's own file system, not the PC's. Therefore, monkey should not be run from inside the adb shell. Exit the shell first:

exit

Then run the command from the PC's command line, so that the redirection is handled on the PC:

adb shell monkey -p com.baidu.BaiduMap -v -v -v --ignore-crashes --throttle 200 --pct-touch 50 5 1>D:\\info.txt 2>D:\error.txt

[Solved] FileUploadException: the request was rejected because no multipart boundary was found

Summary

Recently, while working on video detection, uploading a video with Postman caused the service to throw this error:

ERROR 13557 [] [http-nio-5000-exec-8] org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet] Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.web.multipart.MultipartException: Failed to parse multipart servlet request; nested exception is java.io.IOException: org.apache.tomcat.util.http.fileupload.FileUploadException: the request was rejected because no multipart boundary was found] with root cause

org.apache.tomcat.util.http.fileupload.FileUploadException: the request was rejected because no multipart boundary was found
        at org.apache.tomcat.util.http.fileupload.impl.FileItemIteratorImpl.init(FileItemIteratorImpl.java:178)

Background

File upload controller of business application layer:

@Resource
private FileService fileService;

@PostMapping(value = "video")
@ApiOperation(value = "Submit video recognition", httpMethod = "POST")
public Result video(@RequestParam("file") MultipartFile file, @LoginUser SysUser user, HttpServletRequest request) {
	Result<FileInfo> fileInfoResult = fileService.fileUpload(ArmConstant.BUCKET, file);
}

The fileService in the code above is an interface defined with FeignClient:

@FeignClient(name = ServiceNameConstants.FILE_SERVICE, fallbackFactory = FileServiceFallbackFactory.class, decode404 = true)
public interface FileService {
	@PostMapping(value = "/upload/{bucket}", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
	Result<FileInfo> fileUpload(@PathVariable("bucket") String bucket, @RequestParam("file") MultipartFile file);
}

FileService calls a backing service, the file center service, whose controller is defined as follows:

@RestController
@Slf4j
public class FileController {
	@PostMapping("/upload/{bucket}")
	public Result<FileInfo> fileUpload(@PathVariable("bucket") String bucket, @RequestParam("file") MultipartFile file) {
	}
}

Analysis

Searches on Baidu and Google all turned up advice on how to configure Postman, which for a while made me doubt Postman itself, or my use of it. When searching online you have to apply your own critical thinking, because many articles are blindly copied from one another. Besides, Postman is a mature API testing tool, and a bug in it is unlikely to be left for you or me to discover.

I am also familiar with Postman, and in fact there was nothing wrong with the Postman configuration.

I shelved the problem for a night.

Out of other ideas, I did a global search for MultipartFile and suddenly noticed something: what is the difference between @RequestPart and @RequestParam?

Replacing @RequestParam with @RequestPart in all three places solved the problem.

I have to say that Feign has a lot of pitfalls. Even so, I could not find anything about FileUploadException: the request was rejected because no multipart boundary was found in the Feign GitHub repository.

For another Feign pitfall, see my other blog post on FallbackFactory.

Extension

What is the difference between @RequestPart and @RequestParam? I will write a separate article on the difference between @RequestParam and @RequestPart and on Feign's annotations.