Tag Archives: Cloud Native

Zookeeper Failed to Start Error: start failed [How to Solve]

===== ZooKeeper prints no error on the console here, so look at the logs: there is a logs directory one level above the bin directory =====

Main error log information:

2022-07-28 15:31:50,793 [myid:] - ERROR [main:QuorumPeerMain@98] - Invalid config, exiting abnormally

======== Solution ========

cd apache-zookeeper-3.6.2-bin/conf/

Open the zoo.cfg configuration file with vim, check the dataDir directory you defined, and check the server ID assigned to each zkServer.

cd into the dataDir directory,

echo "serverid" > myid

Go back to the bin directory and start again: sh zkServer.sh start
Check the ZooKeeper status: sh zkServer.sh status (if the process starts successfully but the status still looks wrong, wait until all the machines in the cluster have started).
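Putting the steps together, a rough sketch of the whole flow (the dataDir value and the server id 1 below are only examples; use the values from your own zoo.cfg):

cd apache-zookeeper-3.6.2-bin/conf/
grep -E 'dataDir|^server\.' zoo.cfg      # confirm dataDir and this host's server.N entry
cd /data/zookeeper                       # example dataDir taken from zoo.cfg
echo 1 > myid                            # the id must match this host's server.N entry
cd /path/to/apache-zookeeper-3.6.2-bin/bin/
sh zkServer.sh start
sh zkServer.sh status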

========================End=====================

[Solved] ZooKeeper Configurate Error: Error contacting service. It is probably not running.

After downloading, extracting, and configuring ZooKeeper, running zkServer.sh start reported the following error.

The cause of the error was that the JDK environment variables had not been set on nodes 2 and 3, which produces "Error contacting service. It is probably not running."

I had only configured the JDK environment variables on node 1, not on nodes 2 and 3 (three machines are configured here).

Run vim /etc/profile to set the environment variables.
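The lines added to /etc/profile look roughly like this (the JDK installation path below is an assumption; use your own directory):

export JAVA_HOME=/usr/local/jdk1.8.0_212
export PATH=$JAVA_HOME/bin:$PATH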

After setting the environment variables, run source /etc/profile to make them take effect.

Once the JDK environment variables are set on all three machines, start ZooKeeper again and check the status with zkServer.sh status: all three nodes now start successfully.

Successfully resolved.

[Solved] Kubernetes Error: failed to list *core.Secret: unable to transform key

While installing a Kubernetes local cluster, I happened to encounter the following problem:

E0514 07:30:58.627632 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input; reinitializing...
W0514 07:30:59.631509 1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input
E0514 07:30:59.631563 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input; reinitializing...
W0514 07:31:00.633540 1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input
E0514 07:31:00.633575 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input; reinitializing...

 

Reason:

After bringing up the cluster master, we need to create the TLS Bootstrap Secret so that node certificates can be signed automatically:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Secret
metadata:
  name: bootstrap-token-${TOKEN_ID}
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  token-id: "${TOKEN_ID}"
  token-secret: "${TOKEN_SECRET}"
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"
  auth-extra-groups: system:bootstrappers:default-node-token
EOF

secret "bootstrap-token-65a3a9" created

where BOOTSTRAP_TOKEN=${TOKEN_ID}.${TOKEN_SECRET} can be found in bootstrap-kubelet.conf.

One cause of the problem in the title is that this command was run more than once, leaving multiple secrets behind; for example, a node was found to be misbehaving and a new bootstrap-kubelet.conf was regenerated for it.

In addition, when installing a Kubernetes cluster by hand, the guides found online tend to be out of date, so I compared my manifests against the ones generated by a kubeadm installation, and in doing so I inadvertently added the following lines:

spec:
  hostNetwork: true
  priorityClassName: system-cluster-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault

With spec.securityContext.seccompProfile.type=RuntimeDefault set, the cluster automatically generates a self-signed secret at runtime, which conflicts with the manually created one and produces the error in the title.

 

Solution:

1) First clear the cluster state: delete all files under /var/lib/etcd/ and /var/lib/kubelet/, keeping only the config.xml file in the latter.
2) Remove the spec.securityContext.seccompProfile.type=RuntimeDefault setting from kube-apiserver.yml, kube-controller-manager.yml and kube-scheduler.yml under /etc/kubernetes/manifests.
3) Restart the kubelet: systemctl start kubelet, and you are done.
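A rough shell sketch of steps 1) and 3) above (the preserved config.xml follows the description above, and stopping the kubelet via systemd first is an assumption; adapt the paths to your own layout):

systemctl stop kubelet                        # assumption: kubelet runs under systemd
cp /var/lib/kubelet/config.xml /tmp/          # keep the kubelet config file
rm -rf /var/lib/etcd/* /var/lib/kubelet/*
cp /tmp/config.xml /var/lib/kubelet/
# step 2): edit /etc/kubernetes/manifests/kube-apiserver.yml,
# kube-controller-manager.yml and kube-scheduler.yml and remove the
# seccompProfile block before starting again
systemctl start kubelet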

[Solved] Gerrit Error: Permission denied publickey

Gerrit reports an error: permission denied (publickey). Solution below.


Foreword

When cloning code from Gerrit, you may run into an error along the lines of: permission denied (publickey).

For security reasons, OpenSSH has disabled the ssh-rsa (SHA-1-based RSA) signature algorithm by default since version 8.8; OpenSSH considers these signatures too cheap to attack, so they are turned off.
You can use the command:
ssh -v [git server]
to check the OpenSSH version involved. If it is ≥ 8.8, the method below applies.
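For example, the verbose output prints both the local and the remote SSH software versions (the hostname here is hypothetical):

ssh -v -o BatchMode=yes gerrit.example.com 2>&1 | grep -i 'version'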

Solution

Go into the ~/.ssh directory on your machine and create a new file named config (no suffix).

The content is:

Host <Gerrit IP or domain name>
    HostName <Gerrit IP or domain name>
    User <Gerrit username, e.g. zhangsan>
    PubkeyAcceptedKeyTypes +ssh-rsa
    IdentityFile ~/.ssh/id_rsa
    Port 29418
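For instance, a filled-in config might look like this (the host and username are hypothetical):

Host gerrit.example.com
    HostName gerrit.example.com
    User zhangsan
    PubkeyAcceptedKeyTypes +ssh-rsa
    IdentityFile ~/.ssh/id_rsa
    Port 29418

You can then test the connection with ssh gerrit.example.com, which should print Gerrit's welcome banner if the key is accepted.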

Once this is configured (29418 is Gerrit's default SSH port), cloning works normally.

[Solved] pod Error: back off restarting failed container

pod Error: back off restarting failed container

 

Solution:

1. Find the corresponding Deployment.
2. Add command: [ "/bin/bash", "-ce", "tail -f /dev/null" ] to the container spec,
as follows:

kind: Deployment
apiVersion: apps/v1beta2
metadata:
  labels:
    app: jenkins-master
  name: jenkins-master-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins-master
  template:
    metadata:
      labels:
        app: jenkins-master
    spec:
      containers:
      - name: jenkins-master
        image: drud/jenkins-master:v0.29.0
        imagePullPolicy: IfNotPresent
        command: [ "/bin/bash", "-ce", "tail -f /dev/null" ]
        volumeMounts:
        - mountPath: /var/jenkins_home/
          name: masterjkshome
        ports:
        - containerPort: 8080
      volumes:
      - name: masterjkshome
        persistentVolumeClaim:
          claimName: pvcjkshome
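Once the Deployment is edited, you can reapply it and exec into the now-idle container to debug (the manifest filename below is an assumption):

kubectl apply -f jenkins-master-deploy.yaml
kubectl get pods -l app=jenkins-master
kubectl exec -it <pod-name> -- /bin/bash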

[Solved] Error contacting service. It is probably not running.

First, check whether ZooKeeper has been started on more than half of the servers. If so, run the jps command; here it shows that the QuorumPeerMain main class has not started:

[atguigu@Hadoop103 zookeeper-3.5.7]$ jps
14850 Jps

The most likely cause: in zoo.cfg under the conf folder of the unpacked ZooKeeper directory (I had renamed the file here), trailing spaces were added after the newly added configuration lines, or the myid file was created with blank lines above or below the id, or spaces around it. Open the files, delete the stray whitespace, and then check jps again.

#######################cluster########################## 
server.2=hadoop102:2888:3888 
server.3=hadoop103:2888:3888 
server.4=hadoop104:2888:3888
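One quick way to find the stray whitespace is to print the files with line endings made visible (a sketch; cat -A is the GNU coreutils option, and the dataDir path is an example):

cat -A conf/zoo.cfg               # trailing spaces show up just before the $ at each line end
cat -A /opt/zookeeper/data/myid   # the file should contain only the server id, nothing else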


[Solved] Gunicorn timeout error: worker timeout

Gunicorn timeout error: worker timeout

I. Problem Description:

One morning a developer suddenly reported a failure: a container kept restarting for no apparent reason. Checking the business container's log turned up a "worker timeout" message.

II. Analysis of the cause:

The error message shows that a gunicorn worker process timed out, so it was killed and restarted. According to the official documentation, gunicorn's default timeout is 30 seconds; if that is exceeded, the worker process is killed and restarted.

III. Solution:

Add --timeout 600 to gunicorn's startup command to raise the worker timeout to 600 seconds; --graceful-timeout 600 likewise sets the graceful timeout to 600 seconds.
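For example, assuming the service is launched as app:app and bound to port 8000 (both assumptions), the start command would look something like:

gunicorn --workers 4 --bind 0.0.0.0:8000 --timeout 600 --graceful-timeout 600 app:app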

After the change was verified with kustomize and redeployed, the problem has not occurred since.

ANSYS 2020 R2 mesh script error [How to Solve]

Problem situation

While running a calculation in Workbench 2020 R2, the following error screen suddenly appears:

Solution:

The fix is as follows:
Manually register the DLLs listed below by starting a command prompt with "Run as administrator", then typing:
regsvr32.exe ole32.dll
regsvr32.exe atl.dll
regsvr32.exe oleaut32.dll
regsvr32.exe scrrun.dll
regsvr32.exe jscript.dll
regsvr32.exe vbscript.dll

Make sure each one reports success.
Now open a regular CMD prompt and type the following command:
move "%appdata%\ANSYS" "%appdata%\ANSYS.old"
When applying this fix, either an ordinary CMD or an administrator-mode CMD can be used; you can open it directly from Win + R. The commands above can be entered one at a time.

Eureka server startup error: cannot execute request on any known server


Error screenshot (for reference):

Self inspection

A newly created Eureka project reported an error as soon as it started. After checking the startup class and the YML configuration file, the cause turned out to be a formatting mistake in the YML configuration file.

Modification

The screenshot shows an obvious indentation error, which prevents Eureka from starting normally. After correcting the indentation, the configuration works.

Screenshot of the Eureka server starting successfully:

Since configuration files in YML format are prone to indentation errors, we recommend using a properties-format configuration file instead.
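As a rough sketch, a minimal application.properties for a standalone Eureka server might look like this (the port and hostname are assumptions, since the values from the original screenshot are not shown):

server.port=8761
eureka.instance.hostname=localhost
eureka.client.register-with-eureka=false
eureka.client.fetch-registry=false
eureka.client.service-url.defaultZone=http://localhost:8761/eureka/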

Error on application startup when using OpenFeign for remote calls

Add the OpenFeign dependency directly to the POM:

        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-openfeign</artifactId>
        </dependency>

On startup, the program reports:

No Feign Client for loadBalancing defined. Did you forget to include spring-cloud-starter-loadbalancer?

Because the Spring Boot and Spring Cloud versions are newer, the loadbalancer dependency has to be added explicitly.

The problem is the missing loadbalancer, but note that the Ribbon dependency pulled in by Nacos discovery will prevent the loadbalancer package from taking effect, so it must be excluded. Add the following to the common POM:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-ribbon</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-loadbalancer</artifactId>
    <version>3.0.3</version>
</dependency>