Tag Archives: Cloud Native

Zookeeper Failed to Start Error: start failed [How to Solve]

===== ZooKeeper prints no error on the console here, so look at the logs: there is a logs directory one level above the bin directory =====

Main error log information:

2022-07-28 15:31:50,793 [myid:] - ERROR [main:QuorumPeerMain@98] - Invalid config, exiting abnormally

======== Solution ========

cd apache-zookeeper-3.6.2-bin/conf/

Open the zoo.cfg configuration file with vim, check the dataDir directory you defined, and check the server ID assigned to each zkServer.

cd into the dataDir directory,

echo "serverid" > myid

Go back to the bin directory and start again: sh zkServer.sh start
Check the ZooKeeper status: sh zkServer.sh status (if the process starts successfully but the status still looks wrong, wait until all the machines in the cluster have started).
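Putting the steps together, a rough sketch of the whole flow (the dataDir value and the server id 1 below are only examples; use the values from your own zoo.cfg):

cd apache-zookeeper-3.6.2-bin/conf/
grep -E 'dataDir|^server\.' zoo.cfg      # confirm dataDir and this host's server.N entry
cd /data/zookeeper                       # example dataDir taken from zoo.cfg
echo 1 > myid                            # the id must match this host's server.N entry
cd /path/to/apache-zookeeper-3.6.2-bin/bin/
sh zkServer.sh start
sh zkServer.sh status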

========================End=====================

[Solved] ZooKeeper Configurate Error: Error contacting service. It is probably not running.

After downloading, extracting, and configuring ZooKeeper, running zkServer.sh start reported the following error.

The cause of the error was that the JDK environment variables had not been set on nodes 2 and 3, which produces "Error contacting service. It is probably not running."

I had only configured the JDK environment variables on node 1, not on nodes 2 and 3 (three machines are configured here).

Run vim /etc/profile to set the environment variables.
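The lines added to /etc/profile look roughly like this (the JDK installation path below is an assumption; use your own directory):

export JAVA_HOME=/usr/local/jdk1.8.0_212
export PATH=$JAVA_HOME/bin:$PATH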

After setting the environment variables, run source /etc/profile to make them take effect.

Once the JDK environment variables are set on all three machines, start ZooKeeper again and check the status with zkServer.sh status: all three nodes now start successfully.

Successfully resolved.

[Solved] Kubernetes Error: failed to list *core.Secret: unable to transform key

While installing a Kubernetes local cluster, I happened to encounter the following problem:

E0514 07:30:58.627632 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input; reinitializing...
W0514 07:30:59.631509 1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input
E0514 07:30:59.631563 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input; reinitializing...
W0514 07:31:00.633540 1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input
E0514 07:31:00.633575 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key "/registry/secrets/default/default-token-nk77g": invalid padding on input; reinitializing...

 

Reason:

After bringing up the cluster master, we need to create the TLS Bootstrap Secret so that node certificates can be signed automatically:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Secret
metadata:
  name: bootstrap-token-${TOKEN_ID}
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  token-id: "${TOKEN_ID}"
  token-secret: "${TOKEN_SECRET}"
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"
  auth-extra-groups: system:bootstrappers:default-node-token
EOF

secret "bootstrap-token-65a3a9" created

where BOOTSTRAP_TOKEN=${TOKEN_ID}.${TOKEN_SECRET} can be found in bootstrap-kubelet.conf.

One cause of the problem in the title is that this command was run more than once, leaving multiple secrets behind; for example, a node was found to be misbehaving and a new bootstrap-kubelet.conf was regenerated for it.

In addition, when installing a Kubernetes cluster by hand, the guides found online tend to be out of date, so I compared my manifests against the ones generated by a kubeadm installation, and in doing so I inadvertently added the following lines:

spec:
  hostNetwork: true
  priorityClassName: system-cluster-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault

With spec.securityContext.seccompProfile.type=RuntimeDefault set, the cluster automatically generates a self-signed secret at runtime, which conflicts with the manually created one and produces the error in the title.

 

Solution:

1) First clear the cluster state: delete all files under /var/lib/etcd/ and /var/lib/kubelet/, keeping only the config.xml file in the latter.
2) Remove the spec.securityContext.seccompProfile.type=RuntimeDefault setting from kube-apiserver.yml, kube-controller-manager.yml and kube-scheduler.yml under /etc/kubernetes/manifests.
3) Restart the kubelet: systemctl start kubelet, and you are done.
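A rough shell sketch of steps 1) and 3) above (the preserved config.xml follows the description above, and stopping the kubelet via systemd first is an assumption; adapt the paths to your own layout):

systemctl stop kubelet                        # assumption: kubelet runs under systemd
cp /var/lib/kubelet/config.xml /tmp/          # keep the kubelet config file
rm -rf /var/lib/etcd/* /var/lib/kubelet/*
cp /tmp/config.xml /var/lib/kubelet/
# step 2): edit /etc/kubernetes/manifests/kube-apiserver.yml,
# kube-controller-manager.yml and kube-scheduler.yml and remove the
# seccompProfile block before starting again
systemctl start kubelet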

[Solved] Gerrit Error: Permission denied publickey

Gerrit reports an error: permission denied (publickey). Solution below.


Foreword

When cloning code from Gerrit, you may run into an error along the lines of: permission denied (publickey).

For security reasons, OpenSSH has disabled the ssh-rsa (SHA-1-based RSA) signature algorithm by default since version 8.8; OpenSSH considers these signatures too cheap to attack, so they are turned off.
You can use the command:
ssh -v [git server]
to check the OpenSSH version involved. If it is ≥ 8.8, the method below applies.
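For example, the verbose output prints both the local and the remote SSH software versions (the hostname here is hypothetical):

ssh -v -o BatchMode=yes gerrit.example.com 2>&1 | grep -i 'version'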

Solution

Go into the ~/.ssh directory on your machine and create a new file named config (no suffix).

The content is:

Host <Gerrit IP or domain name>
    HostName <Gerrit IP or domain name>
    User <Gerrit username, e.g. zhangsan>
    PubkeyAcceptedKeyTypes +ssh-rsa
    IdentityFile ~/.ssh/id_rsa
    Port 29418
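For instance, a filled-in config might look like this (the host and username are hypothetical):

Host gerrit.example.com
    HostName gerrit.example.com
    User zhangsan
    PubkeyAcceptedKeyTypes +ssh-rsa
    IdentityFile ~/.ssh/id_rsa
    Port 29418

You can then test the connection with ssh gerrit.example.com, which should print Gerrit's welcome banner if the key is accepted.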

Once this is configured (29418 is Gerrit's default SSH port), cloning works normally.

[Solved] pod Error: back off restarting failed container

pod Error: back off restarting failed container

 

Solution:

1. Find the corresponding Deployment.
2. Add command: [ "/bin/bash", "-ce", "tail -f /dev/null" ] to the container spec,
as follows:

kind: Deployment
apiVersion: apps/v1beta2
metadata:
  labels:
    app: jenkins-master
  name: jenkins-master-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins-master
  template:
    metadata:
      labels:
        app: jenkins-master
    spec:
      containers:
      - name: jenkins-master
        image: drud/jenkins-master:v0.29.0
        imagePullPolicy: IfNotPresent
        command: [ "/bin/bash", "-ce", "tail -f /dev/null" ]
        volumeMounts:
        - mountPath: /var/jenkins_home/
          name: masterjkshome
        ports:
        - containerPort: 8080
      volumes:
      - name: masterjkshome
        persistentVolumeClaim:
          claimName: pvcjkshome
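Once the Deployment is edited, you can reapply it and exec into the now-idle container to debug (the manifest filename below is an assumption):

kubectl apply -f jenkins-master-deploy.yaml
kubectl get pods -l app=jenkins-master
kubectl exec -it <pod-name> -- /bin/bash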

[Solved] Error contacting service. It is probably not running.

First, check whether ZooKeeper has been started on more than half of the servers. If so, run the jps command; here it shows that the QuorumPeerMain main class has not started:

[atguigu@Hadoop103 zookeeper-3.5.7]$ jps
14850 Jps

The most likely cause: in zoo.cfg under the conf folder of the unpacked ZooKeeper directory (I had renamed the file here), trailing spaces were added after the newly added configuration lines, or the myid file was created with blank lines above or below the id, or spaces around it. Open the files, delete the stray whitespace, and then check jps again.

#######################cluster########################## 
server.2=hadoop102:2888:3888 
server.3=hadoop103:2888:3888 
server.4=hadoop104:2888:3888
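One quick way to find the stray whitespace is to print the files with line endings made visible (a sketch; cat -A is the GNU coreutils option, and the dataDir path is an example):

cat -A conf/zoo.cfg               # trailing spaces show up just before the $ at each line end
cat -A /opt/zookeeper/data/myid   # the file should contain only the server id, nothing else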


[Solved] Gunicorn timeout error: worker timeout

Gunicorn timeout error: worker timeout

I. Problem Description:

One morning a developer suddenly reported a failure: a container kept restarting for no apparent reason. Checking the business container's log turned up a "worker timeout" message.

II. Analysis of the cause:

The error message shows that a gunicorn worker process timed out, so it was killed and restarted. According to the official documentation, gunicorn's default timeout is 30 seconds; if that is exceeded, the worker process is killed and restarted.

III. Solution:

Add --timeout 600 to gunicorn's startup command to raise the worker timeout to 600 seconds; --graceful-timeout 600 likewise sets the graceful timeout to 600 seconds.
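For example, assuming the service is launched as app:app and bound to port 8000 (both assumptions), the start command would look something like:

gunicorn --workers 4 --bind 0.0.0.0:8000 --timeout 600 --graceful-timeout 600 app:app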

After the change was verified with kustomize and redeployed, the problem has not occurred since.

ANSYS 2020 R2 mesh script error [How to Solve]

Problem situation

While running a calculation in Workbench 2020 R2, the following error screen suddenly appears:

Solution:

The fix is as follows:
Manually register the DLLs listed below by starting a command prompt with "Run as administrator", then typing:
regsvr32.exe ole32.dll
regsvr32.exe atl.dll
regsvr32.exe oleaut32.dll
regsvr32.exe scrrun.dll
regsvr32.exe jscript.dll
regsvr32.exe vbscript.dll

Make sure each one reports success.
Now open a regular CMD prompt and type the following command:
move "%appdata%\ANSYS" "%appdata%\ANSYS.old"
When applying this fix, either an ordinary CMD or an administrator-mode CMD can be used; you can open it directly from Win + R. The commands above can be entered one at a time.

Eureka server startup error: cannot execute request on any known server


Error screenshot (for reference):

Self inspection

A newly created Eureka project reported an error as soon as it started. After checking the startup class and the YML configuration file, the cause turned out to be a formatting mistake in the YML configuration file.

Modification

The screenshot shows an obvious indentation error, which prevents Eureka from starting normally. After correcting the indentation, the configuration works.

Screenshot of the Eureka server starting successfully:

Since configuration files in YML format are prone to indentation errors, we recommend using a properties-format configuration file instead.
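As a rough sketch, a minimal application.properties for a standalone Eureka server might look like this (the port and hostname are assumptions, since the values from the original screenshot are not shown):

server.port=8761
eureka.instance.hostname=localhost
eureka.client.register-with-eureka=false
eureka.client.fetch-registry=false
eureka.client.service-url.defaultZone=http://localhost:8761/eureka/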

Error on application startup when using OpenFeign for remote calls

Add the OpenFeign dependency directly to the POM:

        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-openfeign</artifactId>
        </dependency>

On startup, the program reports:

No Feign Client for loadBalancing defined. Did you forget to include spring-cloud-starter-loadbalancer?

Because the Spring Boot and Spring Cloud versions are newer, the loadbalancer dependency has to be added explicitly.

The problem is the missing loadbalancer, but note that the Ribbon dependency pulled in by Nacos discovery will prevent the loadbalancer package from taking effect, so it must be excluded. Add the following to the common POM:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-ribbon</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-loadbalancer</artifactId>
    <version>3.0.3</version>
</dependency>