Tag Archives: docker

Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to

Calico was installed using the tigera-operator method and reported errors after startup; all Calico-related pods show CrashLoopBackOff.

Running kubectl -n calico-system describe pod calico-node-2t8w6 shows the following error:

Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: no such file or directory.

Cause of the problem:

This issue appeared during a Kubernetes cluster deployment. By default Calico autodetects node IP addresses with the first-found method, which picked the wrong address here, so the detection method has to be specified manually.
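Before redeploying, it can help to check which IPv4 addresses exist on each node, since first-found simply takes the first valid interface it finds; the ens.* regex used later must match your actual NIC names:

# On every node, list interfaces and their IPv4 addresses; "first-found"
# may pick a bridge or secondary NIC instead of the node's real address
ip -4 -o addr show | awk '{print $2, $4}'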

1. Remove all the Calico resources

kubectl -n tigera-operator get deployments.apps -o yaml > a.yaml
kubectl -n calico-system get daemonsets.apps calico-node -o yaml > b.yaml
kubectl -n calico-system get deployments.apps calico-kube-controllers -o yaml > c.yaml
kubectl -n calico-system get deployments.apps calico-typha -o yaml > d.yaml
kubectl -n calico-apiserver get deployments.apps calico-apiserver -o yaml > e.yaml
kubectl delete -f a.yaml
kubectl delete -f b.yaml
kubectl delete -f c.yaml
kubectl delete -f d.yaml
kubectl delete -f e.yaml
2. Remove the tigera-operator and custom-resources manifests
kubectl delete -f tigera-operator.yaml
kubectl delete -f custom-resources.yaml

3. Remove vxlan.calico
ip link delete vxlan.calico

4. Modify the custom-resources.yaml file and add nodeAddressAutodetectionV4:
# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/v3.23/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    #bgp: Enabled
    #hostPorts: Enabled
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
    #linuxDataplane: Iptables
    #multiInterfaceMode: None
    nodeAddressAutodetectionV4:
      interface: ens.*

---

# This section configures the Calico API server.
# For more information, see: https://projectcalico.docs.tigera.io/v3.23/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
5. Re-create
kubectl create -f tigera-operator.yaml
kubectl create -f custom-resources.yaml
Check that the autodetection method took effect:
kubectl -n calico-system get daemonsets.apps calico-node  -o yaml|grep -A2 IP_AUTODETECTION_METHOD

[Solved] docker Startup Error: Job for docker.service failed because the control process exited with error code

I. Error

When restarting the docker service with sudo systemctl restart docker, it fails with: Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
Check the status of the service: systemctl status docker.service

II. Problem-solving
1. Enter the docker config directory: cd /etc/docker/
2. Rename daemon.json so Docker no longer reads the broken file: mv daemon.json daemon.conf
3. Restart docker: systemctl restart docker
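Renaming the file works because Docker then falls back to its defaults. Before deleting or renaming daemon.json, it may be worth checking whether it simply contains invalid JSON (assuming python3 is available on the host):

# Validate the JSON syntax of the daemon config
python3 -m json.tool /etc/docker/daemon.json
# Show the exact reason the daemon refused to start
journalctl -u docker.service --no-pager | tail -n 30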

Cannot start container 39f96c64a9c6: [8] System error: exec format error

An error occurred when Docker started the container!

Error response from daemon: Cannot start container 39f96c64a9c6: [8] System error: exec format error
FATA[0000] Error: failed to start one or more containers

Cause: running a 64-bit Docker image on a 32-bit system.

To view the system version:

  • uname -a
  • lsb_release -a
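In addition to the OS version, the image's target architecture can be compared with the host's; the image name below is a placeholder:

# Host architecture (i686/armv7l = 32-bit, x86_64/aarch64 = 64-bit)
uname -m
# OS/architecture the image was built for
docker inspect --format '{{.Os}}/{{.Architecture}}' <image-name>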


Solution:

Install and use Docker on Ubuntu 14.04 (32-bit) with images built for the 32-bit architecture.


A new error then appears:

FATA[0000] Post http:///var/run/docker.sock/v1.18/images/create?fromSrc=ubuntu%3A14.04&repo=: dial unix /var/run/docker.sock: permission denied. Are you trying to connect to a TLS-enabled daemon without TLS?

Solution: switch to the root user (or prefix the command with sudo) and the error goes away.
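Alternatively, instead of always working as root, the current user can be added to the docker group (requires logging out and back in):

sudo usermod -aG docker $USER
# or apply the new group in the current shell
newgrp docker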

[Solved] Docker Failed to Start Container: Error response from daemon: network XXX not found

[root@xxx dc-gitlab]# docker start cce932ba5dc2
Error response from daemon: network ase6cd78ccf7f24c49871653f2dd not found
Error: failed to start containers: css932ba5dd3

These are the error messages. A custom bridge network had been configured previously.

 

Solution:

docker-compose up -d --force-recreate

This recreates the containers and their networks, and the error goes away.
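If recreating the stack is not an option, the stale network reference can also be checked and cleaned up by hand, for example:

# Compare the existing networks with the ID in the error message
docker network ls
# Remove networks that no longer have containers attached
docker network prune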

Scenario:

During a production launch, the GitLab port exposed by Docker suddenly became unreachable. The server was listening on the port and could be pinged, but telnet to the port failed. In a hurry to go online, I restarted the server and Docker, and then the above error appeared.

[ERROR Swap]: running with swap on is not supported. Please disable swap

kubeadm init fails with the following error:

[root@k8s1 yum.repos.d]# kubeadm init --apiserver-advertise-address=192.168.12.10 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.18.0 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16
W0928 15:17:23.161858    1999 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.0
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

How to Solve:

Swap needs to be turned off in Linux:

# Turn off swap; run both commands
swapoff -a                           # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab  # permanent
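Afterwards, verify that swap is really off before re-running kubeadm init:

# Should print nothing if swap is off
swapon --show
# The Swap line should show 0
free -h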

 

[Solved] OCI runtime create failed: /usr/bin/nvidia-container-runtime did not terminate successfully: unknown

Docker build Image error:

OCI runtime create failed: /usr/bin/nvidia-container-runtime did not terminate successfully: unknown
Root cause: need to install nvidia-container-runtime
How to Solve:

1. Online installation

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.repo |\
tee /etc/yum.repos.d/nvidia-container-runtime.repo
yum install nvidia-container-runtime

2. If stuck on connecting to nvidia.github.io

1) yum -y install yum-utils

2) mkdir ~/nvidia && cd ~/nvidia

3) repotrack nvidia-container-runtime

4) rpm -Uvh --force --nodeps *.rpm
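Depending on the setup, Docker may also need the runtime registered in /etc/docker/daemon.json. A minimal sketch (note: this overwrites the file, so merge by hand if daemon.json already has content):

cat <<'EOF' > /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
systemctl restart docker
# The nvidia runtime should now appear in the list
docker info | grep -i runtimes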

[Solved] OCI runtime create failed: runc create failed: unable to start container process:

OCI runtime create failed: runc create failed: unable to start container process: exec: "env": executable file not found in $PATH: unknown

The above error occurs when running the docker container.

Reason:
The image file someone gave me had been packed into an archive, and I was told to load it with docker load.

docker load < ***.tar

The command reported an error:
open /var/lib/docker/tmp/docker-import-**************/repositories: no such file or directory

So I used docker import to load the image instead; surprisingly, it imported successfully, without the 0 KB problem mentioned elsewhere online.

docker import ***.tar docker:v1

But there was a problem at runtime.

docker run

The following error message appears:
Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "env": executable file not found in $PATH: unknown

Solution:

tar -xvf ***.tar

After extracting it, load the resulting image tarball with docker load again. Success!
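The mix-up can be spotted up front: an archive produced by docker save contains a manifest.json (and a repositories file) and must be loaded with docker load, while a flat filesystem archive (e.g. from docker export) is what docker import expects. A quick check, assuming the file is named image.tar:

# A `docker save` archive lists manifest.json / repositories;
# a plain filesystem archive lists bin/, etc/, usr/, ...
tar -tf image.tar | head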

[Solved] Failed to Setup IP tables: Unable to enable SKIP DNAT rule: (iptables failed: iptables --wait -t na…

Error message: Failed to Setup IP tables: Unable to enable SKIP DNAT rule: (iptables failed: iptables --wait -t nat -I DOCKER -i br-b1938128a963 -j RETURN: iptables: No chain/target/match by that name. (exit status 1))

Reason: this error appears after switching the Linux firewall off/on, which removes the NAT chains Docker created.

Solution:

Restart docker via the following command:

service docker restart

[Solved] kubectl top pod error: error: Metrics API not available

k8s version: v1.24.4

kubectl top pod fails with: error: Metrics API not available
The metrics-server pod also reports: Readiness probe failed: HTTP probe failed with statuscode: 500
Deploy metrics-server with the following manifest (it uses a mirror image and the --kubelet-insecure-tls flag):
vim custom-resources.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1
        #image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

# Apply the manifest
kubectl apply -f custom-resources.yaml
# Check that the metrics-server pod comes up
kubectl get pod -A | grep metrics-server
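Once the metrics-server pod is Running and Ready, the Metrics API should respond and kubectl top works again:

kubectl top node
kubectl top pod -A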

[Solved] docker Error response from daemon driver failed programming external connectivity on endpoint lamp

Docker reports an error when mapping a container's port:

docker: Error response from daemon: driver failed programming external connectivity on endpoint lamp3 (46b7917c940f7358948e55ec2df69a4dec2c6c7071b002bd374e8dbf0d40022c): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 86 -j DNAT --to-destination 172.17.0.2:80 ! -i docker0: iptables: No chain/target/match by that name.

 

Solution:

The custom DOCKER chain that Docker creates when the service starts has been cleared (for example after the iptables rules were flushed by a firewall operation).

Restart Docker and the chain is recreated: systemctl restart docker

[Solved] Docker Elasticsearch8.4.0 Error: Exception in thread “main” java.nio.file.FileSystemException

Exception in thread "main" java.nio.file.FileSystemException: /usr/share/elasticsearch/config/elasticsearch.yml.Dym72YkCRZ-GMAliqWE2IA.tmp -> /usr/share/elasticsearch/config/elasticsearch.yml: Device or resource busy
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
	at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416)
	at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:266)
	at java.base/java.nio.file.Files.move(Files.java:1432)
	at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1127)
	at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1139)
	at org.elasticsearch.xpack.security.cli.AutoConfigureNode.execute(AutoConfigureNode.java:687)
	at org.elasticsearch.server.cli.ServerCli.autoConfigureSecurity(ServerCli.java:161)
	at org.elasticsearch.server.cli.ServerCli.execute(ServerCli.java:85)
	at org.elasticsearch.common.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:54)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:85)
	at org.elasticsearch.cli.Command.main(Command.java:50)
	at org.elasticsearch.launcher.CliToolLauncher.main(CliToolLauncher.java:64)

Cause: most likely a problem with how the configuration file is mounted; an elasticsearch.yml bind-mounted as a single file cannot be replaced by the auto-configuration step, hence "Device or resource busy".

My solution: start the container without mounting the configuration file, and it runs successfully.

1. Port 9200 cannot be accessed from the browser

Add the following to elasticsearch.yml inside the container:

http.host: 0.0.0.0

2. After that change, accessing port 9200 prompts for a username and password

To disable authentication, add the following to the configuration file inside the container:

xpack.security.enabled: false

After restarting the container, Elasticsearch runs successfully.

Note: vim needs to be installed in the container before the files can be edited.
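For reference, a minimal sketch of starting the container without mounting the config file and then editing it in place; the container name and single-node setting are assumptions, not from the original environment:

# Start ES 8.4.0 without bind-mounting elasticsearch.yml
docker run -d --name es01 -p 9200:9200 \
  -e "discovery.type=single-node" \
  docker.elastic.co/elasticsearch/elasticsearch:8.4.0
# Open a root shell to install vim, then edit the config and restart
docker exec -u 0 -it es01 bash -c "apt-get update && apt-get install -y vim"
docker exec -it es01 vim /usr/share/elasticsearch/config/elasticsearch.yml
docker restart es01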

[Solved] k8s Error: Back-off restarting failed container

1. Cause

When running Ubuntu through k8s, the container executes the following script:

#!/bin/bash
service ssh start
echo root:$1|chpasswd

After the script finishes there is no resident foreground process left in the container, so the container exits right after it starts successfully, and Kubernetes keeps restarting it.

2. Solution

At startup, run a task that never completes, for example:

command: ["/bin/bash", "-ce", "tail -f /dev/null"]
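For context, a minimal sketch of where such a command sits in a pod manifest; the pod name and image below are placeholders, not taken from the original deployment:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ssh-ubuntu            # placeholder name
spec:
  containers:
  - name: ubuntu
    image: ubuntu:20.04       # placeholder image
    command: ["/bin/bash", "-ce", "tail -f /dev/null"]
EOF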

Or add the same line to the end of the script above:

#!/bin/bash
service ssh start
echo root:$1|chpasswd
tail -f /dev/null

Problem solved: the script runs successfully and the container stays up.