Tag Archives: docker

[ERROR SystemVerification]: failed to parse kernel config: unable to load kernel module: "configs"


The error above is reported when installing a Kubernetes cluster.

 

Solution:

Method 1: ignore the error

Add the --ignore-preflight-errors=SystemVerification option to ignore the error. With this option it is not known whether other problems will appear later.

Method 2: Upgrade kernel version

I installed the Kubernetes cluster on kernel version 4.19.12, and the problem no longer occurred after upgrading the kernel to 5.13.7. I am not sure whether the kernel version is actually the cause.

Method 3: Manually compile the configs kernel module
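Before choosing between the methods above, it may help to see whether a kernel config file is available on the node at all. The following is a minimal sketch, not the check kubeadm itself runs. The list of paths is an assumption (it is not taken from the post), and /proc/config.gz normally only exists when the configs module is loaded or built into the kernel.

```python
import os
import platform


def candidate_kernel_configs(release):
    """Return paths where a kernel config is commonly found (assumed list)."""
    return [
        "/proc/config.gz",  # exposed by the "configs" kernel module
        f"/boot/config-{release}",
        f"/usr/src/linux-{release}/.config",
        f"/lib/modules/{release}/config",
    ]


def find_kernel_config(release, exists=os.path.exists):
    """Return the first candidate path that exists, or None."""
    for path in candidate_kernel_configs(release):
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    release = platform.release()  # same value as `uname -r`
    found = find_kernel_config(release)
    print(found or f"no kernel config found for {release}")
```

If nothing is found, that is consistent with the error above: there is no config file to parse and the configs module could not be loaded.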

 

[Solved] Kubernetes Error: failed to list *core.Secret: unable to transform key

While installing a Kubernetes local cluster, I happened to encounter the following problem:

E0514 07:30:58.627632 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key “/registry/secrets/default/default-token-nk77g”: invalid padding on input; reinitializing…
W0514 07:30:59.631509 1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key “/registry/secrets/default/default-token-nk77g”: invalid padding on input
E0514 07:30:59.631563 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key “/registry/secrets/default/default-token-nk77g”: invalid padding on input; reinitializing…
W0514 07:31:00.633540 1 reflector.go:324] storage/cacher.go:/secrets: failed to list *core.Secret: unable to transform key “/registry/secrets/default/default-token-nk77g”: invalid padding on input
E0514 07:31:00.633575 1 cacher.go:424] cacher (*core.Secret): unexpected ListAndWatch error: failed to list *core.Secret: unable to transform key “/registry/secrets/default/default-token-nk77g”: invalid padding on input; reinitializing…

 

Reason:

After the cluster master is running, a TLS bootstrap Secret has to be created so that node certificates can be signed automatically:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Secret
metadata:
  name: bootstrap-token-${TOKEN_ID}
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  token-id: "${TOKEN_ID}"
  token-secret: "${TOKEN_SECRET}"
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"
  auth-extra-groups: system:bootstrappers:default-node-token
EOF

secret "bootstrap-token-65a3a9" created

where BOOTSTRAP_TOKEN=${TOKEN_ID}.${TOKEN_SECRET} can be found in bootstrap-kubelet.conf.
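To make the relation between the token and the Secret name concrete, here is a small sketch that splits a bootstrap token into its ID and secret parts and derives the Secret name. It assumes the usual bootstrap token format of 6 characters, a dot, then 16 characters, all lowercase letters or digits. The function names are hypothetical and the token in the usage lines is a made-up placeholder.

```python
import re

# Assumed format: "<6 chars>.<16 chars>", lowercase letters and digits only.
TOKEN_RE = re.compile(r"([a-z0-9]{6})\.([a-z0-9]{16})")


def split_bootstrap_token(token):
    """Split a bootstrap token into (token_id, token_secret)."""
    match = TOKEN_RE.fullmatch(token)
    if match is None:
        # The token itself is not echoed, since it is a credential.
        raise ValueError("not a valid bootstrap token")
    return match.group(1), match.group(2)


def secret_name(token_id):
    """Name of the kube-system Secret that holds the token."""
    return f"bootstrap-token-{token_id}"


if __name__ == "__main__":
    token_id, _secret = split_bootstrap_token("65a3a9.0123456789abcdef")
    print(secret_name(token_id))
```

Comparing the derived name with the Secrets that already exist in kube-system is one way to notice that the create command has been run more than once with different tokens.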

One possible cause of the problem in the title is that the command was run several times, so that multiple secrets exist. This can happen, for example, when a node was not working properly and a new bootstrap-kubelet.conf was generated for it.

In addition, when installing a Kubernetes cluster manually, the information available online is often outdated, so I compared my setup against the manifests produced by a kubeadm installation. In doing so I accidentally added the following configuration:

spec:
  hostNetwork: true
  priorityClassName: system-cluster-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault

With spec.securityContext.seccompProfile.type=RuntimeDefault set, a self-signed secret is generated automatically while the cluster is running. This conflicts with the manually generated secret and leads to the problem in the title.

 

Solution:

1) First clear the cluster cache: delete all files under /var/lib/etcd/ and /var/lib/kubelet/, but keep the config.yaml file in the latter.
2) Delete the spec.securityContext.seccompProfile.type=RuntimeDefault setting from kube-apiserver.yml, kube-controller-manager.yml and kube-scheduler.yml under /etc/kubernetes/manifests.
3) Start the kubelet again with systemctl start kubelet.

[Solved] Docker-maven-plugin Build Image Error: failed: Connection refused: connect

[ERROR] Failed to execute goal com.spotify:dockerfile-maven-plugin:1.4.7:build (default) on project security-api: Could not build image: java.util.concurrent.ExecutionException: com.spotify.docker.client.shaded.javax.ws.rs.ProcessingException: com.spotify.docker.client.shaded.org.apache.http.conn.HttpHostConnectException: Connect to localhost:2375 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect -> [Help 1]

 

Solution:

  1. Open the Docker settings (right-click the Docker icon, then click Settings)
  2. Tick the checkbox 'Expose daemon on tcp://localhost:2375 without TLS'
  3. Click Apply & Restart
  4. Run the build again: mvn clean install
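Before rerunning Maven, it can be useful to confirm that something is actually listening on the port named in the error. This is a minimal sketch that only tests whether a TCP connection can be opened. It does not verify that the listener is really the Docker daemon, and the helper name can_connect is hypothetical.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    if can_connect("localhost", 2375):
        print("localhost:2375 is reachable")
    else:
        print("nothing is listening on localhost:2375")
```

If the check still fails after step 3, the plugin will most likely keep reporting the same Connection refused error.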

[Solved] error getting credentials – err: exit status GDBus.Error:org.freedesktop.DBus.Error.ServiceU

Scene

When using docker-compose to build the environment, the following error message appears:

error getting credentials - err: exit status 1, out: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.secrets was not provided by any .service files

Solution:

sudo apt install gnupg2 pass

[Solved] Error response from daemon: driver failed programming external connectivity on endpoint mysql


docker command:
docker start container_name/id
Container Start Error:

Error response from daemon: driver failed programming external connectivity on endpoint mysql (cf1ba9f9e0613e14f42332d187a51429f8213aaf91d775f2ec3600614c78e6e1): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 3306 -j DNAT --to-destination 172.17.0.2:3306 ! -i docker0: iptables: No chain/target/match by that name.
(exit status 1))
Error: failed to start containers: mysql

 

Solution: restart Docker with systemctl restart docker

https://blog.csdn.net/qq_45652428/article/details/124870923

How to Solve kubelet starts error (k8s Cluster Restarted)


1. Environment: k8s version 1.23.0, Docker CE version 20.10.14

2. Problem: starting kubelet fails with the following error:

May 16 09:47:13 k8s-master kubelet: E0516 09:47:13.512956    7403 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
May 16 09:47:13 k8s-master systemd: kubelet.service: main process exited, code=exited, status=1/FAILURE
May 16 09:47:13 k8s-master systemd: Unit kubelet.service entered failed state.
May 16 09:47:13 k8s-master systemd: kubelet.service failed

3. Problem analysis: according to the error, the cgroup driver used by kubelet (systemd) differs from the one used by Docker (cgroupfs)

4. Solution: modify the Docker configuration

cat > /etc/docker/daemon.json <<EOF
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF

5. Restart Docker and kubelet, then check the kubelet status

# systemctl restart docker
# systemctl restart kubelet
# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Mon 2022-05-16 09:48:06 CST; 3s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 8226 (kubelet)
    Tasks: 23
   Memory: 56.9M
   CGroup: /system.slice/kubelet.service
           ├─8226 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config...
           └─8745 /opt/cni/bin/calico
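One caveat about step 4: cat > /etc/docker/daemon.json replaces the whole file, so any settings already in it are lost. The sketch below is one possible alternative, not part of the original fix. It reads the existing file, changes only the cgroup driver entry in exec-opts and writes the result back. The function name is hypothetical, and the script has to run as root to write the file.

```python
import json


def with_systemd_cgroup_driver(config):
    """Return a copy of a daemon.json dict with the cgroup driver set to systemd."""
    updated = dict(config)
    # Drop any existing cgroup driver option, keep unrelated exec-opts.
    opts = [
        opt for opt in updated.get("exec-opts", [])
        if not opt.startswith("native.cgroupdriver=")
    ]
    opts.append("native.cgroupdriver=systemd")
    updated["exec-opts"] = opts
    return updated


if __name__ == "__main__":
    path = "/etc/docker/daemon.json"
    try:
        with open(path) as f:
            current = json.load(f)
    except FileNotFoundError:
        current = {}  # no config yet, start from an empty one
    with open(path, "w") as f:
        json.dump(with_systemd_cgroup_driver(current), f, indent=2)
        f.write("\n")
```

Docker and kubelet still need to be restarted afterwards, as in step 5.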

[Solved] standard_init_linux.go:190: exec user process caused "exec format error"

Scene

While packaging a Go application into a Docker image, I ran the following command:

docker run -it -P --name docker_client -m 1024m --net host docker_client:1.0

After running it, the server reported this error:

standard_init_linux.go:190: exec user process caused "exec format error"

The fixes suggested online did not help, and the image ran normally on a virtual machine, so I looked at my Dockerfile more carefully:

FROM golang:alpine

ENV GO111MODULE=on \
    GOPROXY=https://goproxy.cn,direct \
    CGO_ENABLED=0 \
    GOOS=linux \
    GOARCH=amd64

# Use /build as the working directory for compilation
WORKDIR /build

# Copy the source code from the current directory into the container
COPY . .

# Compile our code into a binary executable app
RUN go build -o app .

# Move to the /dist directory where the generated binaries are stored
WORKDIR /dist

# Copy the binaries from the /build directory to here
RUN cp /build/app .

# Expose the port
EXPOSE 8080

# The command to run the golang program
CMD ["/dist/app"]

The GOARCH parameter is set to amd64, so I checked the architecture of the server next:

 docker version
 #check the version of the docker

The output revealed the problem. One line shows that the architecture is arm64:

 OS/Arch:           linux/arm64

So I modified the Dockerfile:

FROM golang:alpine

ENV GO111MODULE=on \
    GOPROXY=https://goproxy.cn,direct \
    CGO_ENABLED=0 \
    GOOS=linux \
    GOARCH=arm64

# Use /build as the working directory for compilation
WORKDIR /build

# Copy the source code from the current directory into the container
COPY . .

# Compile our code into a binary executable app
RUN go build -o app .

# Move to the /dist directory where the generated binaries are stored
WORKDIR /dist

# Copy the binaries from the /build directory to here
RUN cp /build/app .

# Expose the port
EXPOSE 8080

# The command to run the golang program
CMD ["/dist/app"]

After rebuilding the image from the modified Dockerfile, the container runs normally.
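The mismatch above (a binary built for amd64 running on an arm64 host) can be checked up front. Here is a minimal sketch that maps the machine name reported by the operating system to the matching GOARCH value. The mapping table is an assumption that covers only a few common architectures, and the function name is hypothetical.

```python
import platform

# Assumed mapping from `uname -m` style machine names to Go's GOARCH values.
MACHINE_TO_GOARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "armv7l": "arm",
    "i386": "386",
    "i686": "386",
}


def goarch_for(machine):
    """Return the GOARCH value for a machine name, or raise ValueError."""
    try:
        return MACHINE_TO_GOARCH[machine.lower()]
    except KeyError:
        raise ValueError(f"unknown machine type: {machine}") from None


if __name__ == "__main__":
    # Run this on the server that will execute the container.
    print(goarch_for(platform.machine()))
```

The printed value can then be compared with the GOARCH line in the Dockerfile before building.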

[Solved] k8s kubeadm init Error: 'http://localhost:10248/healthz' failed

Error Messages:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
        Unfortunately, an error has occurred:
                timed out waiting for the condition
        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'
        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.
        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

 

Use the following command to find the reason for the startup failure:

systemctl status kubelet -l
kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since Thu 2022-04-14 19:12:05 CST; 7s ago
     Docs: https://kubernetes.io/docs/
  Process: 4796 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
 Main PID: 4796 (code=exited, status=1/FAILURE)
Apr 14 19:12:05 K8SMASTER01 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Apr 14 19:12:05 K8SMASTER01 kubelet[4796]: E0414 19:12:05.862353    4796 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
Apr 14 19:12:05 K8SMASTER01 systemd[1]: Unit kubelet.service entered failed state.
Apr 14 19:12:05 K8SMASTER01 systemd[1]: kubelet.service failed.

 

Solution:

# cat > /etc/docker/daemon.json <<EOF
> {"exec-opts": ["native.cgroupdriver=systemd"]}
> EOF
# systemctl restart docker

[Solved] docker failed to solve: failed to solve with frontend dockerfile.v0: failed to create LLB definition

D:\docker_devops\ec>docker-compose up
[+] Building 0.6s (3/3) FINISHED
 => [internal] load build definition from php5-Dockerfile                                                          0.4s
 => => transferring dockerfile: 315B                                                                               0.0s
 => [internal] load .dockerignore                                                                                  0.4s
 => => transferring context: 2B                                                                                    0.0s
 => ERROR [internal] load metadata for docker.registry.xxxxx.com:5000/develop/php-with-supervisor:5.6.5              0.1s
------
 > [internal] load metadata for docker.registry.xxxxx.com:5000/develop/php-with-supervisor:5.6.5:
------
failed to solve: failed to solve with frontend dockerfile.v0: failed to create LLB definition: failed to do request: Head "https://docker.registry.xxxxx.com:5000/v2/develop/php-with-supervisor/manifests/5.6.5": http: server gave HTTP response to HTTPS client

D:\docker_devops\ec>

Error Messages:

failed to solve: failed to solve with frontend dockerfile.v0: failed to create LLB definition: failed to do request: Head "https://docker.registry.xxxx.com:5000/v2/develop/php-with-supervisor/manifests/5.6.5": http: server gave HTTP response to HTTPS client

 

Solution:
Add the following configuration under Docker Engine in the Docker Desktop settings:
"features": {
  "buildkit": false
}

[Solved] Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Problem description

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize


Cause analysis:

Insufficient GPU memory

Solution:

1. Kill the processes that are occupying GPU memory: kill -9 PID_*
2. Dynamically allocate GPU memory:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

stream copy error: reading from a closed fifo [How to Solve]

Docker services on a Linux server could not be started after running for a period of time

This is a record of a Docker service problem encountered at a customer's site.

Problem description

After the Docker services had been running for a period of time, some of them were killed and could not be restarted through docker compose. The Docker service log reported the error: stream copy error: reading from a closed fifo

Troubleshooting process:

1. The initial suspicion was insufficient memory, but checking with free -g showed that memory was sufficient.
2. Some bloggers online said that restarting Docker could solve the problem. After restarting Docker, the error changed, but the services that had failed to start still would not start. The error became: failed to allocate network resources for node *****
3. The Docker services use the default network. Even setting the Docker network aside, changing the stack name and restarting still did not work.
4. Checking the failed service with docker service ps ID / docker service logs ID showed that the error was still stream copy error: reading from a closed fifo
5. Finally, before rebooting the server, I checked the disk with df -h and found that the /dev/mapper/centosroot disk was full.

Solution:

Go to /var/log (cd /var/log) and delete log files that are no longer needed. If the log files there are small, run du -sh in the root directory to see which folders take up the most space. Usually the /var and /root folders are the ones filling the root disk, and unneeded content in those two folders has to be deleted.
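The disk check above can also be scripted so that a full root disk is noticed earlier. This is a rough sketch under a few assumptions: the helper names are hypothetical, the percentage is computed as used/total and may differ slightly from the Use% column of df, and dir_size sums file sizes rather than allocated blocks, so it only approximates du -sh.

```python
import os
import shutil


def percent_used(path="/"):
    """Percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def dir_size(path):
    """Total size in bytes of regular files below path (symlinks are skipped)."""
    total = 0
    # Unreadable directories are silently skipped via the onerror callback.
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable
    return total


if __name__ == "__main__":
    print(f"/ is {percent_used('/'):.1f}% full")
    for folder in ("/var", "/root"):
        print(f"{folder}: {dir_size(folder) / 1024 ** 3:.2f} GiB")
```

Run it as root, otherwise files in /root and parts of /var are not counted.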

Error: error from slirp4netns while setting up port redirection: map[desc:bad request: add_hostfwd:

Cause: I accidentally used kill -9 to kill the process of a Java program running in a Podman container, and starting the container again failed with this error.

Description of the problem: the container cannot start because its port is already occupied.

Solution: since the port is occupied, use netstat to find out which process is occupying it (in my case it showed PID/slirp4netns), and then kill that process with kill -9 PID.
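To confirm that the port is really still occupied before and after killing the leftover process, a quick bind test is enough. This is a minimal sketch: it only reports whether the port is taken and does not identify the owning process, so netstat is still needed for that. The function name and the port number 8080 in the usage lines are hypothetical.

```python
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if a TCP socket cannot be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use"; binding a port below 1024
            # without privileges also ends up here.
            return True
        return False


if __name__ == "__main__":
    port = 8080  # hypothetical: use the port your container publishes
    print("occupied" if port_in_use(port) else "free")
```

Once the check reports the port as free, the container can be started again.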