Tag Archives: container

Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to

Calico was installed using the tigera-operator method and reported an error after startup; all Calico-related pods show CrashLoopBackOff.

Running kubectl -n calico-system describe pod calico-node-2t8w6 shows the following error:

Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: no such file or directory.

Cause of the problem:

This issue came up during a Kubernetes cluster deployment. By default, Calico autodetects node IP addresses using the first-found method, which picked the wrong address, so the detection method has to be specified manually.
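Before reinstalling, it can help to confirm which address each node actually registered (a quick sanity check, assuming kubectl access to the cluster):

# The INTERNAL-IP column should show the address on the intended interface (e.g. an ens* NIC)
kubectl get nodes -o wide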

1. Remove all the Calico resources

kubectl -n tigera-operator get deployments.apps -o yaml > a.yaml
kubectl -n calico-system get daemonsets.apps calico-node -o yaml > b.yaml
kubectl -n calico-system get deployments.apps calico-kube-controllers -o yaml > c.yaml
kubectl -n calico-system get deployments.apps calico-typha -o yaml > d.yaml
kubectl -n calico-apiserver get deployments.apps calico-apiserver -o yaml > e.yaml
kubectl delete -f a.yaml
kubectl delete -f b.yaml
kubectl delete -f c.yaml
kubectl delete -f d.yaml
kubectl delete -f e.yaml
2. Remove tigera-operator.yaml and custom-resources.yaml
kubectl delete -f tigera-operator.yaml
kubectl delete -f custom-resources.yaml

3. Remove vxlan.calico
ip link delete vxlan.calico

4. Modify the custom-resources.yaml file and add nodeAddressAutodetectionV4:
# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/v3.23/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    #bgp: Enabled
    #hostPorts: Enabled
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
    #linuxDataplane: Iptables
    #multiInterfaceMode: None
    nodeAddressAutodetectionV4:
      interface: ens.*

---

# This section configures the Calico API server.
# For more information, see: https://projectcalico.docs.tigera.io/v3.23/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
5. Re-create
kubectl create -f tigera-operator.yaml
kubectl create -f custom-resources.yaml

Check that the autodetection method was applied:
kubectl -n calico-system get daemonsets.apps calico-node -o yaml | grep -A2 IP_AUTODETECTION_METHOD
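If the setting took effect, the grep should show the autodetection environment variable on the calico-node daemonset, roughly along these lines (an illustrative snippet, not verbatim output):

- name: IP_AUTODETECTION_METHOD
  value: interface=ens.*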

Cannot start container 39f96c64a9c6: [8] System error: exec format error

Docker reports an error when starting a container:

Error response from daemon: Cannot start container 39f96c64a9c6: [8] System error: exec format error
FATA[0000] Error: failed to start one or more containers

Cause: running a 64-bit Docker image on a 32-bit system

To view the system version:

  • uname -a
  • lsb_release -a
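To confirm whether the image itself is 64-bit, you can also inspect it (a quick check, assuming the image has already been pulled locally):

# Prints e.g. linux/amd64; running an amd64 image on a 32-bit (i386) host fails with "exec format error"
docker inspect --format '{{.Os}}/{{.Architecture}}' ubuntu:14.04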


Solution:

Install a Docker build for 32-bit Ubuntu 14.04 and use images built for the host architecture.


A new error then appears:

FATA[0000] Post http:///var/run/docker.sock/v1.18/images/create?fromSrc=ubuntu%3A14.04&repo=: dial unix /var/run/docker.sock: permission denied. Are you trying to connect to a TLS-enabled daemon without TLS?

Solution: switch to the root user (or run the command with sudo) to solve the problem.
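If you prefer not to work as root all the time, adding your user to the docker group is a common alternative (a sketch; note that docker group membership is effectively root-equivalent):

sudo usermod -aG docker $USER
# log out and back in (or run: newgrp docker) for the group change to take effect
docker info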

[ERROR Swap]: running with swap on is not supported. Please disable swap

Running kubeadm init fails with the following error:

[root@k8s1 yum.repos.d]# kubeadm init --apiserver-advertise-address=192.168.12.10 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.18.0 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16
W0928 15:17:23.161858    1999 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.0
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

How to solve:

Swap needs to be turned off in Linux.

# Turn off swap; run both commands
swapoff -a                            # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab   # permanent
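To confirm that swap is really off before re-running kubeadm (a quick check):

swapon -s   # prints nothing if swap is disabled
free -m     # the Swap row should show 0 total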

 

[Solved] Failed to Setup IP tables: Unable to enable SKIP DNAT rule: (iptables failed: iptables –wait -t na…

Error message: Failed to Setup IP tables: Unable to enable SKIP DNAT rule: (iptables failed: iptables --wait -t nat -I DOCKER -i br-b1938128a963 -j RETURN: iptables: No chain/target/match by that name. (exit status 1))

Reason: this error appears after turning the Linux firewall off/on, which flushes the iptables chains that Docker created.

Solution:

Restart Docker with the following command:

service docker restart

[Solved] docker Error response from daemon driver failed programming external connectivity on endpoint lamp

Docker reports an error when mapping a container port:

docker: Error response from daemon: driver failed programming external connectivity on endpoint lamp3 (46b7917c940f7358948e55ec2df69a4dec2c6c7071b002bd374e8dbf0d40022c): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 86 -j DNAT --to-destination 172.17.0.2:80 ! -i docker0: iptables: No chain/target/match by that name.

 

Solution:

The custom DOCKER chain that Docker creates when its service starts has been cleared, so the rule cannot be appended. Restarting Docker recreates the chain:

systemctl restart docker
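After the restart, you can verify that the chain is back before retrying the port mapping (a quick check):

# Should list rules instead of failing with "No chain/target/match by that name"
iptables -t nat -nL DOCKER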

[Solved] Docker Error: driver failed programming external connectivity on endpoint

1. Error information

Cannot start service nacos: driver failed programming external
connectivity on endpoint yingxue_nacos_1
(3e83b70dcd6ba020d1ee4cf61ffeac58dbf9aea3bbbdad69c7ed44f5cf40ad1a):
(iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0
--dport 8848 -j DNAT --to-destination 172.18.0.2:8848 ! -i br-2e393ccf4803: iptables: No chain/target/match by that name.

2. Solutions

The user-defined DOCKER chain created when the Docker service started has been cleared for some reason. Restart Docker, and then restart the nacos container:

systemctl restart docker
docker restart 540

[Solved] Compose error “HTTP request took too long to complete“

Recently, I noticed that the following error messages often appear in docker-compose:

ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

Adding COMPOSE_HTTP_TIMEOUT seems to only delay the error. Is this a known issue? Or is there a workaround?
Tried increasing the COMPOSE_HTTP_TIMEOUT environment variable and it didn’t work.

# Increase timeout period to 120 seconds.
export COMPOSE_HTTP_TIMEOUT=120;
# Rebuild all containers using the new images.
docker-compose up -d;

# or use docker-compose --verbose up -d to check out the errors

My Solution:

sudo service docker restart
docker-compose up

Another possible cause is high inode usage:

df -ih			#  -i, --inodes: list inode usage instead of block usage
Filesystem     Inodes IUsed IFree IUse% Mounted on
udev             493K   390  492K    1% /dev
tmpfs            494K   537  494K    1% /run
/dev/xvda1       1.3M  1.2M   70K   95% /
tmpfs            494K     1  494K    1% /dev/shm
tmpfs            494K     3  494K    1% /run/lock
tmpfs            494K    16  494K    1% /sys/fs/cgroup
tmpfs            494K     4  494K    1% /run/user/1000
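If inode usage on / is close to 100%, clearing unused Docker objects often frees a large number of inodes (use with care: this removes stopped containers and unused images):

docker system prune -a
df -ih /   # re-check inode usage afterwards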

[Solved] error: password authentication failed for user “postgres”

password authentication failed for user “postgres” with docker

Steps:

    1. Run the following command to create a Docker container:

docker run --rm --name test-postgres -p 5432:5432 -e POSTGRES_PASSWORD=pw -d postgres

    2. Run the following code in Node to connect to the database:
import pg from 'pg'
const { Pool } = pg
const pool = new Pool({
   host: 'localhost',
   database: 'postgres',
   user: 'postgres',
   password: 'pw',
   port: 5432
})

The following error is thrown:
error: password authentication failed for user “postgres”

 

Cause analysis

Postgres is also installed locally and starts automatically with the system. A connection to localhost:5432 therefore reaches the locally installed Postgres service instead of the container, so password authentication fails.

 

Solution:

Open the Windows Task Manager; under Services you will see a postgres service running. Right-click it to stop the service, then open the Services window, double-click the postgres service and set its Startup Type to Manual so it does not start again after a reboot.
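An alternative to stopping the local service is to publish the container on a different host port so the two installations do not collide (a sketch; 5433 is an arbitrary free port, and the Pool config then needs port: 5433):

docker run --rm --name test-postgres -p 5433:5432 -e POSTGRES_PASSWORD=pw -d postgres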


Other troubleshooting methods

Open the Terminal of the container

  1. Run the following two commands to see if there is a problem with the host-based authentication rules:
    cd /var/lib/postgresql/data
    cat pg_hba.conf
  2. Check that the default user is postgres:
    psql -U postgres -x -c "select * from current_user;"
  3. Check the password expiration date; if rolvaliduntil has no value, the password never expires:
    psql -h 127.0.0.1 -U postgres -d postgres
    SELECT * FROM pg_roles WHERE rolname='postgres';
  4. Try removing docker's unused volumes and containers, and restarting docker

[Solved] Docker Download Mirror Error: Cannot connect to the Docker daemon at…

When trying to pull an image with Docker, the following error is reported:

cannot connect to the docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Cause analysis: Docker probably did not exit cleanly last time and is not running now, so its socket cannot be found under /var/run/.

Solution:
Start the Docker service:

systemctl start docker.service

Then the image can be pulled.
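If the service still fails to come up, the daemon status and recent logs usually show why (a quick check):

systemctl status docker
journalctl -u docker --no-pager | tail -n 50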

How to Solve K8S Error getting node

During installation or operation of a k8s cluster, you may run into "Error getting node" problems, such as:

"Error getting node" err="node \"master\" not found"
dial tcp 10.8.126.46:6443: connect: connection refused"
"Error getting node" err="node \"master\" not found"
"Error getting node" err="node \"master\" not found"

The way to troubleshoot such problems is to run the following command and check the specific cause of the error:

journalctl -xeu kubelet

Find the first error in the log and handle it accordingly.
Based on the problems I have encountered, the main possibilities are the following (each can be checked quickly, as sketched after the list):

  1. Swap has not been disabled
  2. There is a problem with the hostname or hosts file settings (reason listed by other bloggers)
  3. The container runtime and k8s versions are not compatible (reason listed by other bloggers)
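A minimal set of checks for the three causes above (a sketch, assuming a systemd host with Docker as the runtime):

swapon -s                       # should print nothing if swap is disabled
hostnamectl; cat /etc/hosts     # the hostname should resolve consistently
kubelet --version; docker version --format '{{.Server.Version}}'   # compare against the k8s compatibility matrix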

Failed to remove multipath map 320b508ca45022b80 [How to Solve]

Failed to remove multipath map 320b508ca45022b80

1. Project scenario

Host os:kylin-server-10-sp1-release-build02-20210518-arm64
docker:docker-ce-18.09.7
cloud: openstack queens
storage: same acs5000
VM os: kylin-server-10-sp1-release-build02-20210518-arm64


2. Problem description and cause analysis

2.1 problem description

A volume-backed virtual machine can be created normally, but an error appears after the virtual machine is restarted. The nova-compute logs show ProcessExecutionError: unexpected error while running command; the command multipath -f 320b508ca45022b80 failed with "map in use", i.e. it failed to remove multipath map 320b508ca45022b80.
I manually executed multipath -f 320b508ca45022b80 and it did report that the map was in use, so I suspected some process was still using the volume. Through lvdisplay, vgdisplay and lsblk I found that a volume group with the same name had been activated: the virtual machine and the physical machine use the same volume group name, and the VM's volume group gets activated on the host after the VM starts. Reactivating all logical volumes in that volume group then fails, which in turn makes multipath -f fail. LVM therefore needs to be configured to activate only the system's own logical volumes: check the system volumes with lsblk, then edit /etc/lvm/lvm.conf and modify the following content

devices {
        filter = [ "a/sda/", "r/.*/" ]
}
allocation {
       volume_list = ["klas"]
       auto_activation_volume_list = ["klas"]
}

Restart service:

systemctl restart lvm2-lvmetad.service lvm2-lvmetad.socket

Re-create the virtual machine and restart it. It is also recommended that the virtual machine use a different volume group name.

2.2 storage configuration

2.2.1 Driver

Use the same driver version zeus-driver-3.1.2.000106: copy the driver into the /usr/lib/python2.7/site-packages/cinder/volume/drivers/ directory of the cinder_volume container and into the /usr/lib/python2.7/site-packages/cinder/backup/drivers/ directory of the cinder_backup container, then restart the related services.
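In a container-based (kolla-style) deployment the copy can be done with docker cp (a sketch; the local directory name zeus-driver-3.1.2.000106 is only an example of where the unpacked driver might live):

docker cp zeus-driver-3.1.2.000106/. cinder_volume:/usr/lib/python2.7/site-packages/cinder/volume/drivers/
docker cp zeus-driver-3.1.2.000106/. cinder_backup:/usr/lib/python2.7/site-packages/cinder/backup/drivers/
docker restart cinder_volume cinder_backup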

2.2.2 Configure cinder-volume

vim /etc/kolla/cinder-volume/cinder.conf

[DEFAULT]
enabled_backends=toyou_ssd
[toyou_ssd]
volume_driver = cinder.volume.drivers.zeus.Acs5000_iscsi.Acs5000ISCSIDriver
san_ip = x.x.x.x
use_multipath_for_image_xfer = True
image_volume_cache_enabled = True
san_login = cliuser
san_password = ******
acs5000_volpool_name = toyou_ssd
acs5000_target = 0
volume_backend_name = toyou_ssd

Restart the cinder-volume service. For others, please refer to the “reference scheme”


3. Solutions

Identify the system disk with lsblk, and then edit /etc/lvm/lvm.conf to modify the following content:

devices {
        filter = [ "a/sda/", "r/.*/" ]
}
allocation {
       volume_list = ["klas"]
       auto_activation_volume_list = ["klas"]
}

Restart service:

systemctl restart lvm2-lvmetad.service lvm2-lvmetad.socket

Note that the key setting is the filter. The device name in the filter is determined by the system disk reported by lsblk; it may be sdb, an nvme device, etc.
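For example, if lsblk reports the system disk as an NVMe device rather than sda, the filter would look roughly like this (an illustrative variant; adjust the device name to your own lsblk output):

devices {
        # system disk reported by lsblk as nvme0n1
        filter = [ "a|/dev/nvme0n1.*|", "r|.*|" ]
}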

[Solved] onlyoffice Error: error self signed certificate and download failed

When installing nextcloud + onlyoffice, onlyoffice fails to start and reports an error:


Enter the container to look at the error information in out.log:

[root@nextcloud ~]# docker ps -a
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
CONTAINER ID  IMAGE                                       COMMAND     CREATED       STATUS           PORTS                                        NAMES
a7c97fb93556  docker.io/onlyoffice/documentserver:latest              30 hours ago  Up 30 hours ago  0.0.0.0:8080->80/tcp, 0.0.0.0:9000->443/tcp  onlyoffice
[root@nextcloud ~]# docker exec -it a7c97fb93556 /bin/bash
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
root@a7c97fb93556:/# cd /var/log/onlyoffice/documentserver/converter/
root@a7c97fb93556:/var/log/onlyoffice/documentserver/converter# ls
err.log  out.log-20220729
root@a7c97fb93556:/var/log/onlyoffice/documentserver/converter#

Disabling Document Server certificate verification
Next, configure Document Server so that it no longer rejects requests to endpoints that use self-signed HTTPS certificates (the default behavior is to reject them).

I am now running Document Server with Docker, using the docker exec command to log into the container.

There seems to be only the nano editor in the container, but that’s enough.

Open /etc/onlyoffice/documentserver/default.json, go down and find the rejectUnauthorized field and change its value to false.

Restart the container.
Modify default.json

root@a7c97fb93556:/var/log/onlyoffice/documentserver/converter# cd /etc/onlyoffice/
root@a7c97fb93556:/etc/onlyoffice# ls
documentserver  documentserver-example
root@a7c97fb93556:/etc/onlyoffice# cd documentserver
root@a7c97fb93556:/etc/onlyoffice/documentserver# ls
default.json              local.json  production-linux.json
development-linux.json    log4js      production-windows.json
development-mac.json      logrotate   supervisor
development-windows.json  nginx
root@a7c97fb93556:/etc/onlyoffice/documentserver# pwd
/etc/onlyoffice/documentserver
root@a7c97fb93556:/etc/onlyoffice/documentserver#

Modify as follows: "rejectUnauthorized": false

                     "requestDefaults": {
                                "headers": {
                                        "User-Agent": "Node.js/6.13",
                                        "Connection": "Keep-Alive"
                                },
                                "gzip": true,
                                "rejectUnauthorized": false
                        },
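If you prefer not to edit the file by hand inside the container, the same change can be scripted from the host (a sketch, assuming the shipped default is "rejectUnauthorized": true and the container ID from above):

docker exec a7c97fb93556 sed -i 's/"rejectUnauthorized": true/"rejectUnauthorized": false/' /etc/onlyoffice/documentserver/default.json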

Restart container

root@a7c97fb93556:/etc/onlyoffice/documentserver# exit
exit
[root@nextcloud ~]# docker stop a7c97fb93556
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Error: given PIDs did not die within timeout
[root@nextcloud ~]# docker start a7c97fb93556
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Error: unable to start container "a7c97fb93556650c83dd763f9578705a82f34b2673f9759e8d0ce62afc63e77c": container a7c97fb93556650c83dd763f9578705a82f34b2673f9759e8d0ce62afc63e77c must be in Created or Stopped state to be started: container state improper
[root@nextcloud ~]# reboot

Restart nextcloud

login as: root
[email protected]'s password:
Activate the web console with: systemctl enable --now cockpit.socket

Last login: Fri Jul 29 15:59:59 2022 from 192.168.182.1
[root@nextcloud ~]# setenforce 0
[root@nextcloud ~]# systemctl start https
Failed to start https.service: Unit https.service not found.
[root@nextcloud ~]# systemctl start httpd
Enter TLS private key passphrase for localhost:443 (RSA) : ******
[root@nextcloud ~]# docker ps -a
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
CONTAINER ID  IMAGE                                       COMMAND     CREATED       STATUS      PORTS                                        NAMES
a7c97fb93556  docker.io/onlyoffice/documentserver:latest              31 hours ago  Created     0.0.0.0:8080->80/tcp, 0.0.0.0:9000->443/tcp  onlyoffice
[root@nextcloud ~]# docker start a7c97fb93556
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
a7c97fb93556
[root@nextcloud ~]#

Start onlyoffice

Run the following as prompted:

[root@nextcloud ~]# sudo docker exec a7c97fb93556 sudo supervisorctl start ds:example
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
sudo: unable to send audit message: Operation not permitted
ds:example: started

The Word document now opens successfully.