ES Container Error: too many open files [How to Solve]


Previously, a middleware component was deployed on a k8s cluster, and it relies on ES internally for data storage. Looking at the log, we found that ES reported the error "too many open files".


Enter the container and run ulimit -a to view the file handle limits of the current user. If the limits there meet the requirements, the problem can only be on the host machine. Run cat /etc/sysctl.conf on the host to view the handle count configuration, as follows:

# cat /etc/sysctl.conf
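
To compare the per-process limits with the system-wide setting in one place, something like the following read-only sketch may help. The function name show_fd_limits is hypothetical, and the sketch assumes a Linux system where /proc/sys/fs/file-max exposes the running value of fs.file-max.

```shell
# Print the per-process and system-wide file descriptor limits (read-only).
show_fd_limits() {
    echo "soft nofile: $(ulimit -Sn)"
    echo "hard nofile: $(ulimit -Hn)"
    # /proc/sys/fs/file-max is the live value of the fs.file-max kernel parameter.
    if [ -r /proc/sys/fs/file-max ]; then
        echo "fs.file-max: $(cat /proc/sys/fs/file-max)"
    else
        echo "fs.file-max: unavailable"
    fi
}

show_fd_limits
```

Running it once inside the container and once on the host shows which side has the smaller limit.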

The configured handle count, 65536, is much smaller than the 104857 that appears in the log error. The next step is to raise the number of handles on the host. There are two ways to do this: one is to modify the limit of a single process, and the other is to modify the system configuration.

Modify the number of file handles of a single process
To view the number of file handles that a process can open, use cat /proc/<pid>/limits. To change the limits of a running process dynamically, use the prlimit command. The usage is: prlimit --pid ${PID} --nofile=102400:102400 (soft:hard); you can choose the numbers yourself.
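
As a rough sketch of the inspection step, the two helper functions below (nofile_limits and open_fd_count, both hypothetical names) read the limit and the current usage straight from /proc. The sketch assumes Linux, and it runs against the current shell so that it changes nothing.

```shell
# Print the soft and hard "Max open files" limits of a process.
# Usage: nofile_limits <pid>
nofile_limits() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/$1/limits"
}

# Count the file descriptors a process currently holds open.
# Usage: open_fd_count <pid>
open_fd_count() {
    ls "/proc/$1/fd" | wc -l
}

# Demonstrate on the current shell ($$) so the sketch is safe to run anywhere.
nofile_limits "$$"
echo "open now: $(open_fd_count "$$")"
```

Pass the PID of another process instead of $$ to inspect it; reading /proc/<pid>/fd of a process owned by a different user generally requires root. Checking the values before and after prlimit is a simple way to confirm that the change was applied.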
For my ES container, run ps -ef | grep elasticsearch to find the PID of the ES process; mine is 23571. Then run prlimit --pid 23571 --nofile=104857 to set the file handle limit of process 23571 to 104857.

Modify the system configuration
Modify the fs.file-max parameter in /etc/sysctl.conf and run sysctl -p for it to take effect, or modify the /etc/security/limits.conf configuration file:

cat  /etc/security/limits.conf
*                soft    nproc          655350
*                hard    nproc          655350
*                soft    nofile         655350
*                hard    nofile         655350
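
After editing either file, it can be useful to confirm what is actually configured. The sketch below is one possible way to do that: nofile_entries and system_fd_usage are hypothetical helper names, and the second function assumes a Linux host where /proc/sys/fs/file-nr is available.

```shell
# List the nofile entries in a limits.conf-style file, ignoring comment lines.
# Usage: nofile_entries [file]   (defaults to /etc/security/limits.conf)
nofile_entries() {
    awk '!/^[[:space:]]*#/ && $3 == "nofile" { print $1, $2, $4 }' \
        "${1:-/etc/security/limits.conf}"
}

# Show system-wide handle usage against the fs.file-max ceiling.
# /proc/sys/fs/file-nr holds: allocated, allocated-but-unused, maximum.
system_fd_usage() {
    awk '{ print "allocated=" $1, "max=" $3 }' /proc/sys/fs/file-nr
}

# Only touch the system files when they are present and readable.
if [ -r /etc/security/limits.conf ]; then nofile_entries; fi
if [ -r /proc/sys/fs/file-nr ]; then system_fd_usage; fi
```

For the configuration shown above, nofile_entries would print one line for the soft limit and one for the hard limit, each with the value 655350.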

How large should this value be?
Ordering of the limits on open file descriptors:
soft limit < hard limit < kernel limit < the limit imposed by the data structure used to implement the maximum number of file descriptors
In fact, there is no specific upper bound on this value, but if the allocated value is too large it will affect system performance, so balance it according to the specific application.

/etc/security/limits.conf limits the number of handles at the user level, while /etc/sysctl.conf holds the kernel parameters at the system level.
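
The ordering of the limits can be checked mechanically. The sketch below is a minimal illustration: check_fd_ordering is a hypothetical helper, it expects plain integers (not "unlimited"), and the three numbers in the example are illustrative values rather than recommendations.

```shell
# Return success when soft <= hard <= kernel-level ceiling (plain integers only).
# Usage: check_fd_ordering <soft> <hard> <kernel>
check_fd_ordering() {
    soft=$1 hard=$2 kernel=$3
    [ "$soft" -le "$hard" ] && [ "$hard" -le "$kernel" ]
}

# Example with illustrative values.
if check_fd_ordering 65536 104857 655350; then
    echo "ordering ok"
else
    echo "ordering violated"
fi
```

On a live system the first two arguments could come from ulimit -Sn and ulimit -Hn; which kernel parameter to pass as the third argument depends on the setup and is left as an assumption here.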

