Support for earlyoom for containerized environments. #331

Open
amannijhawan opened this issue Nov 21, 2024 · 1 comment

@amannijhawan

Currently, earlyoom calculates memory availability and limits by reading /proc/meminfo, which works well in non-containerized environments or directly on the host. However, in Kubernetes pods, memory statistics such as available memory and the memory limit should be derived from the pod's cgroup rather than the host's /proc/meminfo. This mismatch causes earlyoom to miscalculate memory usage when running inside a Kubernetes pod.

For our use case, we aim to prevent Kubernetes from killing the entire pod when a single errant process exceeds its memory limit. While the ideal solution would be to isolate the process in a separate container, this is not feasible for us due to architectural constraints. Instead, we would like to use earlyoom within the pod to catch the errant process and terminate it before the pod breaches its memory limit.

We have successfully deployed earlyoom inside a pod with the proc filesystem mounted, and it functions as expected. However, because earlyoom reads node-level memory statistics from /proc/meminfo, it does not honor the pod's cgroup memory limits:

[root@almalinux8 /]# rpm -ivh https://kojipkgs.fedoraproject.org//packages/earlyoom/1.6.2/1.el8/x86_64/earlyoom-1.6.2-1.el8.x86_64.rpm
Retrieving https://kojipkgs.fedoraproject.org//packages/earlyoom/1.6.2/1.el8/x86_64/earlyoom-1.6.2-1.el8.x86_64.rpm
Verifying...                          ################################# [100%]
Preparing...                          ################################# [100%]
Updating / installing...
   1:earlyoom-1.6.2-1.el8             ################################# [100%]
[root@almalinux8 /]# earlyoom 
earlyoom 1.6.2
mem total: 60278 MiB, swap total:    0 MiB
sending SIGTERM when mem <= 10.00% and swap <= 10.00%,
        SIGKILL when mem <=  5.00% and swap <=  5.00%
mem avail: 48602 of 60278 MiB (80.63%), swap free:    0 of    0 MiB ( 0.00%)
mem avail: 48592 of 60278 MiB (80.61%), swap free:    0 of    0 MiB ( 0.00%)
mem avail: 48591 of 60278 MiB (80.61%), swap free:    0 of    0 MiB ( 0.00%)
mem avail: 48589 of 60278 MiB (80.61%), swap free:    0 of    0 MiB ( 0.00%)
mem avail: 48590 of 60278 MiB (80.61%), swap free:    0 of    0 MiB ( 0.00%)
mem avail: 48595 of 60278 MiB (80.62%), swap free:    0 of    0 MiB ( 0.00%)
mem avail: 48586 of 60278 MiB (80.60%), swap free:    0 of    0 MiB ( 0.00%)
^C
[root@almalinux8 /]# earlyoom ^C
[root@almalinux8 /]# 
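
To make the gap concrete, here is a small stand-alone C illustration (not earlyoom code) that prints the node-level MemTotal from /proc/meminfo next to this container's cgroup v2 limit. The path assumes the unified cgroup v2 hierarchy is mounted at /sys/fs/cgroup; on cgroup v1 the limit would live in /sys/fs/cgroup/memory/memory.limit_in_bytes instead.

```c
/* Illustration only: compare the host's MemTotal with the pod's cgroup v2 limit. */
#include <stdio.h>
#include <string.h>

/* Read MemTotal (in KiB) from /proc/meminfo; -1 on error. */
static long long meminfo_total_kib(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;
    char line[256];
    long long kib = -1;
    while (fgets(line, sizeof(line), f))
        if (sscanf(line, "MemTotal: %lld kB", &kib) == 1)
            break;
    fclose(f);
    return kib;
}

int main(void)
{
    long long host_kib = meminfo_total_kib();
    if (host_kib >= 0)
        printf("/proc/meminfo MemTotal: %lld MiB (node-level)\n", host_kib / 1024);

    /* cgroup v2 limit for this container; "max" means no limit is configured. */
    FILE *f = fopen("/sys/fs/cgroup/memory.max", "r");
    if (!f) {
        printf("memory.max not found (cgroup v1, or /sys/fs/cgroup not mounted)\n");
        return 0;
    }
    char buf[64] = "";
    if (fgets(buf, sizeof(buf), f)) {
        if (strncmp(buf, "max", 3) == 0) {
            printf("cgroup memory.max:      max (no limit set)\n");
        } else {
            long long bytes = 0;
            sscanf(buf, "%lld", &bytes);
            printf("cgroup memory.max:      %lld MiB (pod limit)\n", bytes / (1024 * 1024));
        }
    }
    fclose(f);
    return 0;
}
```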

Proposed Solution:

  • Update parse_meminfo: Modify parse_meminfo to support reading memory statistics from the cgroup filesystem (/sys/fs/cgroup) when earlyoom detects that it is running inside a container.
  • For cgroup v1, we can read memory limits from memory.limit_in_bytes and current usage from memory.usage_in_bytes.
  • For cgroup v2, we can use memory.max for the limit and memory.current for current usage.
  • Fallback to /proc/meminfo: If the cgroup paths are not available, retain the current behavior of reading from /proc/meminfo. A rough sketch of this logic follows below.
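
Here is a hedged sketch of how that detection and fallback order could look. The struct and function names (mem_stats, read_ll_file, get_mem_stats) are hypothetical and do not match earlyoom's actual internals; a real patch would hook this into parse_meminfo instead.

```c
/*
 * Rough sketch only: hypothetical cgroup-aware memory reader with the
 * fallback order proposed above (cgroup v2 -> cgroup v1 -> /proc/meminfo).
 */
#include <stdio.h>
#include <string.h>

typedef struct {
    long long total_bytes;  /* cgroup limit, or host MemTotal on fallback */
    long long avail_bytes;  /* limit minus current usage, or host MemAvailable */
} mem_stats;

/* Read a single integer (or "max") from a file; -1 on error, -2 for "max". */
static long long read_ll_file(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    char buf[64] = "";
    long long val = -1;
    if (fgets(buf, sizeof(buf), f)) {
        if (strncmp(buf, "max", 3) == 0)
            val = -2;
        else
            sscanf(buf, "%lld", &val);
    }
    fclose(f);
    return val;
}

/* Returns 0 on success. Tries cgroup v2, then cgroup v1, then /proc/meminfo. */
static int get_mem_stats(mem_stats *out)
{
    long long limit, usage;

    /* cgroup v2: memory.max / memory.current */
    limit = read_ll_file("/sys/fs/cgroup/memory.max");
    usage = read_ll_file("/sys/fs/cgroup/memory.current");
    if (limit > 0 && usage >= 0) {
        out->total_bytes = limit;
        out->avail_bytes = limit - usage;
        return 0;
    }

    /* cgroup v1: memory.limit_in_bytes / memory.usage_in_bytes
     * (simplified: does not special-case v1's "unlimited" sentinel value) */
    limit = read_ll_file("/sys/fs/cgroup/memory/memory.limit_in_bytes");
    usage = read_ll_file("/sys/fs/cgroup/memory/memory.usage_in_bytes");
    if (limit > 0 && usage >= 0) {
        out->total_bytes = limit;
        out->avail_bytes = limit - usage;
        return 0;
    }

    /* Fallback: node-level /proc/meminfo, as earlyoom does today. */
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;
    long long total_kib = -1, avail_kib = -1;
    char line[256];
    while (fgets(line, sizeof(line), f)) {
        sscanf(line, "MemTotal: %lld kB", &total_kib);
        sscanf(line, "MemAvailable: %lld kB", &avail_kib);
    }
    fclose(f);
    if (total_kib < 0 || avail_kib < 0)
        return -1;
    out->total_bytes = total_kib * 1024;
    out->avail_bytes = avail_kib * 1024;
    return 0;
}

int main(void)
{
    mem_stats m;
    if (get_mem_stats(&m) == 0)
        printf("total: %lld MiB, avail: %lld MiB\n",
               m.total_bytes / (1024 * 1024), m.avail_bytes / (1024 * 1024));
    return 0;
}
```

One detail a real implementation would also have to handle: on cgroup v1 an unlimited cgroup reports a very large sentinel in memory.limit_in_bytes rather than the string "max", so values near LLONG_MAX should be treated as "no limit" and trigger the /proc/meminfo fallback as well.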

Steps to Reproduce:
Deploy a pod in Kubernetes with earlyoom running inside.

[centos@dev-server-anijhawan-4 ~]$ cat pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: almalinux8
spec:
  containers:
  - name: almalinux
    image: almalinux:8
    command: ["sleep", "3600"] # Keeps the pod running for 1 hour
    
[centos@dev-server-anijhawan-4 ~]$ kubectl apply -f pod.yaml 

[centos@dev-server-anijhawan-4 ~]$ kubectl exec -it almalinux8 /bin/bash

Inside the pod, install the earlyoom RPM and start it exactly as in the transcript above; it again reports the node's full 60278 MiB of memory rather than the pod's cgroup memory limit.

I would be willing to contribute this fix and upstream it as well.

@amannijhawan
Author

Bumping this up to see whether the maintainers think this would be useful.
