
Regularly get "Disk quota exceeded: OCI runtime error" when running a container #23784

Closed
arthurbarr opened this issue Aug 28, 2024 · 8 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.
machine
remote: Problem is in podman-remote

Comments

@arthurbarr

Issue Description

Podman on macOS generally works really well, but every day or so, it stops being able to create new containers, and I get the following error:

$ podman run --rm -ti registry.access.redhat.com/ubi9/ubi bash
Error: preparing container 1e42b038e9ac2d1544a2facd7285ef947c00bf102469f1270e628d6472a7b3a3 for attach: crun: create keyring `1e42b038e9ac2d1544a2facd7285ef947c00bf102469f1270e628d6472a7b3a3`: Disk quota exceeded: OCI runtime error

Steps to reproduce the issue

  1. podman run --rm -ti registry.access.redhat.com/ubi9/ubi bash

Describe the results you received

Error: preparing container 1e42b038e9ac2d1544a2facd7285ef947c00bf102469f1270e628d6472a7b3a3 for attach: crun: create keyring 1e42b038e9ac2d1544a2facd7285ef947c00bf102469f1270e628d6472a7b3a3: Disk quota exceeded: OCI runtime error

Describe the results you expected

No error

podman info output

host:
  arch: arm64
  buildahVersion: 1.37.2
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc40.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 97.99
    systemPercent: 1.05
    userPercent: 0.96
  cpus: 5
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: coreos
    version: "40"
  eventLogger: journald
  freeLocks: 2044
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.9.12-200.fc40.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 2446135296
  memTotal: 5753315328
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.1-1.20240819115418474394.main.6.gc2cd0be.fc40.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.13.0-dev
    package: netavark-1.12.1-1.20240819170533312370.main.26.g4358fd3.fc40.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.13.0-dev
  ociRuntime:
    name: crun
    package: crun-1.16-1.20240813143753154884.main.16.g26c7687.fc40.aarch64
    path: /usr/bin/crun
    version: |-
      crun version UNKNOWN
      commit: 158b340ec38e187abee05cbf3f27b40be2b564d0
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240726.g57a21d2-1.fc40.aarch64
    version: |
      pasta 0^20240726.g57a21d2-1.fc40.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-2.fc40.aarch64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 0
  swapTotal: 0
  uptime: 8h 31m 52.00s (Approximately 0.33 days)
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 2
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 29987868672
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 546
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1724198400
  BuiltTime: Wed Aug 21 01:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.6
  Os: linux
  OsArch: linux/arm64
  Version: 5.2.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

I have tried adding the following containers.conf setting, but it doesn't seem to fix the issue.

[containers]
keyring=false
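
For reference, a minimal sketch of where that setting would live, assuming the usual rootless containers.conf path for the machine VM's core user (the exact location is an assumption, not confirmed here):

$ podman machine ssh cat ~/.config/containers/containers.conf
[containers]
keyring=false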

Additional information

Works fine for a day or so, then fails. Has happened on the last few fix packs for Podman 5.2.

Increasing the /proc/sys/kernel/keys/maxkeys setting as suggested in #13363 gives temporary respite, but the quota quickly fills up again. I'm not running hundreds of containers a day, so I'm not sure what's creating all these keys.
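
For anyone else debugging this, the limit and the per-user quota usage can be inspected inside the machine VM like so (a sketch; the kernel default for maxkeys is 200):

$ podman machine ssh cat /proc/sys/kernel/keys/maxkeys    # the per-user key limit
$ podman machine ssh cat /proc/key-users                  # quota usage per UID
$ podman machine ssh cat /proc/keys | wc -l               # rough count of allocated keys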

@arthurbarr arthurbarr added the kind/bug Categorizes issue or PR as related to a bug. label Aug 28, 2024
@github-actions github-actions bot added the remote Problem is in podman-remote label Aug 28, 2024
@Luap99 Luap99 added the machine label Aug 28, 2024
@Luap99
Member

Luap99 commented Aug 28, 2024

Can you check cat /proc/keys before and after you start a container, and then again after stopping the container?
By default each container will create a new keyring, but on stop it should clean up the keyring. I wonder whether they get leaked instead.

@arthurbarr
Author

After restarting my Podman Machine:

$ podman ps -a
CONTAINER ID  IMAGE                 COMMAND               CREATED       STATUS                        PORTS     NAMES
2364df6a6acd  localhost/foo:latest  -c echo Container...  7 days ago    Exited (0) About an hour ago            dev-foo
77ddc410a839  localhost/bar:latest  -c echo Container...  22 hours ago  Exited (0) About an hour ago  9443/tcp  dev-bar

$ podman machine ssh cat /proc/keys        
02067481 I--Q---     5 perm 3f030000   501  1000 keyring   _ses: 1
06c39f4e I--Q---    22 perm 3f030000   501  1000 keyring   _ses: 1
1a8b38fe I--Q---     1 perm 1f3f0000   501 65534 keyring   _uid_ses.501: 1
2992107f I--Q---     2 perm 3f030000   501  1000 keyring   _ses: 1
2a00be72 I--Q---    19 perm 3f030000   501  1000 keyring   _ses: 1
30be9364 I--Q---     2 perm 3f030000   501  1000 keyring   _ses: 1
31b01f4e I--Q---     7 perm 1f3f0000   501 65534 keyring   _uid.501: empty

$ podman machine ssh cat /proc/keys | wc -l              
       7

$ podman run --rm -ti registry.access.redhat.com/ubi9/ubi
[root@3030cd16d650 /]# exit
exit

$ podman machine ssh cat /proc/keys
02067481 I--Q---     5 perm 3f030000   501  1000 keyring   _ses: 1
06c39f4e I--Q---    22 perm 3f030000   501  1000 keyring   _ses: 1
136681dc I--Q---     2 perm 3f030000   501  1000 keyring   _ses: 1
1a8b38fe I--Q---     1 perm 1f3f0000   501 65534 keyring   _uid_ses.501: 1
23b3f718 I--Q---     2 perm 3f030000   501  1000 keyring   _ses: 1
26a8d2bd I--Q---    20 perm 3f030000   501  1000 keyring   _ses: 1
2992107f I--Q---     2 perm 3f030000   501  1000 keyring   _ses: 1
30be9364 I--Q---     2 perm 3f030000   501  1000 keyring   _ses: 1
31b01f4e I--Q---    10 perm 1f3f0000   501 65534 keyring   _uid.501: empty
3e6af144 I--Q---     2 perm 3f030000   501  1000 keyring   _ses: 1

$ podman machine ssh cat /proc/keys | wc -l              
       9

$ podman ps -a
CONTAINER ID  IMAGE                 COMMAND               CREATED       STATUS                        PORTS     NAMES
2364df6a6acd  localhost/foo:latest  -c echo Container...  7 days ago    Exited (0) About an hour ago            dev-foo
77ddc410a839  localhost/bar:latest  -c echo Container...  22 hours ago  Exited (0) About an hour ago  9443/tcp  dev-bar

@ajram23

ajram23 commented Aug 28, 2024

Same issue here.

@odockal

odockal commented Aug 28, 2024

Some users have already hit this problem and reported it on the Podman Desktop repo.

@ajram23

ajram23 commented Aug 28, 2024

For now, this took care of it:

$ podman machine ssh
$ sudo sysctl -w kernel.keys.maxkeys=20000

I don't understand why it keeps counting up, especially when the images are deleted.
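
If you want that to survive a machine restart, a sysctl.d drop-in inside the VM should work (a sketch, not verified on Fedora CoreOS):

$ podman machine ssh
$ echo 'kernel.keys.maxkeys=20000' | sudo tee /etc/sysctl.d/99-maxkeys.conf
$ sudo sysctl --system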

@Luap99
Member

Luap99 commented Aug 28, 2024

I took a closer look, and it seems to be related to podman-remote. When I ssh into the machine and run the podman command there, it only creates one keyring and removes it again on exit, which is expected.
However, when the container is created via podman-remote (podman is always the remote client on macOS/Windows), a second keyring is created on start which is not removed.

And even worse, it is not related to containers at all: just running podman-remote ps already leaks a keyring entry, so it must be something with the ssh connection we are using, not with the actual commands.
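
You can see it without touching containers at all (a sketch; each remote client call leaves an extra _ses keyring behind):

$ podman machine ssh cat /proc/keys | wc -l
$ podman ps                                    # any podman-remote command
$ podman machine ssh cat /proc/keys | wc -l    # count has gone up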

@Luap99
Member

Luap99 commented Aug 28, 2024

The problem is that the sshd server processes are leaked on the server because the connections are not properly closed; this is due to a gvproxy bug: #23616.

Fortunately that one has already been fixed, so we "only" need a new release with installer builds that include the new gvproxy version 0.7.5. For the time being, you could manually replace the gvproxy binary with a fixed one and run podman machine stop && podman machine start, which should make it work.
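
A rough sketch of the manual swap on the macOS host (the Homebrew path and the release asset name below are assumptions; check where your install keeps gvproxy, e.g. the helper_binaries_dir setting in containers.conf):

$ podman machine stop
$ curl -LO https://github.com/containers/gvisor-tap-vsock/releases/download/v0.7.5/gvproxy-darwin
$ chmod +x gvproxy-darwin
$ sudo mv gvproxy-darwin /opt/homebrew/lib/podman/gvproxy    # assumed Homebrew location
$ podman machine start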

@arthurbarr
Author

I have just updated to gvproxy 0.7.5 on the macOS host, and that seems to have fixed the issue for me. Thank you @Luap99

@rhatdan rhatdan closed this as completed Aug 29, 2024
@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Nov 28, 2024
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Nov 28, 2024