podman compose not working correctly for some compose yamls #23114
Comments
I can at least give this a try. Any chance, however, that this is an arch issue with the image? |
All of the images that are being pulled in this case are built as multi-arch images, meaning they have both See https://quay.io/repository/quarkus-super-heroes/rest-villains/manifest/sha256:a3d658e8f9935a98b893921152e5a5192ff2cc9187af0c915cad069943dd4eac for example |
For what it's worth, I have the exact same issue. I have to manually pull images to make them available locally before running compose. Downloads started by compose crash the podman machine with that EOF error. |
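The manual pre-pull workaround described above can be sketched as a small shell helper. This is only an illustration, not something from the thread: the function name `pull_one_at_a_time` and the `PODMAN` override variable are hypothetical, and the only real command it relies on is `podman pull`.

```shell
#!/bin/sh
# Sketch of the "pull images manually before running compose" workaround.
# pull_one_at_a_time and PODMAN are hypothetical names used for illustration;
# PODMAN defaults to the real podman binary and only exists to allow a dry run.
pull_one_at_a_time() {
    for image in "$@"; do
        # One download at a time; stop at the first failure.
        "${PODMAN:-podman}" pull "$image" || return 1
    done
}
```

For example, `pull_one_at_a_time quay.io/strimzi/kafka:0.34.0-kafka-3.4.0` followed by `podman compose -f deploy/docker-compose/java17.yml up`. Note that later comments report a single `podman pull` can also trigger the crash, so this may not help on every setup.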
I haven't been able to reproduce using podman 5.1.1 on a mac and setting up/using podman-compose as described in the first post. |
From your symptoms it sounds like gvproxy is crashing/unresponsive. So the first step is to check if gvproxy is still running, likely not in your case. I was also unable to reproduce. I assume it could be related to network speeds, do you have a fast or slow internet connection? The one thing to debug is to start |
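A minimal way to check whether gvproxy is still running on the host, assuming the process is named `gvproxy` and that `pgrep` is available (both are assumptions, not confirmed in this thread). The function name `gvproxy_status` and the `PGREP` override variable are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether a process named exactly "gvproxy" exists on the host.
# gvproxy_status and PGREP are hypothetical names; PGREP defaults to pgrep.
gvproxy_status() {
    if "${PGREP:-pgrep}" -x gvproxy >/dev/null 2>&1; then
        echo "gvproxy is running"
    else
        echo "gvproxy is not running"
    fi
}
```

Running `gvproxy_status` right after the EOF error should show whether the proxy process has gone away while the VM itself is still reported as running.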
When I start it in debug with After it's crashed, the vm is still "up", but it is completely inaccessible. I can't even |
Yeah that sounds like gvproxy crashing, because all our communication ssh/unix socket gets proxied over it. Without gvproxy there is no networking for the VM. If --log-level debug doesn't reproduce I wonder if the fact that it has to write so many log lines slows it down enough to no longer hit whatever race this is. 1.4 Gbps speed is certainly not something I can reproduce here. |
Whats interesting is i just did a |
╰─ podman pull quay.io/strimzi/kafka:0.34.0-kafka-3.4.0
Trying to pull quay.io/strimzi/kafka:0.34.0-kafka-3.4.0...
Getting image source signatures
Copying blob sha256:fd472bf0e5350a58938a790a3cca11cbc9110ba69623dd1e12885cf451db5ac6
Copying blob sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
Copying blob sha256:73eb70c13411156096e1118a9437ac965a5b617787d19f03ddd27a0de9d20ec6
Copying blob sha256:b2f02f84fc56d045320f9e28c38d8cab46e794fe3c6ab7d01a06ca2f51c24a83
Copying blob sha256:c19e3e0a2e6d2a52c9d05b1f8ae479c00fb0c5b34d812cffb9e16dbaac231ec9
Copying blob sha256:8f427bd5e9bc8b7e9cf027c4f9bea592194432a7102067cf31e08bf0fa22087e
Copying blob sha256:6b07c69f4ddc8550cb08df557afce56988010868b97f964112c474753174ca59
Copying blob sha256:c1ac0dbf18e571305a3ed4484e321c24a4f4c44f720b436d58b16ee9b5b2b28c
Copying blob sha256:504b772b0ac13bae123c50614ad3f0a2814720c17633a696ffd0cc6952805dc9
Copying blob sha256:b18ade3c1cf8cd0a0f7deb12459f439eedca8288ef126b7aa908cb38850d01d5
Copying blob sha256:919eb322bc96dd7f8eb603ce9f2ae063ecdbbb2291ffe7ccd05d82d2ac0e557f
Copying blob sha256:84a1c9b46d22e38dc7f2ee890e7ccdbc5816fd8d2cb2f59f55e6e0d2b449ecec
Copying blob sha256:ded5ff4d5559f92dbeb2ad47b070dc7577eaca5f9f883cdad7bd485a568804fd
and then in another terminal...
╰─ podman machine ssh
Connecting to vm podman-machine-default. To close connection, use `~.` or `exit`
ssh: connect to host localhost port 50860: Connection refused |
Yeah, I think this is what #22284 talks about. I would assume they have the same root cause, just slightly different symptoms maybe. |
For whatever reason it seems podman does not like that kafka container (or any other kind of kafka container)... if you
I see testcontainers getting hosed due to podman crashing while trying to pull the
|
The EOF happened to me with a plain nodejs image |
Unfortunately until this issue is resolved I need to uninstall podman completely from my machine and install Docker Desktop instead. This issue prevents me from doing my day-to-day job. |
@riccardo-forina can you describe the exact steps you took to get the EOF issue? I want to try it and see if I see the same. |
Basically the machine is now unreachable, although reported as running. I have to manually stop and start it again to get it up again. I have a fast connection as well (1Gbit). |
Thanks @riccardo-forina I am actually able to
This is exactly what happens when it happens to me.
I've found that even trying to |
@edeandrea I am unable to reproduce this problem
podman version: 5.0.3 I do have a bigger podman machine though: platform: MacOS - m3 |
I'm using the same size - 6 cpus, 8g memory, 100G disk. I'm using podman 5.1.1. |
Just upgraded and recreated the machine with the following version info. Same result.
|
Tested again today with a m1 machine and a wired connection (gives me 800Mbps on speedtest). I tried the 2 Can anyone check |
|
After further debugging I found that setting the CPU count for the machine to anything above 2 causes the problem. With 1 or 2, all works fine. I'm on an M1, so I should have 8 cores available
|
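The CPU-count observation above suggests a temporary workaround, sketched here under the assumption that the default machine is the one affected and that it must be stopped before its settings can be changed. The function name `limit_machine_cpus` and the `PODMAN` override variable are hypothetical; the podman subcommands used are `machine stop`, `machine set --cpus`, and `machine start`.

```shell
#!/bin/sh
# Sketch: restart the podman machine with a reduced CPU count (default 2).
# limit_machine_cpus and PODMAN are hypothetical names used for illustration.
limit_machine_cpus() {
    cpus="${1:-2}"
    podman_cmd="${PODMAN:-podman}"
    # Each step runs only if the previous one succeeded.
    "$podman_cmd" machine stop &&
        "$podman_cmd" machine set --cpus "$cpus" &&
        "$podman_cmd" machine start
}
```

Calling `limit_machine_cpus 2` applies the value reported to work in this thread. It trades VM performance for stability and does not address the underlying gvproxy crash.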
Debugged this with @cfergeau. We used a debug version of gvproxy and managed to capture the error in a video and collect some logs. Kapture.2024-06-28.at.10.48.45.mp4
It appears to be linked to the network speed. If it's fast enough, the network buffer can get saturated and cause the crash. Having more than one cpu assigned to the machine exacerbates the problem. |
The issue is indeed
I was seeing this when I added vfkit support to gvisor-tap-vsock until I added https://github.com/containers/gvisor-tap-vsock/blob/6dbbe087eb62775e99abc69ac232d13f74cac73a/pkg/transport/unixgram_darwin.go#L24-L30 I've filed this in gvisor-tap-vsock: containers/gvisor-tap-vsock#367 |
Thank you both @riccardo-forina and @cfergeau for troubleshooting! FWIW...
|
I never thought I'd hear someone say
|
Added to our sprint for review/investigation: https://github.com/orgs/crc-org/projects/1?pane=issue&itemId=69134292 |
This contains a fix for a gvproxy crash on macos on fast connections with heavy network load. This should fix containers#23114 Signed-off-by: Christophe Fergeau <[email protected]>
Issue Description
I've installed podman compose according to the instructions at https://podman-desktop.io/docs/compose/setting-up-compose.
When I try to run podman compose up for certain compose yaml files, it errors out. If I do a podman pull of all of the images in the compose file one at a time, then the podman compose up seems to work.
Even doing podman compose pull seems to blow up.
Steps to reproduce the issue
cd quarkus-super-heroes
podman compose -f deploy/docker-compose/java17.yml pull
Describe the results you received
Furthermore, after this happens the podman machine is totally hosed. It is still running but is completely unresponsive. It has to be restarted before it is usable again.
Describe the results you expected
I expect it to work.
podman info output
Podman in a container
No
Privileged Or Rootless
Privileged
Upstream Latest Release
Yes
Additional environment details
MacOS arm architecture
Additional information
No response