Segmentation fault with new version (Alpine 3.21) #993

Open
AdriiiPRodri opened this issue Dec 10, 2024 · 13 comments

@AdriiiPRodri

Hi all,

I have recently been having problems with the python:3.12-alpine image (updated 6 hours before this issue was created). I can't get any information about the error.

python:3.12.8-alpine3.19 works correctly, but python:3.12-alpine (latest) produces:

2024-12-10 13:57:06 Segmentation fault
2024-12-10 13:57:08 Segmentation fault

I haven't had time to investigate how to get logs for this problem; if you need them, let me know.

Regards
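
One low-effort way to get more than a bare "Segmentation fault" line is Python's standard `faulthandler` module, which dumps the Python-level traceback when the interpreter receives a fatal signal. A minimal sketch follows; the helper name `enable_crash_tracebacks` is hypothetical, and it assumes you can edit the application's entry point:

```python
import faulthandler
import sys


def enable_crash_tracebacks(log_file=None) -> None:
    """Dump Python tracebacks to log_file (default: stderr) on fatal signals."""
    # Covers SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL. The file must stay
    # open for as long as the handler is enabled.
    target = log_file if log_file is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
```

Calling `enable_crash_tracebacks()` at the very top of the entry point, before any third-party imports, should make a crash during import show which Python line was running. The same handler can be switched on without code changes via `python -X faulthandler` or by setting `PYTHONFAULTHANDLER=1` in the container environment.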

@shadycuz

Also came here to report the same thing.

@yosifkit
Member

We'll need a little more information to reproduce and debug. I tried building the prowler-cloud/prowler Dockerfile swapped to Alpine 3.21 and didn't get any errors, so I am unsure where the issue is.

Perhaps there is a specific pip-installed library that isn't ready for Alpine 3.21, but only fails when used and not on install?
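
If the failure only appears on use and not on install, one way to narrow it down is to import each suspect module in its own child interpreter and see which one dies. The sketch below is an illustration, not something taken from this thread; the function names are hypothetical, and the module names you pass in are whatever your project depends on:

```python
import signal
import subprocess
import sys


def probe_import(module_name: str) -> str:
    """Import one module in a child interpreter and describe the outcome."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        return "ok"
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return f"killed by {signal.Signals(-result.returncode).name}"
    return f"failed with exit code {result.returncode}"


def find_crashing_imports(module_names: list[str]) -> dict[str, str]:
    """Return only the modules whose import did not succeed."""
    outcomes = {name: probe_import(name) for name in module_names}
    return {name: outcome for name, outcome in outcomes.items() if outcome != "ok"}
```

Running each import in a separate process keeps a segfault in one module from taking down the probe itself, so `find_crashing_imports([...])` can report every failing module in one pass.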

@torkashvand

I am having the same issue. This is my Dockerfile:

FROM python:3.12.8-alpine
WORKDIR /app


RUN apk add --no-cache gcc libc-dev libffi-dev curl vim && \
    addgroup -S appgroup && adduser -S appuser -G appgroup -h /app


# Copy the shell scripts and ensure scripts do not have Windows line endings and make them executable
COPY start-app.sh start-worker.sh start-scheduler.sh /app/
RUN sed -i 's/\r$//' start-app.sh start-worker.sh start-scheduler.sh && \
    chmod 755 start-app.sh start-worker.sh start-scheduler.sh

RUN chown -R appuser:appgroup /app
USER appuser
EXPOSE 8080
ENTRYPOINT ["/app/start-app.sh"]

With this I get a segmentation fault.
I had to pin Python to an older version, such as FROM python:3.12.7-alpine.

@LaurentGoderre
Member

Can you try rebuilding without cache? I'm wondering if it compiled some native dependencies that don't work with the new version.
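
To see which installed packages ship compiled code at all (and are therefore candidates for this kind of breakage), the standard `importlib.metadata` module can list the files each distribution installed. A rough sketch, assuming the installer recorded a file list; the helper names are hypothetical and the suffix check is only a heuristic:

```python
from importlib import metadata

# File suffixes used by compiled extension modules (.so on Linux, .pyd on Windows).
NATIVE_SUFFIXES = (".so", ".pyd")


def is_native_extension(path: str) -> bool:
    """Return True if the file name looks like a compiled extension module."""
    return path.endswith(NATIVE_SUFFIXES)


def distributions_with_native_code() -> dict[str, list[str]]:
    """Map each installed distribution to the compiled files it ships."""
    found = {}
    for dist in metadata.distributions():
        # dist.files is None when the installer recorded no file list.
        native = [str(f) for f in (dist.files or []) if is_native_extension(str(f))]
        if native:
            found[dist.metadata["Name"] or "<unknown>"] = native
    return found


for name, files in sorted(distributions_with_native_code().items()):
    print(name, files)
```

Pure-Python packages will not appear in the output, which usually shortens the list of suspects considerably.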

@tianon
Member

tianon commented Dec 11, 2024

I tried pip install on a whole bunch of random modules, tried loading a few, and still can't reproduce a segfault. 😅

A simple and reliable reproducer is going to be necessary if we're going to get any further here. 🙇

@tianon
Member

tianon commented Dec 11, 2024

Seeing prowler-cloud/prowler#6109 in the references here, I hoped Prowler would give me something that segfaults, but no dice -- it installs and works just fine via pip install https://github.com/prowler-cloud/prowler/archive/HEAD.tar.gz

$ prowler dashboard
                         _
 _ __  _ __ _____      _| | ___ _ __
| '_ \| '__/ _ \ \ /\ / / |/ _ \ '__|
| |_) | | | (_) \ V  V /| |  __/ |
| .__/|_|  \___/ \_/\_/ |_|\___|_|v5.1.0
|_| the handy multi-cloud security tool

Date: 2024-12-11 20:40:31

Loading all CSV files from the folder /tmp/output ...

Dash is running on http://127.0.0.1:11666/

NOTE: If you are using Prowler SaaS with the S3 integration or that integration 
from Prowler Open Source and you want to use your data from your S3 bucket,
run: `aws s3 cp s3://<your-bucket>/output/csv ./output --recursive`
and then run `prowler dashboard` again to load the new files.

(I have no idea what Prowler is or how to use it; I was just hoping for something that can reproduce the segfault.)

@tianon
Member

tianon commented Dec 11, 2024

I was able to reproduce with FlexGet, though (inspired by the backlink from Flexget/Flexget#4085) -- pip install flexget (after apk add --no-cache cargo linux-headers) and then running flexget segfaults.

However, I was also able to reproduce the same behavior with alpine:3.21 and Alpine's Python, so it's not something we do that's causing this. 🙇

@yosifkit
Member

At least for flexget, it looks related to the .so provided by the pendulum library (#9 in the backtrace):

/config # gdb python
...
(gdb) r /usr/local/bin/flexget
Starting program: /usr/local/bin/python /usr/local/bin/flexget
warning: Error disabling address space randomization: Operation not permitted

Program received signal SIGSEGV, Segmentation fault.
0x00007f5db2f1fd9f in memchr (src=src@entry=0x2, c=c@entry=0, n=n@entry=2147483647) at src/string/memchr.c:16
warning: 16     src/string/memchr.c: No such file or directory
(gdb) bt
#0  0x00007f5db2f1fd9f in memchr (src=src@entry=0x2, c=c@entry=0, n=n@entry=2147483647) at src/string/memchr.c:16
#1  0x00007f5db2f61736 in strnlen (s=s@entry=0x2 <error: Cannot access memory at address 0x2>, n=n@entry=2147483647) at src/string/strnlen.c:5
#2  0x00007f5db2f1eabd in printf_core (f=f@entry=0x7ffdcec4bae0, fmt=fmt@entry=0x7f5db2f9b4c8 "Error relocating %s: %s: initial-exec TLS resolves to dynamic definition in %s",
    ap=ap@entry=0x7ffdcec4b948, nl_arg=nl_arg@entry=0x7ffdcec4b9e0, nl_type=nl_type@entry=0x7ffdcec4b960) at src/stdio/vfprintf.c:600
#3  0x00007f5db2f60100 in vfprintf (f=f@entry=0x7ffdcec4bae0, fmt=fmt@entry=0x7f5db2f9b4c8 "Error relocating %s: %s: initial-exec TLS resolves to dynamic definition in %s",
    ap=<optimized out>) at src/stdio/vfprintf.c:690
#4  0x00007f5db2f605af in vsnprintf (s=s@entry=0x0, n=n@entry=0, fmt=fmt@entry=0x7f5db2f9b4c8 "Error relocating %s: %s: initial-exec TLS resolves to dynamic definition in %s",
    ap=ap@entry=0x7ffdcec4bbe8) at src/stdio/vsnprintf.c:49
#5  0x00007f5db2f3f621 in __dl_vseterr (fmt=0x7f5db2f9b4c8 "Error relocating %s: %s: initial-exec TLS resolves to dynamic definition in %s", ap=ap@entry=0x7ffdcec4bc38)
    at src/ldso/dlerror.c:62
#6  0x00007f5db2f67b78 in error_impl (fmt=<optimized out>) at ldso/dynlink.c:2437
#7  0x00007f5db2f34157 in do_relocs (dso=dso@entry=0x7f5daed94670, rel=0x7f5daeac7ea0, rel_size=<optimized out>, stride=stride@entry=3) at ldso/dynlink.c:463
#8  0x00007f5db2f34aef in reloc_all (p=0x7f5daed94670) at ldso/dynlink.c:1423
#9  0x00007f5db2f21e61 in dlopen (file=0x7f5daedfef50 "/usr/local/lib/python3.11/site-packages/pendulum/_pendulum.cpython-311-x86_64-linux-musl.so", mode=2) at ldso/dynlink.c:2198
#10 0x00007f5db2b8cf86 in ?? () from /usr/local/bin/../lib/libpython3.11.so.1.0
...(a bunch more in libpython3.11.so.1.0)
#119 0x00007f5db2c7b0f6 in Py_BytesMain () from /usr/local/bin/../lib/libpython3.11.so.1.0
#120 0x00007f5db2f3e496 in libc_start_main_stage2 (main=0x55d16e46b040, argc=2, argv=0x7ffdcec50c38) at src/env/__libc_start_main.c:95
#121 0x000055d16e46b05b in _start ()

@tianon
Member

tianon commented Dec 11, 2024

Nice, I can reproduce with just pendulum:

FROM alpine:3.21

RUN apk add --no-cache py3-pip cargo

RUN python3 -m venv /tmp

RUN set -eu; \
	. /tmp/bin/activate; \
	set -x; \
	pip install 'pendulum == 3.0.0'

RUN set -eu; \
	. /tmp/bin/activate; \
	set -x; \
	python -c 'import pendulum'

$ docker build --pull .
Sending build context to Docker daemon  2.048kB
Step 1/5 : FROM alpine:3.21
3.21: Pulling from library/alpine
Digest: sha256:21dc6063fd678b478f57c0e13f47560d0ea4eeba26dfc947b2a4f81f686b9f45
Status: Image is up to date for alpine:3.21
 ---> 4048db5d3672
Step 2/5 : RUN apk add --no-cache py3-pip cargo
 ---> Using cache
 ---> 52d8d8874713
Step 3/5 : RUN python3 -m venv /tmp
 ---> Using cache
 ---> 4b59a39b22b6
Step 4/5 : RUN set -eu; 	. /tmp/bin/activate; 	set -x; 	pip install 'pendulum == 3.0.0'
 ---> Running in 05d9ee1afcac
+ pip install 'pendulum == 3.0.0'
Collecting pendulum==3.0.0
  Downloading pendulum-3.0.0-cp312-cp312-musllinux_1_1_x86_64.whl.metadata (6.9 kB)
Collecting python-dateutil>=2.6 (from pendulum==3.0.0)
  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting tzdata>=2020.1 (from pendulum==3.0.0)
  Downloading tzdata-2024.2-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting time-machine>=2.6.0 (from pendulum==3.0.0)
  Downloading time_machine-2.16.0-cp312-cp312-musllinux_1_2_x86_64.whl.metadata (21 kB)
Collecting six>=1.5 (from python-dateutil>=2.6->pendulum==3.0.0)
  Downloading six-1.17.0-py2.py3-none-any.whl.metadata (1.7 kB)
Downloading pendulum-3.0.0-cp312-cp312-musllinux_1_1_x86_64.whl (558 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 558.2/558.2 kB 18.0 MB/s eta 0:00:00
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
Downloading time_machine-2.16.0-cp312-cp312-musllinux_1_2_x86_64.whl (32 kB)
Downloading tzdata-2024.2-py2.py3-none-any.whl (346 kB)
Downloading six-1.17.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: tzdata, six, python-dateutil, time-machine, pendulum
Successfully installed pendulum-3.0.0 python-dateutil-2.9.0.post0 six-1.17.0 time-machine-2.16.0 tzdata-2024.2
Removing intermediate container 05d9ee1afcac
 ---> 2d380f86af37
Step 5/5 : RUN set -eu; 	. /tmp/bin/activate; 	set -x; 	python -c 'import pendulum'
 ---> Running in 07fc56f07310
+ python -c 'import pendulum'
The command '/bin/sh -c set -eu; 	. /tmp/bin/activate; 	set -x; 	python -c 'import pendulum'' returned a non-zero code: 139

So, looks like filing an issue at https://github.com/python-pendulum/pendulum is probably appropriate. 👍
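
For readers unfamiliar with the number in the build output: a shell reports a process killed by signal N as exit status 128 + N, so 139 corresponds to signal 11, which is SIGSEGV on Linux. A small sketch of that decoding (the function name is hypothetical):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a shell-style exit status such as the 139 reported by Docker."""
    if code > 128:
        try:
            # Shells report "killed by signal N" as 128 + N.
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            pass  # Not a known signal number on this platform.
    return f"exited with status {code}"


print(describe_exit_code(139))  # 139 - 128 = 11, which is SIGSEGV on Linux
```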

@tianon
Member

tianon commented Dec 11, 2024

It's not exactly the same, but golang/go#54805 & golang/go#13492 are very related -- looking at the backtrace, the loader is segfaulting while trying to print an error message (Error relocating %s: %s: initial-exec TLS resolves to dynamic definition in %s).

@AdriiiPRodri
Author

AdriiiPRodri commented Dec 13, 2024

Thanks for the investigation. In our case, Go and Pendulum aren't involved. You can reproduce this error with Prowler by deploying the development environment from the following commit:

https://github.com/prowler-cloud/prowler/tree/ecfd94aeb14d29586274105355e2588cf48abff0

docker compose -f docker-compose-dev.yml up

After the services start, you should see logs similar to the following:

postgres-1     | 2024-12-13 15:05:55.502 UTC [27] LOG:  checkpoint complete ...
postgres-1     | 2024-12-13 15:05:55.505 UTC [1] LOG:  database system is ready ...
ui-dev-1       |  ✓ Ready in 1141ms
api-dev-1      | Applying database migrations...
worker-dev-1   | Starting the worker...
worker-beat-1  | Starting the worker-beat...
api-dev-1      | Applying Django fixtures...
api-dev-1      | Segmentation fault
api-dev-1      | Loading api/fixtures/dev/0_dev_users.json
worker-dev-1   | Segmentation fault
worker-dev-1 exited with code 139
api-dev-1      | Segmentation fault
api-dev-1      | Loading api/fixtures/dev/1_dev_tenants.json
api-dev-1      | Segmentation fault
api-dev-1      | Loading api/fixtures/dev/2_dev_providers.json

If you need more info, let me know.

@tianon
Member

tianon commented Dec 13, 2024

Looks like one (or more?) of your dependencies pulls in pendulum:

https://github.com/prowler-cloud/prowler/blob/ecfd94aeb14d29586274105355e2588cf48abff0/api/poetry.lock#L3098-L3188

🙈
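
A transitive dependency like this can also be found from inside the environment, without reading the lock file, by asking `importlib.metadata` which installed distributions declare a requirement on the package. This is a sketch under the assumption that the packages are installed with standard metadata; the helper names are hypothetical and the requirement parsing is deliberately simplistic:

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    """Extract the project name from a requirement string like 'pendulum (>=2.0)'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else ""


def reverse_dependencies(target: str) -> list[str]:
    """List installed distributions that declare a dependency on `target`."""
    wanted = normalize(target)
    dependents = set()
    for dist in metadata.distributions():
        # dist.requires is None when a distribution declares no requirements.
        for requirement in dist.requires or []:
            name = dist.metadata["Name"]
            if name and requirement_name(requirement) == wanted:
                dependents.add(name)
    return sorted(dependents)


print(reverse_dependencies("pendulum"))
```

An empty list means nothing installed declares the package as a requirement; it may still have been installed directly.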

@AdriiiPRodri
Author

You are right. I didn't know we had this dependency; it comes from another package.
