Rootless Docker/Optimized build #932

Merged 8 commits on Jan 23, 2025

Contributor

@PseudoResonance PseudoResonance commented Jan 19, 2025

This is a replacement for #894

Changes

  • Add unneeded files to .dockerignore (I chose to leave security/license in the image, as those could be useful if someone inspects it)
  • In addition to feat(docker): copy PHP extensions from builder stage to speedup the b… #918, split Dockerfile into more stages to allow Composer/Yarn to install concurrently
    Ex: The stages 2-1, 2-2 can run independently of each other, and so can 3-1 and 3-2.
  • Don't log supervisord to a file, as file logging in a Docker container makes no sense
  • Redirect process output to supervisord/container output for log processors
    It's generally important to not simply discard all logs. Let the Docker host handle logfile rotation if using file logging, or let the host redirect to their desired log processor, such as Grafana Loki.
  • Run all processes as non-root
  • Replace cron with supercronic, as it's designed to easily run as non-root and redirect output wherever you want it
  • Minimize files with write permission for non-root user
    The majority of the build should ideally be left as root, so that in the event there is a vulnerability, the user would be unable to easily overwrite the working code. I don't think there's a good solution to the Laravel cache however.
  • Move docker folder out of .github, as it has nothing to do with GitHub
  • Use php-extension-installer wrapper that handles automatically adding build dependencies and removing dev dependencies after building to simplify package list.
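A minimal sketch of two items from the list above: the php-extension-installer wrapper, and supercronic running as a non-root user. The base image, the extension list, the crontab path, and the scheduled job are assumptions for illustration and are not taken from this PR; it also assumes a supercronic package is available from the image's apk repositories.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical base image and extension list, chosen only for illustration.
FROM php:8.3-fpm-alpine

# install-php-extensions adds the build dependencies it needs and removes
# them again once the extensions are compiled.
COPY --from=mlocati/php-extension-installer /usr/bin/install-php-extensions /usr/local/bin/
RUN install-php-extensions gd intl zip

# Assumption: supercronic is packaged for this Alpine release.
RUN apk add --no-cache supercronic

# Crontab written inline; running Laravel's scheduler every minute is assumed.
COPY <<EOF /etc/supercronic/crontab
* * * * * php /var/www/html/artisan schedule:run
EOF

# supercronic does not need root and logs to the container's output streams.
USER www-data
CMD ["supercronic", "/etc/supercronic/crontab"]
```

In the PR itself supercronic is started by supervisord alongside the other processes; the standalone `CMD` here only keeps the sketch short.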

Additional Information

As in #894, I separated the install/build stages for Composer/Yarn so that the install stage can be skipped from cache if the code is unchanged. I know there were questions about it in the last PR, but this is standard practice in Dockerfiles. First copy the list of dependencies to install and install them, then copy the remainder of the code.

After these changes, I was able to iterate on the build process and rebuild the docker image in mere seconds, compared to the several minutes it took previously, as the whole image had to be rebuilt every time because the code was copied early on.
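The install/build split described above might look roughly like the following. This is a sketch, not the Dockerfile from this PR: stage names, image tags, the `/build` path, and the `build` script in package.json are assumptions, and it presumes `vendor/` and `node_modules/` are listed in .dockerignore.

```dockerfile
# syntax=docker/dockerfile:1

# Composer install: only the manifests are copied, so this layer stays
# cached until composer.json or composer.lock changes.
FROM composer:2 AS composer-deps
WORKDIR /build
COPY composer.json composer.lock ./
# --ignore-platform-reqs is an assumption: this image lacks the app's PHP extensions.
RUN composer install --no-dev --no-interaction --no-autoloader --no-scripts --ignore-platform-reqs

# Yarn install: independent of the Composer stages, so BuildKit can run both in parallel.
FROM node:20-alpine AS yarn-deps
WORKDIR /build
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile

# Composer build: the code arrives only now; packages are already installed,
# so this step mainly generates the optimized autoloader.
FROM composer-deps AS composer-build
COPY . ./
RUN composer install --no-dev --no-interaction --optimize-autoloader --no-scripts --ignore-platform-reqs

# Yarn build: assumes package.json defines a "build" script.
FROM yarn-deps AS yarn-build
COPY . ./
RUN yarn run build

# Final image: combine the outputs of the two build stages.
FROM php:8.3-fpm-alpine AS final
WORKDIR /var/www/html
COPY --from=composer-build /build ./
COPY --from=yarn-build /build/public ./public
```

A code-only change invalidates just the two build stages, which is why rebuilds can drop from minutes to seconds.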

Important Note

I am using COPY --exclude=Caddyfile --exclude=docker/ . ./ to copy files. The --exclude flag is a recent addition to the Dockerfile COPY instruction and is currently only available in the 1.7-labs Dockerfile syntax. It has been around for a year now, and I don't expect this basic use of it to break. It does require declaring the 1.7-labs syntax at the top of the file, though; that declaration can be removed once --exclude is merged into the stable syntax.
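For reference, the syntax declaration and the COPY line fit together roughly as follows. The base image and working directory are placeholders; only the directive and the COPY line reflect what is described above.

```dockerfile
# syntax=docker/dockerfile:1.7-labs
# The directive above must be the first line of the file; it selects the
# labs frontend, which understands COPY --exclude.

# Placeholder base image for illustration.
FROM alpine:3.20
WORKDIR /var/www/html

# Copy the build context except the Caddyfile and the docker/ directory.
COPY --exclude=Caddyfile --exclude=docker/ . ./
```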

Other Options

Instead of using supercronic and supervisord and everything to run as non-root, sidecar containers can also be added. This is how some other projects, like Nextcloud, deal with running cron tasks, and running a more configurable HTTP server.

Add unneeded files to .dockerignore
Split Dockerfile into more stages to allow Composer/Yarn to run concurrently
Don't log supervisord to a file, as file logging in a Docker container makes no sense
Redirect process output to container output for log processors
Run all processes as non-root
Minimize files with write permission for non-root user
Move docker folder out of .github, as it has nothing to do with GitHub
@PseudoResonance
Contributor Author

This additionally resolves the issue mentioned in #894, where the entrypoint breaks permissions, as everything is run as www-data now.

@PseudoResonance
Contributor Author

PseudoResonance commented Jan 19, 2025

I just tried rerunning this build on GitHub Actions, and while it's not a very scientific test, it saved quite some time. I think mostly due to the parallelization.

Platform  Original  This PR
amd64     4:00      3:04
arm64     6:38      3:11

It appears there is a bug with Docker Buildx GitHub Actions caching when using multiple OS runners... So the caching actually only helps on one of the 2 images at random...

ed1473f fixes this by specifying separate scopes for each OS.

As an example, the first run used 0% cache, the second run with no changes finished in seconds, and a run after a code change took a bit over a minute, because the dependency stages were cached.

@QuintenQVD0
Contributor

I can confirm the build time is faster.

I can confirm it does migrate the db.
The slow arm64 builder will be fixed by GitHub, as there are some known issues with it.

This is it built from inside the container: https://paste.pelistuff.com/YAA6pK8PXW7HbKMsrfiyCGyF

@PseudoResonance
Contributor Author

PseudoResonance commented Jan 19, 2025

I can confirm it does migrate the db. The slow arm64 builder will be fixed by GitHub, as there are some known issues with it.

I'm not really sure why the arm64 builds were so slow on GitHub prior to what I changed. I don't know if anything I changed affected it in particular, but I have also heard that the runners might be buggy still.

Also I think I got all the permissions correct, but if anyone knows there's some other place that needs write access, that should be added too. As far as I know, only database, storage and bootstrap/cache get written to?

@PseudoResonance
Contributor Author

Here are the current permissions after the latest commit.

/var/www/html
drwxr-x---    1 root     www-data      4096 Jan 20 13:48 .
drwxr-xr-x    1 root     root          4096 Jan 17 01:27 ..
lrwxrwxrwx    1 root     root            18 Jan 20 13:48 .env -> /pelican-data/.env
-rw-r-----    1 root     www-data       121 Jan  7 05:11 .env.example
drwxr-x---    1 root     www-data      4096 Jan 19 08:03 app
-rw-r-----    1 root     www-data       350 Jan  7 05:11 artisan
drwxr-x---    1 root     www-data      4096 Jan  8 13:06 bootstrap
-rw-r-----    1 root     www-data      3332 Jan 19 05:29 composer.json
-rw-r-----    1 root     www-data    516684 Jan 19 05:29 composer.lock
drwxr-x---    1 root     www-data      4096 Jan 19 08:03 config
-rw-r-----    1 root     www-data       495 Jan  7 05:11 crowdin.yml
drwxr-x---    1 root     www-data      4096 Jan 20 13:48 database
drwxr-xr-x    2 root     root          4096 Jan 20 13:48 docker
drwxr-x---    1 root     www-data      4096 Jan  7 05:11 lang
-rw-r-----    1 root     www-data     34524 Jan  7 05:11 license
-rw-r-----    1 root     www-data       532 Jan 19 10:10 package.json
-rw-r-----    1 root     www-data       227 Jan  7 05:11 pint.json
-rw-r-----    1 root     www-data       143 Jan  8 13:06 postcss.config.js
drwxr-x---    1 root     www-data      4096 Jan 19 08:03 public
drwxr-x---    1 root     www-data      4096 Jan  8 13:32 resources
drwxr-x---    1 root     www-data      4096 Jan 19 08:03 routes
-rw-r-----    1 root     www-data       392 Jan  7 05:11 security.md
drwxrwx---    1 www-data www-data      4096 Jan  7 05:11 storage
-rw-r-----    1 root     www-data       400 Jan  8 13:06 tailwind.config.js
drwxr-x---    1 root     www-data      4096 Jan  7 05:11 tests
drwxr-x---    1 root     www-data      4096 Jan 20 13:48 vendor
-rw-r-----    1 root     www-data       359 Jan  8 13:06 vite.config.js
-rw-r-----    1 root     www-data     42996 Jan 19 10:10 yarn.lock
/var/www/html/bootstrap
drwxr-x---    1 root     www-data      4096 Jan  8 13:06 .
drwxr-x---    1 root     www-data      4096 Jan 20 13:48 ..
-rw-r-----    1 root     www-data      2293 Jan  8 13:06 app.php
drwxrwx---    1 www-data www-data      4096 Jan 21 05:54 cache
-rw-r-----    1 root     www-data       476 Jan  8 13:06 providers.php
-rw-r-----    1 root     www-data      1442 Jan  7 05:11 tests.php
/pelican-data
drwxrwx---    4 www-data www-data      4096 Jan 21 05:54 .
drwxr-xr-x    1 root     root          4096 Jan 21 05:54 ..
-rw-r--r--    1 www-data www-data        61 Jan 21 05:54 .env
drwx------    5 www-data www-data      4096 Jan 21 05:54 caddy
drwxr-xr-x    2 www-data www-data      4096 Jan 21 06:01 database
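
Permissions like those in the listing above could be produced by build steps along these lines. This is a sketch under assumptions, not the Dockerfile from this PR: the base image is hypothetical, and the file mode for writable files (660) is a guess, since the listing only shows directories there.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical base image; the official php images ship a www-data user.
FROM php:8.3-fpm-alpine
WORKDIR /var/www/html

# Application code is owned by root; www-data only gets group read access.
COPY . ./
RUN chown -R root:www-data . \
    && find . -type d -exec chmod 750 {} + \
    && find . -type f -exec chmod 640 {} +

# Only the paths the application writes to are handed over to www-data.
RUN chown -R www-data:www-data storage bootstrap/cache \
    && find storage bootstrap/cache -type d -exec chmod 770 {} + \
    && find storage bootstrap/cache -type f -exec chmod 660 {} +

# Persistent data lives outside the code tree; .env is a symlink into it.
RUN mkdir -p /pelican-data \
    && chown www-data:www-data /pelican-data \
    && chmod 770 /pelican-data \
    && ln -s /pelican-data/.env .env

# Everything from here on, including the entrypoint, runs as non-root.
USER www-data
```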

@@ -39,5 +35,12 @@ command=caddy run --config /etc/caddy/Caddyfile --adapter caddyfile
autostart=%(ENV_SUPERVISORD_CADDY)s
autorestart=%(ENV_SUPERVISORD_CADDY)s
priority=10
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=/dev/fd/1
Contributor

What is the point of switching to redirect_stderr instead of stderr_events_enabled, and why did you disable stdout_events_enabled?

Contributor Author

@PseudoResonance PseudoResonance Jan 21, 2025

There's no event listener configured, so stdout_events_enabled and stderr_events_enabled are pointless.

Also, sorry, I had to check and confirm this to be sure, and forgot to mention it. The redirect options redirect logs to stdout/stderr, as the name suggests, while the events options send output to special event listeners that can, for example, send emails on errors.
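
A minimal sketch of the logging setup discussed in this thread, with the supervisord configuration embedded in a Dockerfile heredoc so the example is self-contained. The base image, the pidfile path, and the config location are assumptions, and caddy with its Caddyfile is assumed to be installed elsewhere; the actual docker/supervisord.conf in this PR may differ.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical image; only the embedded supervisord.conf matters here.
FROM alpine:3.20
RUN apk add --no-cache supervisor

COPY <<EOF /etc/supervisord.conf
[supervisord]
nodaemon=true
; No supervisord log file; the container runtime collects output instead.
logfile=/dev/null
logfile_maxbytes=0
pidfile=/tmp/supervisord.pid

[program:caddy]
command=caddy run --config /etc/caddy/Caddyfile --adapter caddyfile
; Write the program's stdout to supervisord's own stdout.
stdout_logfile=/dev/fd/1
; Rotation must be disabled for non-seekable targets such as /dev/fd/1.
stdout_logfile_maxbytes=0
; Fold the program's stderr into the same stream.
redirect_stderr=true
EOF

CMD ["supervisord", "-c", "/etc/supervisord.conf"]
```

With no `[eventlistener:x]` section defined, the `*_events_enabled` options would emit events that nothing consumes, which is why they are left out here.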

@QuintenQVD0
Contributor

You have my go-ahead.

@alexevladgabriel
Member

@PseudoResonance, Thank you for the amazing work you have done. I have been watching over the days and am happy to see the community's contributions.

@alexevladgabriel alexevladgabriel merged commit 6a49632 into pelican-dev:main Jan 23, 2025
15 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jan 23, 2025
@PseudoResonance PseudoResonance deleted the rootless-docker branch January 23, 2025 11:33
4 participants