Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added docker-multi-stage builds #10832

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rudiservo
Copy link

Added multi-stage dockerfile builds, improved total build time to under 2 hours, added Vulkan and Full-intel.

Updated rocm dockerfile.

Hopefully it will all work without any problems.

@rudiservo rudiservo requested a review from ngxson as a code owner December 14, 2024 20:39
@github-actions github-actions bot added the devops improvements to build systems and github actions label Dec 14, 2024
Copy link
Collaborator

@ngxson ngxson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also fix coding style as reported by editorconfig workflow

find build -name "*.so" -exec cp {} /app/lib \;


FROM ${BASE_CUDA_DEV_CONTAINER} AS full
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's safe to switch the image ro runtime here, as we don't gonna build anything from this point on

Suggested change
FROM ${BASE_CUDA_DEV_CONTAINER} AS full
FROM ${BASE_CUDA_RUN_CONTAINER} AS full

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if that is so, it is safe to not install cmake or any other build packages?
This will make the image and build time smaller by a tiny bit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the runtime doesn't need to have cmake and build-essentials

find build -name "*.so" -exec cp {} /app/lib \;


FROM ${BASE_MUSA_DEV_CONTAINER} AS full
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

uses: docker/build-push-action@v6
with:
context: .
push: true
platforms: ${{ matrix.config.platforms }}
# tag list is generated from step above
tags: ${{ steps.tag.outputs.output_tags }}
tags: full-${{ steps.tag.outputs.output_tags }}
Copy link
Collaborator

@ngxson ngxson Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the old full tag become full-cpu? If yes, this will be a breaking change for downstream projects, I don't think we should let that happen, because the full tag will not be deleted people using it in an automated way won't receive any updates

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue is that "full" only tag came without any specification, and we have many types of builds for given targets, the possible workaround will be a hack but I think it's doable.
We have the cuda, rocm, intel, vulkan but cpu has no spec, it was just full, when you do multi-stage with the current tag system it is a bit tricky, but I will give it a try.

Copy link
Collaborator

@ngxson ngxson Dec 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should keep the tag as-is, because many people depends on that and we currently don't have a way to communicate with them about breaking changes in tag name.

changing full, light and especially server to something else will make someone already using it never receive updates in the future. and worst, because there is no error or warning when they run it, they wouldn't know there is a breaking change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngxson the tags are as-is, just a little issue of caching layers, it's going to be a few more pushes, sorry.

@rudiservo rudiservo force-pushed the docker-multi-stage branch 10 times, most recently from bfdf494 to 72d0847 Compare December 15, 2024 02:31
@rudiservo
Copy link
Author

@ngxson I am really sorry about the spam, starting to use act.
Just a slight issue with caching, will report when fixed.

@rudiservo
Copy link
Author

rudiservo commented Dec 15, 2024

@ngxson ok figured out the cache, it will be using github cache, still a bit experimental but it works best at this point in time.

In the dockerfiles there is a

cp build/bin/* .

can this be changed to a mv? It would same some space.

Or I can just copy this from the cpu dockerfile

COPY --from=build /app/build/bin/ /app/
COPY --from=build /applib/ /app/
COPY --from=build /app/convert_hf_to_gguf.py /app/
COPY --from=build /app/gguf-py /app/gguf-py

instead of copying the full /app in the full versions.

libcurl4-openssl-dev \
libgomp1

COPY requirements.txt requirements.txt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
COPY requirements.txt requirements.txt
COPY requirements.txt .

echo "output_tags=$TAGS" >> $GITHUB_OUTPUT
echo "output_tags=$TAGS" # print out for debugging
if [[ "${{ matrix.config.tag }}" == "cpu" ]]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, can we do this a bit more simple by using sed to remove -cpu from tag name? so we don't have to specify FULLTAGS, LIGHTTAGS, SERVERTAGS

Another approach (with less code) could be:

BASE_IMAGE="ghcr.io/${REPO_OWNER}/${REPO_NAME}"
FULLTAGS="${BASE_IMAGE}:full,${BASE_IMAGE}:full-cpu"
SERVERTAGS="${BASE_IMAGE}:server,${BASE_IMAGE}:server-cpu"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could try something with sed, like:

echo '${{ matrix.config.tag }}' | sed 's/-cpu//g'

It would remove the if statement, but in my opinion the if statement is more "readable" or expressive at first glance.

I can simplify this, still I would need at least 3 vars to make this work and keep the project tags has they are.
The PREFIX, POSTFIX and one extra TYPE for the "matrix tag"

if [[ "${{ matrix.config.tag }}" == "cpu" ]]; then
    TYPE=""
else
    TYPE="-${{ matrix.config.tag }}"
fi
PREFIX="ghcr.io/${REPO_OWNER}/${REPO_NAME}:"
POSTFIX="-${TAG_POSTFIX}"

And

echo "prefix=$PREFIX" >> $GITHUB_OUTPUT
echo "type=$TYPE" >> $GITHUB_OUTPUT
echo "postfix=$POSTFIX" >> $GITHUB_OUTPUT

with in each build-push-action for the there it would be

tags: ${{ steps.tag.outputs.prefix }}full${{ steps.tag.outputs.type }},${{ steps.tag.outputs.output_tags }}full${{ steps.tag.outputs.type }}${{ steps.tag.outputs.postfix }}

Or

FULLTAGS="${PREFIX}full${TYPE},${PREFIX}full${TYPE}${POSTFIX}"
LIGHTTAGS="${PREFIX}light${TYPE},${PREFIX}light${TYPEl${POSTFIX}"
SERVERTAGS="${PREFIX}server${TYPE},${PREFIX}server${TYPE}${POSTFIX}"
echo "full_output_tags=$FULLTAGS" >> $GITHUB_OUTPUT
echo "light_output_tags=$LIGHTTAGS" >> $GITHUB_OUTPUT
echo "server_output_tags=$SERVERTAGS" >> $GITHUB_OUTPUT

What do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed new code, normalized all docker images, hopefully optimized docker layers.

@ngxson
Copy link
Collaborator

ngxson commented Dec 15, 2024

In the dockerfiles there is a

cp build/bin/* .

can this be changed to a mv? It would same some space.

Yes I think we can, I don't see any problem with this as we never run cmake after that (cc @slaren too, in case I'm missing something)

I'll test on cpu + CUDA in the next few days when I come back from vacation

@slaren
Copy link
Collaborator

slaren commented Dec 15, 2024

Yes, I don't see why the cp couldn't be replaced with a mv. Same for the find -exec cp that is used to copy the .so files. Although all of this should probably be replaced with cmake --install in the future.

@rudiservo
Copy link
Author

rudiservo commented Dec 15, 2024

@slaren agree, but I would be happy to know what do I need to copy for the full image to work instead of the complete /app folder just to try and keep these images has small has possible for now.

From what I can tell it's the python scripts, the .so and the build/bin.
I will make the appropriate changes if that is ok?

@slaren
Copy link
Collaborator

slaren commented Dec 16, 2024

I already tried to copy as little as possible for the full image in the full.Dockerfile image. The same should work for the other backend images.

@rudiservo rudiservo force-pushed the docker-multi-stage branch 8 times, most recently from 735b9b0 to bf1caab Compare December 17, 2024 00:06
@rudiservo
Copy link
Author

I normalized all docker images, improved the code a bit, I think everything is ok.
Total build time for 15 images is now at ~1h50.

@rudiservo
Copy link
Author

It's pushing multiple untaged images, checking.

@rudiservo
Copy link
Author

Fixed, added provenance to build-and-push.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devops improvements to build systems and github actions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants