ci: separate distil-large-v3-en Docker image build
- Add a new GitHub Actions job to build and push the `distil-large-v3-en` Docker image.
- Remove the `distil-large-v3` model from the build matrix.
- Rename and update the test job for the large-v3-zh Docker image.
- Update the README to reflect changes in image tags and mention the addition of the `distil-large-v3-en` model.

Signed-off-by: CHEN, CHUN <[email protected]>
jim60105 committed Jan 7, 2025
1 parent f6662ed commit 53c57b0
Showing 2 changed files with 58 additions and 6 deletions.
59 changes: 54 additions & 5 deletions .github/workflows/docker_publish.yml
@@ -202,6 +202,56 @@ jobs:
           sbom: true
           provenance: true

+  # Build the distil-large-v3-en model (distil model seems to support only English)
+  docker-distil-large-v3-en:
+    # The type of runner that the job will run on
+    runs-on: ubuntu-latest
+    needs:
+      - docker-no_model # wait for docker-no_model to finish
+      - docker-cache # wait for docker-cache to finish
+
+    # Steps represent a sequence of tasks that will be executed as part of the job
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          submodules: true
+
+      - name: Setup docker
+        id: setup
+        uses: ./.github/workflows/docker-reused-steps
+        with:
+          tag: distil-large-v3-en
+
+      - name: Get short SHA
+        id: get-sha
+        run: |
+          id=$(echo ${{ github.sha }} | cut -c 1-7)
+          echo "id=$id" >> $GITHUB_OUTPUT
+
+      - name: Build and push:distil-large-v3-en
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          file: ./Dockerfile
+          target: final
+          push: true
+          tags: ${{ steps.setup.outputs.tags }}
+          labels: ${{ steps.setup.outputs.labels }}
+          build-args: |
+            WHISPER_MODEL=distil-large-v3
+            LANG=en
+            LOAD_WHISPER_STAGE=ghcr.io/jim60105/whisperx:cache-distil-large-v3-${{ steps.get-sha.outputs.id }}
+            NO_MODEL_STAGE=ghcr.io/jim60105/whisperx:no_model@${{ needs.docker-no_model.outputs.digest }}
+            VERSION=${{ github.ref_name }}
+            RELEASE=${{ github.run_number }}
+          platforms: linux/amd64, linux/arm64
+          # Cache to registry instead of gha to avoid the capacity limit.
+          cache-from: type=registry,ref=ghcr.io/${{ github.repository_owner }}/whisperx:cache
+          cache-to: type=registry,ref=ghcr.io/${{ github.repository_owner }}/whisperx:cache,mode=max
+          sbom: true
+          provenance: true
+
   # Run the rest of the builds in parallel
   docker:
     # The type of runner that the job will run on
@@ -256,7 +306,6 @@ jobs:
           - small
           - medium
           - large-v3
-          - distil-large-v3
     needs:
       - docker-no_model # wait for docker-no_model to finish
       - docker-cache # wait for docker-cache to finish
@@ -303,8 +352,8 @@ jobs:
           sbom: true
           provenance: true

-  test-distil-large-v3-zh:
-    name: Test distil-large-v3-zh docker image
+  test-large-v3-zh:
+    name: Test large-v3-zh docker image
     runs-on: ubuntu-latest
     needs: docker
     steps:
@@ -333,9 +382,9 @@ jobs:
           id=$(echo ${{ github.sha }} | cut -c 1-7)
           echo "id=$id" >> $GITHUB_OUTPUT

-      - name: Test distil-large-v3-zh docker image
+      - name: Test large-v3-zh docker image
         run: |
-          docker run --group-add 0 -v ".:/app" ghcr.io/jim60105/whisperx:distil-large-v3-zh-${{ steps.get-sha.outputs.id }} -- --device cpu --compute_type int8 --output_format srt .github/workflows/test/zh.webm;
+          docker run --group-add 0 -v ".:/app" ghcr.io/jim60105/whisperx:large-v3-zh-${{ steps.get-sha.outputs.id }} -- --device cpu --compute_type int8 --output_format srt .github/workflows/test/zh.webm;
           if [ ! -f zh.srt ]; then
             echo "The zh.srt file does not exist"
             exit 1
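
The workflow hunks above derive a seven-character short SHA and append it to the image tag that the test job pulls. A minimal sketch of that composition follows, runnable outside GitHub Actions. The helper names `short_sha` and `image_ref` are hypothetical, and the 40-character SHA is a placeholder rather than the real commit hash; in the workflow the value comes from `${{ github.sha }}`.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the workflow.

# Truncate a full commit SHA to its first seven characters,
# mirroring `cut -c 1-7` in the "Get short SHA" step.
short_sha() {
  printf '%s\n' "$1" | cut -c 1-7
}

# Compose the image reference in the same shape the test job uses:
# ghcr.io/jim60105/whisperx:<model>-<lang>-<short sha>
image_ref() {
  model=$1
  lang=$2
  sha=$3
  printf 'ghcr.io/jim60105/whisperx:%s-%s-%s\n' "$model" "$lang" "$(short_sha "$sha")"
}

# Placeholder SHA (assumption), used only to exercise the helpers.
image_ref large-v3 zh 0123456789abcdef0123456789abcdef01234567
```

Pinning the test to the SHA-suffixed tag means the job exercises the image built from this exact commit rather than whatever a moving tag such as `large-v3-zh` currently points to.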
5 changes: 4 additions & 1 deletion README.md
@@ -42,13 +42,16 @@ docker run --gpus all -it -v ".:/app" ghcr.io/jim60105/whisperx:large-v3-ja -- -
 docker run --gpus all -it -v ".:/app" ghcr.io/jim60105/whisperx:no_model -- --model tiny --language en --output_format srt audio.mp3
 ```

-The image tags are formatted as `WHISPER_MODEL`-`LANG`, for example, `tiny-en`, `base-de`, `large-v3-zh` or `distil-large-v3-ja`.
+The image tags are formatted as `WHISPER_MODEL`-`LANG`, for example, `tiny-en`, `base-de` or `large-v3-zh`.
 Please be aware that the whisper models `*.en`, `large-v1`, `large-v2` have been excluded as I believe they are not frequently used. If you require these models, please refer to the following section to build them on your own.

 You can find the actual build matrix in [docker_publish.yml](.github/workflows/docker_publish.yml#L212) and all available tags at [ghcr.io](https://github.com/jim60105/docker-whisperX/pkgs/container/whisperx/versions?filters%5Bversion_type%5D=tagged).

 In addition, there is also a `no_model` tag that does not include any pre-downloaded models, also referred to as `latest`.

+> Added a `distil-large-v3-en` model.
+> Only en, distil model seems to only support English.
+
 ## ⚡️ Preserve the download cache for the align models when working with various languages

 You can mount the `/.cache` to share align models between containers.
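
The README hunk above describes tags of the form `WHISPER_MODEL`-`LANG` and notes that the distil model is published for English only. As a rough illustration of that rule, the sketch below builds a tag and rejects `distil-large-v3` for any other language. The function name `image_tag` and its error message are assumptions made for this example; the repository does not ship such a helper.

```shell
#!/bin/sh
# Hypothetical helper: build a WHISPER_MODEL-LANG tag. It refuses the
# distil-large-v3 model for languages other than en, because this commit
# publishes that model only as distil-large-v3-en.
image_tag() {
  model=$1
  lang=$2
  if [ "$model" = "distil-large-v3" ] && [ "$lang" != "en" ]; then
    echo "distil-large-v3 is only published for en" >&2
    return 1
  fi
  printf '%s-%s\n' "$model" "$lang"
}

image_tag tiny en
image_tag large-v3 zh
image_tag distil-large-v3 en
image_tag distil-large-v3 ja || echo "no such tag"
```

A resulting tag can then be appended to `ghcr.io/jim60105/whisperx:` in a `docker run` command like the ones shown in the README.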
