ci: separate distil-large-v3-en Docker image build

- Add a new GitHub Actions job to build and push the `distil-large-v3-en` Docker image.
- Remove the `distil-large-v3` model from the build matrix.
- Rename and update the test job for the large-v3-zh Docker image.
- Update the README to reflect changes in image tags and mention the addition of the `distil-large-v3-en` model.

Signed-off-by: CHEN, CHUN <[email protected]>
jim60105 · Jan 7, 2025 · 53c57b0
1 parent f6662ed

Showing 2 changed files with 58 additions and 6 deletions.
diff --git a/.github/workflows/docker_publish.yml b/.github/workflows/docker_publish.yml
@@ -202,6 +202,56 @@ jobs:
           sbom: true
           provenance: true
 
+  # Build the distil-large-v3-en model (distil model seems to support only English)
+  docker-distil-large-v3-en:
+    # The type of runner that the job will run on
+    runs-on: ubuntu-latest
+    needs:
+      - docker-no_model # wait for docker-no_model to finish
+      - docker-cache # wait for docker-cache to finish
+
+    # Steps represent a sequence of tasks that will be executed as part of the job
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          submodules: true
+
+      - name: Setup docker
+        id: setup
+        uses: ./.github/workflows/docker-reused-steps
+        with:
+          tag: distil-large-v3-en
+
+      - name: Get short SHA
+        id: get-sha
+        run: |
+          id=$(echo ${{ github.sha }} | cut -c 1-7)
+          echo "id=$id" >> $GITHUB_OUTPUT
+
+      - name: Build and push:distil-large-v3-en
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          file: ./Dockerfile
+          target: final
+          push: true
+          tags: ${{ steps.setup.outputs.tags }}
+          labels: ${{ steps.setup.outputs.labels }}
+          build-args: |
+            WHISPER_MODEL=distil-large-v3
+            LANG=en
+            LOAD_WHISPER_STAGE=ghcr.io/jim60105/whisperx:cache-distil-large-v3-${{ steps.get-sha.outputs.id }}
+            NO_MODEL_STAGE=ghcr.io/jim60105/whisperx:no_model@${{ needs.docker-no_model.outputs.digest }}
+            VERSION=${{ github.ref_name }}
+            RELEASE=${{ github.run_number }}
+          platforms: linux/amd64, linux/arm64
+          # Cache to registry instead of gha to avoid the capacity limit.
+          cache-from: type=registry,ref=ghcr.io/${{ github.repository_owner }}/whisperx:cache
+          cache-to: type=registry,ref=ghcr.io/${{ github.repository_owner }}/whisperx:cache,mode=max
+          sbom: true
+          provenance: true
+
   # Run the rest of the builds in parallel
   docker:
     # The type of runner that the job will run on
@@ -256,7 +306,6 @@ jobs:
           - small
           - medium
           - large-v3
-          - distil-large-v3
     needs:
       - docker-no_model # wait for docker-no_model to finish
       - docker-cache # wait for docker-cache to finish
@@ -303,8 +352,8 @@ jobs:
           sbom: true
           provenance: true
 
-  test-distil-large-v3-zh:
-    name: Test distil-large-v3-zh docker image
+  test-large-v3-zh:
+    name: Test large-v3-zh docker image
     runs-on: ubuntu-latest
     needs: docker
     steps:
@@ -333,9 +382,9 @@ jobs:
           id=$(echo ${{ github.sha }} | cut -c 1-7)
           echo "id=$id" >> $GITHUB_OUTPUT
 
-      - name: Test distil-large-v3-zh docker image
+      - name: Test large-v3-zh docker image
         run: |
-          docker run --group-add 0 -v ".:/app" ghcr.io/jim60105/whisperx:distil-large-v3-zh-${{ steps.get-sha.outputs.id }} -- --device cpu --compute_type int8 --output_format srt .github/workflows/test/zh.webm;
+          docker run --group-add 0 -v ".:/app" ghcr.io/jim60105/whisperx:large-v3-zh-${{ steps.get-sha.outputs.id }} -- --device cpu --compute_type int8 --output_format srt .github/workflows/test/zh.webm;
           if [ ! -f zh.srt ]; then
             echo "The zh.srt file does not exist"
             exit 1

diff --git a/README.md b/README.md
@@ -42,13 +42,16 @@ docker run --gpus all -it -v ".:/app" ghcr.io/jim60105/whisperx:large-v3-ja -- -
 docker run --gpus all -it -v ".:/app" ghcr.io/jim60105/whisperx:no_model    -- --model tiny --language en --output_format srt audio.mp3
 ```
 
-The image tags are formatted as `WHISPER_MODEL`-`LANG`, for example, `tiny-en`, `base-de`, `large-v3-zh` or `distil-large-v3-ja`.  
+The image tags are formatted as `WHISPER_MODEL`-`LANG`, for example, `tiny-en`, `base-de` or `large-v3-zh`.  
 Please be aware that the whisper models `*.en`,  `large-v1`, `large-v2` have been excluded as I believe they are not frequently used. If you require these models, please refer to the following section to build them on your own.
 
 You can find the actual build matrix in [docker_publish.yml](.github/workflows/docker_publish.yml#L212) and all available tags at [ghcr.io](https://github.com/jim60105/docker-whisperX/pkgs/container/whisperx/versions?filters%5Bversion_type%5D=tagged).
 
 In addition, there is also a `no_model` tag that does not include any pre-downloaded models, also referred to as `latest`.
 
+> Added a `distil-large-v3-en` model.  
+> Only en, distil model seems to only support English.
+
 ## ⚡️ Preserve the download cache for the align models when working with various languages
 
 You can mount the `/.cache` to share align models between containers.
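
The tag naming exercised by the test job in this commit can be sketched in shell. This is only an illustration under stated assumptions: the `image_tag` helper and the sample SHA are hypothetical and do not appear in the repository, while the `cut -c 1-7` shortening and the `WHISPER_MODEL`-`LANG` tag format are taken from the diff above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): compose an image reference
# the same way the test job does, as <model>-<lang>-<short sha>.
# Usage: image_tag <model> <lang> <full-sha>
image_tag() {
  model=$1
  lang=$2
  # Same shortening as the "Get short SHA" step: keep the first 7 characters.
  short=$(echo "$3" | cut -c 1-7)
  echo "ghcr.io/jim60105/whisperx:${model}-${lang}-${short}"
}

# The SHA below is made up for illustration.
image_tag large-v3 zh 0123456789abcdef0123456789abcdef01234567
```

The helper only builds the reference string; whether a given tag actually exists still depends on the build matrix and on the separate `distil-large-v3-en` job.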