Upgrade to v1.2 #5

Merged
merged 30 commits (Apr 19, 2024)
b7be6c0
feat: add cuda all image to facilitate deployment (#186)
OlivierDehaene Mar 5, 2024
9ab2f2c
feat: add splade pooling to Bert (#187)
OlivierDehaene Mar 6, 2024
ec04b9d
feat: support vertex api endpoint (#184)
drbh Mar 6, 2024
e7ae777
docs: readme examples (#180)
plaggy Mar 7, 2024
2b8ad5f
fix: add_pooling_layer for bert classification (#190)
OlivierDehaene Mar 7, 2024
7efa697
feat: add /embed_sparse route (#191)
OlivierDehaene Mar 7, 2024
2d1776f
docs: add http feature
OlivierDehaene Mar 12, 2024
0b40ade
Applying `Cargo.toml` optimization options (#201)
somehowchris Mar 18, 2024
6fa3c6a
feat: Add Dockerfile-arm64 to allow docker builds on Apple M1/M2 arc…
iandoe Mar 21, 2024
1d6f288
feat: configurable payload limit (#210)
OlivierDehaene Mar 21, 2024
5e60d06
feat: add api_key for request authorization (#211)
OlivierDehaene Mar 21, 2024
a57cf61
feat: add all methods to vertex API (#192)
OlivierDehaene Mar 21, 2024
90ea664
feat: add `/decode` route (#212)
OlivierDehaene Mar 22, 2024
a1dd76d
Input Types Compatibility with OpenAI's API (#112) (#214)
OlivierDehaene Mar 22, 2024
3edace2
v1.2.0 (#215)
OlivierDehaene Mar 22, 2024
53e28e0
Document how to send batched inputs (#222)
osanseviero Apr 2, 2024
68d63ed
feat: add auto-truncate arg (#224)
OlivierDehaene Apr 2, 2024
a556f43
feat: add PredictPair to proto (#225)
OlivierDehaene Apr 2, 2024
eef2912
fix: fix auto_truncate for openai (#228)
OlivierDehaene Apr 4, 2024
3c385a4
Change license to Apache 2.0 (#231)
OlivierDehaene Apr 8, 2024
432448c
feat: Amazon SageMaker compatible images (#103)
JGalego Apr 11, 2024
cb802a2
fix(CI): fix build all (#236)
OlivierDehaene Apr 11, 2024
0b07f9b
fix: fix cuda-all image (#239)
OlivierDehaene Apr 15, 2024
1477844
Add SageMaker CPU images and validate (#240)
philschmid Apr 15, 2024
8927093
v1.2.1
OlivierDehaene Apr 15, 2024
1108a10
fix(gke): accept null values for vertex env vars (#243)
OlivierDehaene Apr 16, 2024
22f6fd7
v1.2.2
OlivierDehaene Apr 16, 2024
d221b99
hotfix v1.2.2
OlivierDehaene Apr 16, 2024
1ba0379
Merge branch 'main' into v1.2-release
regisss Apr 18, 2024
2d0fe35
Update dependencies
regisss Apr 18, 2024
12 changes: 0 additions & 12 deletions .github/workflows/build_75.yaml
@@ -7,18 +7,6 @@
       - 'main'
     tags:
       - 'v*'
-  pull_request:
-    paths:
-      - ".github/workflows/build_75.yaml"
-      # - "integration-tests/**"
-      - "backends/**"
-      - "core/**"
-      - "router/**"
-      - "Cargo.lock"
-      - "rust-toolchain.toml"
-      - "Dockerfile"
-    branches:
-      - 'main'
 
 jobs:
   build-and-push-image:
12 changes: 0 additions & 12 deletions .github/workflows/build_86.yaml
@@ -7,18 +7,6 @@
       - 'main'
     tags:
      - 'v*'
-  pull_request:
-    paths:
-      - ".github/workflows/build.yaml"
-      # - "integration-tests/**"
-      - "backends/**"
-      - "core/**"
-      - "router/**"
-      - "Cargo.lock"
-      - "rust-toolchain.toml"
-      - "Dockerfile"
-    branches:
-      - 'main'
 
 jobs:
   build-and-push-image:
12 changes: 0 additions & 12 deletions .github/workflows/build_89.yaml
@@ -7,18 +7,6 @@
       - 'main'
     tags:
       - 'v*'
-  pull_request:
-    paths:
-      - ".github/workflows/build.yaml"
-      # - "integration-tests/**"
-      - "backends/**"
-      - "core/**"
-      - "router/**"
-      - "Cargo.lock"
-      - "rust-toolchain.toml"
-      - "Dockerfile"
-    branches:
-      - 'main'
 
 jobs:
   build-and-push-image:
12 changes: 0 additions & 12 deletions .github/workflows/build_90.yaml
@@ -7,18 +7,6 @@
       - 'main'
     tags:
       - 'v*'
-  pull_request:
-    paths:
-      - ".github/workflows/build.yaml"
-      # - "integration-tests/**"
-      - "backends/**"
-      - "core/**"
-      - "router/**"
-      - "Cargo.lock"
-      - "rust-toolchain.toml"
-      - "Dockerfile"
-    branches:
-      - 'main'
 
 jobs:
   build-and-push-image:
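For clarity, a minimal sketch of what the trigger block in these per-architecture CUDA workflows looks like once the `pull_request` section is removed. This is an illustration, not part of the diff: it assumes the `push:` block sits just above the hunks shown, as is conventional for these workflows. Image builds now run only on pushes to `main` and on version tags, rather than on every pull request.

```yaml
# Trigger section after this change (illustrative reconstruction)
on:
  push:
    branches:
      - 'main'
    tags:
      - 'v*'
```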
105 changes: 105 additions & 0 deletions .github/workflows/build_all.yaml
@@ -0,0 +1,105 @@
name: Build and push Cuda docker image to registry

on:
  workflow_dispatch:
  push:
    tags:
      - 'v*'

jobs:
  build-and-push-image:
    concurrency:
      group: ${{ github.workflow }}-${{ github.job }}-all-${{ github.head_ref || github.run_id }}
      cancel-in-progress: true
    runs-on: [self-hosted, intel-cpu, 32-cpu, tgi-ci]
    permissions:
      contents: write
      packages: write
      # This is used to complete the identity challenge
      # with sigstore/fulcio when running outside of PRs.
      id-token: write
      security-events: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Initialize Docker Buildx
        uses: docker/[email protected]
        with:
          install: true
      - name: Inject slug/short variables
        uses: rlespinasse/[email protected]
      - name: Tailscale
        uses: tailscale/github-action@7bd8039bf25c23c4ab1b8d6e2cc2da2280601966
        with:
          authkey: ${{ secrets.TAILSCALE_AUTHKEY }}
      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Login to internal Container Registry
        uses: docker/[email protected]
        with:
          username: ${{ secrets.TAILSCALE_DOCKER_USERNAME }}
          password: ${{ secrets.TAILSCALE_DOCKER_PASSWORD }}
          registry: registry.internal.huggingface.tech
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/[email protected]
        with:
          images: |
            registry.internal.huggingface.tech/api-inference/text-embeddings-inference
            ghcr.io/huggingface/text-embeddings-inference
          flavor: |
            latest=false
          tags: |
            type=semver,pattern=cuda-{{version}}
            type=semver,pattern=cuda-{{major}}.{{minor}}
            type=raw,value=cuda-latest,enable=${{ github.ref == format('refs/heads/{0}', github.event.repository.default_branch) }}
            type=raw,value=cuda-sha-${{ env.GITHUB_SHA_SHORT }}
      - name: Build and push Docker image
        id: build-and-push
        uses: docker/build-push-action@v4
        with:
          context: .
          file: Dockerfile-cuda-all
          push: ${{ github.event_name != 'pull_request' }}
          platforms: 'linux/amd64'
          build-args: |
            GIT_SHA=${{ env.GITHUB_SHA }}
            DOCKER_LABEL=sha-${{ env.GITHUB_SHA_SHORT }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=registry,ref=registry.internal.huggingface.tech/api-inference/text-embeddings-inference:cache-all,mode=max
          cache-to: type=registry,ref=registry.internal.huggingface.tech/api-inference/text-embeddings-inference:cache-all,mode=max
      - name: Extract metadata (tags, labels) for Docker
        id: meta-sagemaker
        uses: docker/[email protected]
        with:
          images: |
            registry.internal.huggingface.tech/api-inference/text-embeddings-inference/sagemaker
          flavor: |
            latest=false
          tags: |
            type=semver,pattern=cuda-{{version}}
            type=semver,pattern=cuda-{{major}}.{{minor}}
            type=raw,value=cuda-latest,enable=${{ github.ref == format('refs/heads/{0}', github.event.repository.default_branch) }}
            type=raw,value=cuda-sha-${{ env.GITHUB_SHA_SHORT }}
      - name: Build and push Docker image
        id: build-and-push-sagemaker
        uses: docker/build-push-action@v4
        with:
          context: .
          file: Dockerfile-cuda-all
          push: ${{ github.event_name != 'pull_request' }}
          platforms: 'linux/amd64'
          target: sagemaker
          build-args: |
            GIT_SHA=${{ env.GITHUB_SHA }}
            DOCKER_LABEL=sha-${{ env.GITHUB_SHA_SHORT }}
          tags: ${{ steps.meta-sagemaker.outputs.tags }}
          labels: ${{ steps.meta-sagemaker.outputs.labels }}
          cache-from: type=registry,ref=registry.internal.huggingface.tech/api-inference/text-embeddings-inference:cache-all,mode=max
          cache-to: type=registry,ref=registry.internal.huggingface.tech/api-inference/text-embeddings-inference:cache-all,mode=max
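As a rough illustration of how the `docker/metadata-action` tag patterns in this workflow behave (not part of the diff, and assuming a hypothetical push of tag `v1.2.0` on a non-default ref), the patterns would resolve along these lines:

```yaml
# Hypothetical resolution of the tag patterns for a push of tag v1.2.0
tags: |
  type=semver,pattern=cuda-{{version}}          # -> cuda-1.2.0
  type=semver,pattern=cuda-{{major}}.{{minor}}  # -> cuda-1.2
  type=raw,value=cuda-latest,enable=...         # only when building the default branch
  type=raw,value=cuda-sha-${{ env.GITHUB_SHA_SHORT }}  # -> cuda-sha-<short commit sha>
```

The same image is thus pullable by exact version, by minor-version track, or by commit, which is a common scheme for letting consumers pin as tightly or loosely as they need.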