
Commit dc9ee05: Apply suggestions from code review

Co-authored-by: Ryan McCormick <[email protected]>
oandreeva-nv and rmccorm4 authored Oct 25, 2024
1 parent 70cac18 commit dc9ee05
Showing 1 changed file with 5 additions and 5 deletions.
Conceptual_Guide/Part_8-semantic_caching/README.md (10 changes: 5 additions & 5 deletions)
@@ -87,7 +87,7 @@ This approach offers several benefits including, but not limited to:
## Sample Reference Implementation

In this tutorial we provide a reference implementation for a Semantic Cache in
-[semantic_caching.py.](./artifacts/semantic_caching.py) There are 3 key
+[semantic_caching.py](./artifacts/semantic_caching.py). There are 3 key
dependencies:
* [SentenceTransformer](https://sbert.net/): a Python framework for computing
dense vector representations (embeddings) of sentences, paragraphs, and images.
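
For illustration only (this is not part of the patch), a minimal sketch of computing a prompt embedding with SentenceTransformer; the model name `all-MiniLM-L6-v2` is an assumed example, not necessarily the one the tutorial uses:

```python
# Embedding sketch (illustrative only): SentenceTransformer turns a prompt
# into a dense vector that can later be compared against cached prompts.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed example model
embedding = model.encode("What is the capital of France?")
print(embedding.shape)  # (384,) for this particular model
```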
@@ -104,7 +104,7 @@ clustering of dense vectors.
algorithms.
- Alternatives include [annoy](https://github.com/spotify/annoy), or
[cuVS](https://github.com/rapidsai/cuvs). However, note that cuVS already
-has an integration in Faiss, more on this can be found [here.](https://docs.rapids.ai/api/cuvs/nightly/integrations/faiss/)
+has an integration in Faiss, more on this can be found [here](https://docs.rapids.ai/api/cuvs/nightly/integrations/faiss/).
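
Also for illustration only, a hedged sketch of the lookup pattern these libraries enable: prompt embeddings live in a Faiss index, and a nearest neighbor within a distance threshold counts as a semantic cache hit. The names, model, and threshold below are assumptions and do not reproduce the tutorial's actual implementation:

```python
# Semantic-lookup sketch (illustrative only): nearest-neighbor search over
# prompt embeddings with a distance threshold deciding hit vs. miss.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed example model
dim = model.get_sentence_embedding_dimension()
index = faiss.IndexFlatL2(dim)                   # exact (brute-force) L2 search

# Pretend this prompt already has a cached response.
cached_prompts = ["What is the capital of France?"]
index.add(np.asarray(model.encode(cached_prompts), dtype="float32"))

query = np.asarray(model.encode(["Tell me France's capital"]), dtype="float32")
distances, ids = index.search(query, 1)

THRESHOLD = 0.3  # assumed value; tune per embedding model and distance metric
if ids[0][0] != -1 and distances[0][0] < THRESHOLD:
    print("semantic cache hit:", cached_prompts[ids[0][0]])
else:
    print("cache miss: run full inference, then add the new entry to the index")
```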
* [Theine](https://github.com/Yiling-J/theine): High performance in-memory
cache.
- We will use it as our exact match cache backend. After the most similar
@@ -151,15 +151,15 @@ section. However, for those interested in understanding the specifics,
let's explore what this patch includes.

The patch introduces a new script,
-[semantic_caching.py.](./artifacts/semantic_caching.py), which is added to the
+[semantic_caching.py](./artifacts/semantic_caching.py), which is added to the
appropriate directory. This script implements the core logic for our
semantic caching functionality.

Next, the patch integrates semantic caching into the model. Let's walk through
these changes step-by-step.

Firstly, it imports the necessary classes from
-[semantic_caching.py.](./artifacts/semantic_caching.py) into the codebase:
+[semantic_caching.py](./artifacts/semantic_caching.py) into the codebase:

```diff
...
```

@@ -353,7 +353,7 @@ supported feature in Triton Inference Server.

We value your input! If you're interested in seeing semantic caching as a
supported feature in future releases, we invite you to join the ongoing
-[discussion.](https://github.com/triton-inference-server/server/discussions/7742)
+[discussion](https://github.com/triton-inference-server/server/discussions/7742).
Provide details about why you think semantic caching would
be valuable for your use case. Your feedback helps shape our product roadmap,
and we appreciate your contributions to making our software better for everyone.
