Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Embeddings service to the AI integration packages #4368

Merged
merged 11 commits into from
Feb 2, 2025
Merged

Conversation

IMax153
Copy link
Member

@IMax153 IMax153 commented Jan 30, 2025

Type

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

Related

  • Related Issue #
  • Closes #

@IMax153 IMax153 requested a review from tim-smart as a code owner January 30, 2025 18:48
Copy link

changeset-bot bot commented Jan 30, 2025

🦋 Changeset detected

Latest commit: 9c21de3

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@effect/ai-openai Patch
@effect/ai Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link
Contributor

@tim-smart tim-smart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to yield here

}).pipe(Layer.provide(NodeHttpClient.layerUndici))

// Create an embeddings service for the `text-embedding-3-large` model
const TextEmbeddingsLarge = OpenAiEmbeddings.layerDataLoader({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will it batch if I use

(documents: Array<string>) => Effect.forEach(documents, embedding.embed, { concurrency: "unbounded" })

and OpenAiEmbeddings.layer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would need to add batching true to the forEach options as under the hood it's using RequestResolver.batched.

Unless you use the data loader variant, in which case it will always batch as much as it can during the specified window (up to the max batch size).

}),
Effect.map((response) =>
response.data.map(({ embedding, index }) => ({
embeddings: embedding as Array<number>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we define EmbeddingVector data type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a real reason to do so unless I'm missing something?

I'm just casting from a read only array to a mutable one here.

@IMax153 IMax153 requested a review from tim-smart January 31, 2025 12:55
@IMax153 IMax153 force-pushed the feat/embeddings branch 2 times, most recently from bf8ee12 to d8ef64b Compare January 31, 2025 15:38
@IMax153 IMax153 dismissed tim-smart’s stale review February 2, 2025 12:48

Discussed offline and approved

@IMax153 IMax153 merged commit a0c85e6 into main Feb 2, 2025
12 checks passed
@IMax153 IMax153 deleted the feat/embeddings branch February 2, 2025 12:52
@github-actions github-actions bot mentioned this pull request Feb 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants