fix(mongodb): remove embeddings from `top_n` lookup #115 (Merged)
cvauclair requested changes on Nov 19, 2024
cvauclair approved these changes on Nov 22, 2024
mateobelanger added a commit that referenced this pull request on Dec 2, 2024
* fix: exclude embedding properties from top_n node query * refactor: more ergonomic index creation * docs(neo4j): update examples * fix: unused import in example * feat(provider): xAI (grok) integration (#106) * feat(xai): initial xai (grok) implementation * fix(xai): renamings + tests * style(xai): Update rig-core/src/providers/xai/client.rs Co-authored-by: Mathieu Bélanger <[email protected]> * style(xai): adds various comments and README improvements * fix(xai): add some print statements to the grok example * docs(xai): fix readme --------- Co-authored-by: Mathieu Bélanger <[email protected]> * fix(rig-mongodb): remove embeddings from `top_n` lookup (#115) * fix(mongodb): remove embeddings from `top_n` lookup * fix(mongodb): filter embeddings within agg pipeline * style(mongodb): clippy moment * fix(mongodb): dynamically get embedded fields from mongodb * fix(mongodb): apply fixes from comments * style(mongodb): fmt * docs(readme): add perplexity logo to integrations (#112) * docs(readme): add perplexity logo to integrations * fix: perplexity logo size * fix(readme): perplexity logo size * feat: embeddings API overhaul (#120) * feat: setup derive macro * test: test out writing embeddable macro * test: continue testing custom macro implementation * feat: macro generate trait bounds * refactor: split up macro into multiple files * refactor: move macro derive crate inside rig-core * feat: replace embedding logic with new embeddable trait and macro * refactor: refactor rag examples, delete document embedding struct * feat: remove document embedding from in memory store * refactor: remove DocumentEmbeddings from in memory vector store * refactor(examples): combine vector store with vector store index * docs: add and update docstrings * fix (examples): fix bugs in examples * style: cargo fmt * revert: revert vector store to main * docs: update emebddings builder docstrings * refactor: derive macro * tests: add unit tests on in memory store * fic(ci): asterix on pull 
request sto accomodate for epic branches * fix(ci): double asterix * feat: add error type on embeddable trait * refactor: move embeddings to its own module and seperate embeddable * refactor: split up macro into more files, fix all imports * fix: revert logging change * feat: handle tools with embeddingsbuilder * bug(macro): fix error when embed tags missing * style: cargo fmt * fix(tests): clippy * docs&revert: revert embeddable trait error type, add docstrings * style: cargo clippy * clippy(lancedb): fix unused function error * fix(test): remove useless assert false statement * cleanup: split up branch into 2 branches for readability * cleanup: revert certain changes during branch split * docs: revert doc string * fix: add embedding_docs to embeddable tool * refactor: use OneOrMany in Embbedable trait, make derive macro crate feature flag * tests: add some more tests * clippy: cargo clippy * docs: add docstring to oneormany * fix(macro): update error handling * refactor: reexport EmbeddingsBuilder in rig and update imports * feat: implement IntoIterator and Iterator for OneOrMany * refactor: rename from methods * tests: fix failing tests * refactor&fix: make PR review changes * fix: fix tests failing * test: add test on OneOrMany * style: cargo fmt * docs&fix: fix doc strings, implement iter_mut for OneOrMany * fix: update borrow and owning of macro * clippy: add back print statements * fix: fix issues caused by merge of derive macro branch * fix: fix cargo toml of lancedb and mongodb * refactor: use thiserror for OneOtMany::EmptyListError * feat: add OneOrMany to in memory vector store * style: cargo fmt * fix: update embeddingsbuilder import path * tests: add tests for embeddingsbuilder * clippy: add is empty method * fix: add feature flag to examples in mongodb and lancedb crates * fix: move lancedb fixtures into it's own file * fix: add dummy main function in fextures.rs for compiler * fix: revert fixture file, remove fixtures from cargo toml examples * fix: 
update fixture import in lancedb examples * refactor: rename D to T in embeddingsbuilder generics * refactor: remove clone * PR: update builder, docstrings, and std::markers tags * style: replace add with push * fix: fix mongodb example * fix: update lancedb and mongodb doc example * fix: typo * docs: add and fix docstrings and examples * docs: add more doc tests * feat: rename Embeddable trait to ExtractEmbeddingFields * feat: rename macro files, cargo fmt * PR; update docstrings, update `add_documents_with_id` function * doc: fix doc linting * misc: fmt * test: fix test * refactor(embeddings): embed trait definition (#89) * refactor: Big refactor * refactor: refactor Embed trait, fix all imports, rename files, fix macro * fix(embed trait): fix errors while testing * fix(lancedb): examples * docs: fix hyperlink * fmt: cargo fmt * PR; make requested changes * fix: change visibility of struct field * fix: failing tests --------- Co-authored-by: Christophe <[email protected]> * fix/docs: fix erros from merge, cleanup embeddings docstrings * fix: cargo clippy in examples * Feat: small improvements + fixes + tests (#128) * docs: Make examples+docstrings a bit more realistic * feat: Add Embed implementation for &impl Embed * test: Reorganize tests * misc: Add `derive` feature to `all` feature flag * test: Fix dead code warning * test: Improve embed macro tests * test: Add additional embed macro test * docs: Add logging output to rag example * docs: Fix looging output in tools example * feat: Improve token usage log messages * test: Small changes to embedbing builder tests * style: cargo fmt * fix: Clippy + docstrings * docs: Fix docstring * test: Fix test * style: Small renaming for consistency * docs: Improve docstrings * style: fmt * fix: `TextEmbedder::embed` visibility * docs: Simplified the `EmbeddingsBuilder` docstring example to focus on the builder * style: cargo fmt * docs: Small edit to lancedb examples --------- Co-authored-by: cvauclair <[email protected]> * 
misc: Add `rig-derive` missing manifest fields (#129) * feat: Improve `InMemoryVectorStore` API (#130) * feat: Improve `InMemoryVectorStore` API * style: clippy+fmt * test: fix test * fix: remove unused module (#132) * fix: exclude embedding properties from top_n node query * refactor: more ergonomic index creation * docs(neo4j): update examples * fix: unused import in example * fix(example): remove embedding field from Deserialization type --------- Co-authored-by: Mochan <[email protected]> Co-authored-by: Garance Buricatu <[email protected]> Co-authored-by: cvauclair <[email protected]>
Eliminate the embeddings field from the `top_n` lookup results to streamline the response structure. Also adds a `DocumentResponse` type that doesn't include the embeddings field, to be used with `top_n` responses.

Additionally, this PR un-hardcodes `embeddings.vec` and dynamically looks up the embedded field from the vector search index (which also properly confirms, during construction, whether it exists).

Implementation Update: We make a call to MongoDB to check all search indexes. We currently handle only one index with one indexed embedded field, but this could be extended in the future if needed.
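For readers who want a concrete picture of the lookup described above, here is a minimal, std-only sketch. It is not the code from this PR: `SearchIndex`, `IndexField`, `IndexError`, `embedded_field_path`, and `without_embedding` are hypothetical names, the index definition is modelled as plain structs rather than the documents returned by MongoDB, and the result document is a flat map keyed by dotted path. The sketch assumes a vector search index lists its fields with a type and a path, and it enforces the "one index, one embedded field" limitation mentioned above.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of one field in a search index definition.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexField {
    pub kind: String, // assumed values: "vector" or "filter"
    pub path: String, // e.g. "embeddings.vec"
}

/// Hypothetical, simplified view of a search index as listed by the database.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchIndex {
    pub name: String,
    pub fields: Vec<IndexField>,
}

/// Errors the lookup can report at construction time.
#[derive(Debug, PartialEq)]
pub enum IndexError {
    IndexNotFound(String),
    NoVectorField(String),
    MultipleVectorFields(String),
}

/// Find the single embedded (vector) field path of the named index.
/// Fails if the index is missing or does not have exactly one vector field.
pub fn embedded_field_path(
    indexes: &[SearchIndex],
    index_name: &str,
) -> Result<String, IndexError> {
    let index = indexes
        .iter()
        .find(|i| i.name == index_name)
        .ok_or_else(|| IndexError::IndexNotFound(index_name.to_string()))?;

    let mut vector_fields = index.fields.iter().filter(|f| f.kind == "vector");
    match (vector_fields.next(), vector_fields.next()) {
        (Some(field), None) => Ok(field.path.clone()),
        (None, _) => Err(IndexError::NoVectorField(index_name.to_string())),
        (Some(_), Some(_)) => Err(IndexError::MultipleVectorFields(index_name.to_string())),
    }
}

/// Drop the embedding entry from a result document.
/// The document is a flat map keyed by dotted path, purely for illustration.
pub fn without_embedding(
    mut doc: BTreeMap<String, String>,
    embedded_path: &str,
) -> BTreeMap<String, String> {
    doc.remove(embedded_path);
    doc
}
```

In this sketch, a caller would resolve the path once when building the index wrapper, then pass it to `without_embedding` for each `top_n` result. Per the commit list, the PR itself filters embeddings inside the aggregation pipeline; the in-memory removal here only mimics that effect.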