Add vector search documentation #9135
Signed-off-by: Fanit Kolchina <[email protected]>
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.
|:---|:---|:---|:---|:---|
| Max dimensions | 16,000 | 16,000 | 16,000 | 16,000 |
| Filter | Post-filter | Post-filter | Post-filter | Filter during search |
| Training required | No | No | Yes | No |
Faiss HNSW with PQ also requires training
redirect_from:
- /search-plugins/knn/knn-vector-quantization/
outside_cards:
- heading: "Byte vectors"
Need to add a card for binary vectors along with byte vectors
https://opensearch.org/docs/latest/field-types/supported-field-types/knn-vector#binary-vectors
@naveentatikonda Thanks! I addressed both comments. Could you review commit a5e8b8d?
@kolchfa-aws The changes look good. Thanks for making them.
For binary vectors, we also need to add memory estimation. Here is the formula for HNSW:
1.1 * (dimension / 8 + 8 * M) bytes/vector
For IVF, I guess it is:
1.1 * (((dimension / 8) * num_vectors) + (nlist * dimension / 8))
@jmazanec15 Can you please confirm?
Yeah that looks good to me
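For illustration, the two binary-vector formulas above can be evaluated with a short Python sketch. The function names and the sample values (dimension 1024, M = 16, nlist = 128, one million vectors) are hypothetical and chosen only for this example. Scaling the per-vector HNSW figure by the number of vectors is an assumption about how the estimate would be applied, not something stated in this thread.

```python
def hnsw_binary_bytes_per_vector(dimension: int, m: int) -> float:
    """HNSW estimate for binary vectors: 1.1 * (dimension / 8 + 8 * M) bytes/vector."""
    return 1.1 * (dimension / 8 + 8 * m)


def hnsw_binary_total_bytes(dimension: int, m: int, num_vectors: int) -> float:
    """Assumption: total memory is the per-vector estimate times the vector count."""
    return hnsw_binary_bytes_per_vector(dimension, m) * num_vectors


def ivf_binary_total_bytes(dimension: int, num_vectors: int, nlist: int) -> float:
    """IVF estimate: 1.1 * (((dimension / 8) * num_vectors) + (nlist * dimension / 8))."""
    return 1.1 * (((dimension / 8) * num_vectors) + (nlist * dimension / 8))


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the PR.
    dimension, m, nlist, num_vectors = 1024, 16, 128, 1_000_000

    hnsw_total = hnsw_binary_total_bytes(dimension, m, num_vectors)
    ivf_total = ivf_binary_total_bytes(dimension, num_vectors, nlist)

    print(f"HNSW: {hnsw_total / 1e6:.1f} MB")
    print(f"IVF:  {ivf_total / 1e6:.1f} MB")
```

The `dimension / 8` term reflects that a binary vector stores one bit per dimension, so a 1024-dimensional vector occupies 128 bytes before graph or centroid overhead.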
Hi @kolchfa-aws, this looks awesome! In general, I think we should start moving the more low-level/expert details (like quantization and method configuration) out of the vector search section and into the detailed field reference section. Here is some high-level feedback:
In performance tuning, we can mention picking a specific engine or overriding method parameters for expert-level fine-tuning and point to the reference docs.
Quantization is a bit ugly for users to have to understand, so I think it belongs in the detailed field reference. We can say: for further fine-tuning of the quantization methods, see the field reference.
Adds a vector search section
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.