This repository has been archived by the owner on Apr 4, 2023. It is now read-only.
Store fuzzy/bucketed positions in word_position_docids database #746
Pull Request
Related issue
Fixes (when merged into meilisearch) meilisearch/meilisearch#3222
Implementation
The design is described well in the related issue. For details of how different relative positions are grouped together, see the test `bucketed_position`.
Basically, we no longer store the exact position of words that appear far into an attribute, but instead group relative positions together in buckets whose size grows exponentially with the original position. This is done to improve the relevancy and the performance of the `attribute` ranking rule.
This is a draft until #742 is merged and the results of the benchmarks are available.
EDIT: I also realised just now that the iterative version of the algorithm needs to be updated as well!
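To make the bucketing idea described above concrete, here is a minimal sketch. It is not the function from this PR: the name `bucket_position_sketch`, the `EXACT_LIMIT` threshold of 16, and the choice of rounding up to the next power of two are all assumptions made for illustration. The actual grouping is the one exercised by the `bucketed_position` test.

```rust
/// Hypothetical threshold (an assumption): relative positions below it are stored exactly.
const EXACT_LIMIT: u16 = 16;

/// Illustrative bucketing, not the PR's implementation.
/// Positions below `EXACT_LIMIT` are kept as-is; later positions are rounded
/// up to the next power of two, so each bucket is twice as wide as the previous one.
fn bucket_position_sketch(relative: u16) -> u16 {
    if relative < EXACT_LIMIT {
        relative
    } else {
        // `checked_next_power_of_two` returns `None` when the result would not
        // fit in a u16; in that case we saturate to the largest value.
        relative.checked_next_power_of_two().unwrap_or(u16::MAX)
    }
}
```

Under these assumptions, positions 0 to 15 map to themselves, 17 through 32 all map to 32, and 33 through 64 all map to 64. The number of distinct keys stored for long attributes therefore grows logarithmically with the attribute length rather than linearly.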