memory usage optimization for sparse vector #1011
Conversation
@sparknack 🔍 Important: PR Classification Needed! For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how: for any PR outside the kind/improvement category, ensure you link to the associated issue using the format "issue: #". Thanks for your efforts and contribution to the community!
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           main     #1011       +/-   ##
==========================================
+ Coverage        0   73.91%   +73.91%
==========================================
  Files           0       82       +82
  Lines           0     6981     +6981
==========================================
+ Hits            0     5160     +5160
- Misses          0     1821     +1821
```
/kind enhancement
Force-pushed from e21d565 to cb64ab4.
issue: #967
include/knowhere/sparse_utils.h (outdated)

```cpp
 private:
    std::vector<table_t> docids_;
    mutable size_t pos_ = 0;
```
This is a relatively atypical solution. The more traditional approach would be to make `pos_` non-mutable and remove the constness from `test()` in order to remove the confusion. Maybe even rename `test()`, because it is not really a test. Or one may remove the `const`-ness in `Cursor` (see a comment below). Would you please elaborate on the reason for such an unusual implementation?
```cpp
    size_t loc_ = 0;
    size_t total_num_vec_ = 0;
    float max_score_ = 0.0f;
    float q_value_ = 0.0f;
    const BitsetView bitset_;
    const DocIdFilter filter_;
```
What's the reason for making this `const`?
`DocIdFilterByVector` is actually another form of `BitsetView`, so the template typename `DocIdFilter` could be either `BitsetView` or `DocIdFilterByVector`. `knowhere::sparse::InvertedIndex::Search()` takes a `const BitsetView& bitset` as an argument, and `Cursor` uses this argument to iterate through the entire posting list. To keep things consistent, the type of `DocIdFilter` also needs to be `const`.
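For illustration, here is a minimal sketch of the shape being discussed: a cursor templated on the filter type, storing it as a `const` member as in the snippet above. `BitsetViewLike`, `CursorSketch`, and everything not quoted from the snippets are assumptions, not knowhere's actual code.

```cpp
#include <cstdint>
#include <utility>

using table_t = uint32_t;  // assumed id type

// Stand-in for a BitsetView-like filter: test() is const and stateless.
struct BitsetViewLike {
    [[nodiscard]] bool
    test(table_t) const {
        return false;  // placeholder: nothing is filtered out
    }
    [[nodiscard]] bool
    empty() const {
        return true;
    }
};

// A cursor instantiated with either filter type. Because the member is
// declared const (as in the snippet above), DocIdFilter::test() must be
// callable on a const object, which is what motivated the mutable pos_
// in DocIdFilterByVector.
template <typename DocIdFilter>
class CursorSketch {
 public:
    explicit CursorSketch(DocIdFilter filter) : filter_(std::move(filter)) {}

    [[nodiscard]] bool
    skip(table_t vec_id) const {
        return !filter_.empty() && filter_.test(vec_id);
    }

 private:
    const DocIdFilter filter_;  // const member: requires a const test()
};
```

With a `const DocIdFilter filter_` member, a filter whose `test()` advances an internal position can only satisfy this interface by marking that position `mutable`, which is the trade-off being questioned here.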
Force-pushed from cb64ab4 to 95054e3.
@sparknack Please consider the following change (I was able to compile knowhere with unit tests):

```diff
diff --git a/include/knowhere/sparse_utils.h b/include/knowhere/sparse_utils.h
index ddb7c3cd..56fbfc70 100644
--- a/include/knowhere/sparse_utils.h
+++ b/include/knowhere/sparse_utils.h
@@ -67,7 +67,7 @@ class DocIdFilterByVector {
     }

     [[nodiscard]] bool
-    test(const table_t id) const {
+    test(const table_t id) {
         // find the first id that is greater than or equal to the specific id
         while (pos_ < docids_.size() && docids_[pos_] < id) {
             ++pos_;
@@ -82,7 +82,7 @@ class DocIdFilterByVector {

  private:
     std::vector<table_t> docids_;
-    mutable size_t pos_ = 0;
+    size_t pos_ = 0;
 };

 template <typename T>
diff --git a/src/index/sparse/sparse_inverted_index.h b/src/index/sparse/sparse_inverted_index.h
index d470cc65..a5a9975a 100644
--- a/src/index/sparse/sparse_inverted_index.h
+++ b/src/index/sparse/sparse_inverted_index.h
@@ -608,7 +608,7 @@ class InvertedIndex : public BaseInvertedIndex<DType> {
     size_t total_num_vec_ = 0;
     float max_score_ = 0.0f;
     float q_value_ = 0.0f;
-    const DocIdFilter filter_;
+    DocIdFilter filter_;
     table_t cur_vec_id_ = 0;

  private:
@@ -631,7 +631,7 @@ class InvertedIndex : public BaseInvertedIndex<DType> {
     template <typename DocIdFilter>
     void
     search_brute_force(const SparseRow<DType>& q_vec, DType q_threshold, MaxMinHeap<float>& heap,
-                       const DocIdFilter& filter, const DocValueComputer<float>& computer) const {
+                       DocIdFilter& filter, const DocValueComputer<float>& computer) const {
         auto scores = compute_all_distances(q_vec, q_threshold, computer);
         for (size_t i = 0; i < n_rows_internal_; ++i) {
             if ((filter.empty() || !filter.test(i)) && scores[i] != 0) {
```
I still insist that ...
Sorry, I accidentally closed this. I will redo it later.
@alexanderguzhva Thanks for your demo change. I will do it in the next PR :)
@alexanderguzhva
Force-pushed from 95054e3 to c83158a.
Force-pushed from c83158a to ff00223.
Thanks a lot for the effort! This is awesome! This PR basically looks good to me, with some nit comments.
```
@@ -59,6 +60,31 @@ GetDocValueBM25Computer(float k1, float b, float avgdl) {
    };
}

class DocIdFilterByVector {
```
Add a comment to note that all ids to be tested must be tested exactly once and in order.
added.
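To illustrate why that note matters, here is a small self-contained sketch of a forward-only filter in the spirit of `DocIdFilterByVector`. The class name, and the assumption that `test()` returning true means "filtered out", are mine, inferred from how the filter is used in `search_brute_force` above.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using table_t = uint32_t;  // assumed id type

// Forward-only membership check over a sorted id list: pos_ never moves
// backwards, so callers must probe ids in non-decreasing order and each id
// exactly once. Probing 7 and then 3 would silently miss 3.
class SortedIdFilterSketch {
 public:
    explicit SortedIdFilterSketch(std::vector<table_t> ids) : docids_(std::move(ids)) {}

    [[nodiscard]] bool
    test(const table_t id) {
        // advance to the first stored id that is >= the probed id
        while (pos_ < docids_.size() && docids_[pos_] < id) {
            ++pos_;
        }
        return pos_ < docids_.size() && docids_[pos_] == id;
    }

 private:
    std::vector<table_t> docids_;  // must be sorted in ascending order
    size_t pos_ = 0;               // scan position, only ever advances
};
```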
```
@@ -54,14 +53,12 @@ class SparseInvertedIndexNode : public IndexNode {
        LOG_KNOWHERE_ERROR_ << Type() << " only support metric_type IP or BM25";
        return Status::invalid_metric_type;
    }
    auto drop_ratio_build = cfg.drop_ratio_build.value_or(0.0f);
```
Add a comment where this is defined, noting that it is now deprecated.
added.
```
    virtual expected<DocValueComputer<T>>
    GetDocValueComputer(const SparseInvertedIndexConfig& cfg) const = 0;

-   virtual bool
+   [[nodiscard]] virtual bool
```
Can this method be removed as well?
removed.
```
@@ -156,51 +158,67 @@ class InvertedIndex : public BaseInvertedIndex<T> {
 *
 * 1. size_t rows
 * 2. size_t cols
-* 3. T value_threshold_
+* 3. T value_threshold_ (deprecated)
 * 4. for each row:
 *     1. size_t len
 *     2. for each non-zero value:
 *         1. table_t idx
 *         2. T val
```
2. DType val (when QType is different from DType, the QType value of val is stored as a DType with precision loss)
added.
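For illustration, a rough sketch of the row layout described in that comment, using a hypothetical binary writer rather than knowhere's actual serialization code; `Row`, `write_index`, and the stream usage are all assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <vector>

using table_t = uint32_t;  // assumed index type, as in the snippets above

// Hypothetical row representation for this sketch only.
template <typename DType>
struct Row {
    std::vector<table_t> idx;
    std::vector<DType> val;
};

// Writes: rows, cols, the deprecated value_threshold_, then for each row its
// length followed by (idx, val) pairs.
template <typename DType>
void
write_index(std::ostream& out, size_t rows, size_t cols, const std::vector<Row<DType>>& data) {
    DType value_threshold = 0;  // deprecated field, kept only for format compatibility
    out.write(reinterpret_cast<const char*>(&rows), sizeof(rows));
    out.write(reinterpret_cast<const char*>(&cols), sizeof(cols));
    out.write(reinterpret_cast<const char*>(&value_threshold), sizeof(value_threshold));
    for (const auto& row : data) {
        const size_t len = row.idx.size();
        out.write(reinterpret_cast<const char*>(&len), sizeof(len));
        for (size_t i = 0; i < len; ++i) {
            out.write(reinterpret_cast<const char*>(&row.idx[i]), sizeof(table_t));
            // When QType differs from DType, the quantized value would be
            // converted back to DType here, which is where precision is lost.
            out.write(reinterpret_cast<const char*>(&row.val[i]), sizeof(DType));
        }
    }
}
```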
```
@@ -273,7 +250,7 @@ class SparseInvertedIndexNode : public IndexNode {
        return index_or.error();
    }
    index_ = index_or.value();
    return index_->Load(reader);
```
This comment is for `DeserializeFromFile`: now that we don't keep raw data in the index, can we munmap the raw index file after loading?
done.
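For context, a minimal sketch of the pattern asked about: map the file, build the index from the mapped buffer, then unmap. It uses plain POSIX calls; `deserialize_from_buffer` is a hypothetical stand-in for the actual index loading and is assumed to copy everything it needs.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical loader: builds the in-memory index from the mapped bytes and
// keeps no pointers into the mapping afterwards.
bool deserialize_from_buffer(const char* data, size_t size);

bool
load_and_release(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return false;
    }
    struct stat st {};
    if (fstat(fd, &st) != 0) {
        close(fd);
        return false;
    }
    void* addr = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (addr == MAP_FAILED) {
        return false;
    }
    const bool ok = deserialize_from_buffer(static_cast<const char*>(addr), st.st_size);
    munmap(addr, st.st_size);  // safe once nothing references the mapped region
    return ok;
}
```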
tests/ut/test_sparse.cc (outdated)

```
    auto metric = GENERATE(knowhere::metric::IP, knowhere::metric::BM25);

    auto [drop_ratio_build, drop_ratio_search] = metric == knowhere::metric::BM25 ? GENERATE(table<float, float>({
```
Can we remove all `drop_ratio_build` from the tests?
done.
Force-pushed from ff00223 to c78564e.
/lgtm
```
    template <typename U>
    using Vector = std::conditional_t<mmapped, GrowableVectorView<U>, std::vector<U>>;
    std::unordered_map<uint32_t, table_t> dim_map_reverse_;
```
We want to avoid having this if possible; it basically doubles the memory usage of `dim_map_`. Instead of storing it as a member, can we reconstruct `dim_map_reverse_` in `Save` from `dim_map_`?
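As a rough illustration of that suggestion, a sketch of building the reverse mapping transiently at save time instead of keeping it as a member. The key/value direction and the exact types are assumptions based on the snippet above.

```cpp
#include <cstdint>
#include <unordered_map>

using table_t = uint32_t;  // assumed, as in the snippet above

// Build the reverse of dim_map_ (assumed to map original dimension ->
// internal id) only when Save() needs it, then let it be destroyed, rather
// than doubling the memory spent on the mapping for the index's lifetime.
std::unordered_map<table_t, uint32_t>
build_reverse_dim_map(const std::unordered_map<uint32_t, table_t>& dim_map) {
    std::unordered_map<table_t, uint32_t> reverse;
    reverse.reserve(dim_map.size());
    for (const auto& [dim, internal_id] : dim_map) {
        reverse[internal_id] = dim;  // internal id -> original dimension
    }
    return reverse;
}
```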
To reduce memory usage, remove the raw data cache in the sparse inverted index. Note that the config param `drop_ratio_build` and `GetVectorByIds()` are also removed. Signed-off-by: Shawn Wang <[email protected]>
Force-pushed from bb5ec9a to b929c83.
To reduce the memory usage of the inverted index, when the BM25 metric is used, quantize term frequencies from float to uint16_t. All values that exceed the maximum of uint16_t are clamped to the maximum of uint16_t. Signed-off-by: Shawn Wang <[email protected]>
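A minimal sketch of that saturating quantization; the function name and placement are illustrative, not the actual knowhere code.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Quantize a BM25 term frequency from float to uint16_t, clamping anything
// above the representable range to the uint16_t maximum.
inline uint16_t
quantize_term_frequency(float tf) {
    constexpr float kMax = static_cast<float>(std::numeric_limits<uint16_t>::max());
    const float clamped = std::clamp(tf, 0.0f, kMax);
    return static_cast<uint16_t>(clamped);
}
```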
Force-pushed from b929c83 to 683f40a.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: sparknack, zhengbuqian. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing ...
/kind feature
@sparknack Please remove the redundant ...
@zhengbuqian: Those labels are not set on the issue. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-kind enhancement
/remove-kind feature
This patch series adds two memory optimization methods: it removes the raw data cache in the sparse inverted index, and it quantizes BM25 term frequencies from float to uint16_t. The config param `drop_ratio_build` and `GetVectorByIds()` are also removed.