
memory usage optimization for sparse vector #1011

Merged
merged 2 commits into main from sparse-mem-opt on Jan 7, 2025

Conversation

@sparknack (Contributor) commented Dec 30, 2024

This patch series adds two memory optimizations:

  1. Remove the raw data cache in the sparse inverted index. Note that the config param drop_ratio_build and GetVectorByIds() are also removed.
  2. Force-quantize float to uint16_t for the BM25 inverted index (see the sketch below).
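
A minimal sketch of the saturating quantization described in item 2 (the helper name quantize_tf is hypothetical; the actual implementation lives in the sparse inverted index code):

    #include <algorithm>
    #include <cstdint>

    // BM25 term frequencies are non-negative floats; anything above
    // uint16_t's maximum saturates to that maximum instead of wrapping.
    inline uint16_t quantize_tf(float val) {
        float clamped = std::min(val, static_cast<float>(UINT16_MAX));
        return static_cast<uint16_t>(clamped);
    }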

mergify bot commented Dec 30, 2024

@sparknack 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

codecov bot commented Dec 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.91%. Comparing base (3c46f4c) to head (683f40a).
Report is 283 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##           main    #1011       +/-   ##
=========================================
+ Coverage      0   73.91%   +73.91%     
=========================================
  Files         0       82       +82     
  Lines         0     6981     +6981     
=========================================
+ Hits          0     5160     +5160     
- Misses        0     1821     +1821     

see 82 files with indirect coverage changes

@sparknack (Contributor Author)

/kind enhancement

@sparknack sparknack force-pushed the sparse-mem-opt branch 2 times, most recently from e21d565 to cb64ab4 on December 30, 2024 10:07
@sparknack (Contributor Author)

issue: #967


private:
std::vector<table_t> docids_;
mutable size_t pos_ = 0;
Collaborator

this is a relatively atypical solution. The more traditional solution would be to make pos_ non-mutable and remove the const-ness from test() in order to remove the confusion. Maybe even change the test() name, because it is not really a test(). Or one may remove the const-ness in a Cursor (see a comment below).

Would you please elaborate on the reason for such an unusual implementation?

size_t loc_ = 0;
size_t total_num_vec_ = 0;
float max_score_ = 0.0f;
float q_value_ = 0.0f;
const BitsetView bitset_;
const DocIdFilter filter_;
Collaborator

what's the reason for making this const?

Contributor Author

class DocIdFilterByVector is effectively another form of BitsetView, so the template typename DocIdFilter can be either BitsetView or DocIdFilterByVector.

knowhere::sparse::InvertedIndex::Search() takes a const BitsetView& bitset as an argument, and Cursor uses this argument to iterate through the entire posting list.

To keep things consistent, the DocIdFilter member also needs to be const.
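
A minimal sketch of the shared interface this relies on (illustrative, not knowhere code; it reflects the original const design that the diff below then changes):

    #include <cstddef>
    #include <cstdint>

    using table_t = uint32_t;  // assumption: knowhere's internal doc id type

    // BitsetView and DocIdFilterByVector both provide empty() and test(id),
    // which is why one template parameter covers both filter types.
    template <typename DocIdFilter>
    size_t count_unfiltered(const DocIdFilter& filter, size_t n_rows) {
        size_t kept = 0;
        for (table_t id = 0; id < n_rows; ++id) {
            if (filter.empty() || !filter.test(id)) {  // same pattern as search_brute_force
                ++kept;
            }
        }
        return kept;
    }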

@alexanderguzhva (Collaborator) commented Dec 31, 2024

@sparknack Please consider the following change (I was able to compile knowhere with unit tests):

diff --git a/include/knowhere/sparse_utils.h b/include/knowhere/sparse_utils.h
index ddb7c3cd..56fbfc70 100644
--- a/include/knowhere/sparse_utils.h
+++ b/include/knowhere/sparse_utils.h
@@ -67,7 +67,7 @@ class DocIdFilterByVector {
     }
 
     [[nodiscard]] bool
-    test(const table_t id) const {
+    test(const table_t id) {
         // find the first id that is greater than or equal to the specific id
         while (pos_ < docids_.size() && docids_[pos_] < id) {
             ++pos_;
@@ -82,7 +82,7 @@ class DocIdFilterByVector {
 
  private:
     std::vector<table_t> docids_;
-    mutable size_t pos_ = 0;
+    size_t pos_ = 0;
 };
 
 template <typename T>
diff --git a/src/index/sparse/sparse_inverted_index.h b/src/index/sparse/sparse_inverted_index.h
index d470cc65..a5a9975a 100644
--- a/src/index/sparse/sparse_inverted_index.h
+++ b/src/index/sparse/sparse_inverted_index.h
@@ -608,7 +608,7 @@ class InvertedIndex : public BaseInvertedIndex<DType> {
         size_t total_num_vec_ = 0;
         float max_score_ = 0.0f;
         float q_value_ = 0.0f;
-        const DocIdFilter filter_;
+        DocIdFilter filter_;
         table_t cur_vec_id_ = 0;
 
      private:
@@ -631,7 +631,7 @@ class InvertedIndex : public BaseInvertedIndex<DType> {
     template <typename DocIdFilter>
     void
     search_brute_force(const SparseRow<DType>& q_vec, DType q_threshold, MaxMinHeap<float>& heap,
-                       const DocIdFilter& filter, const DocValueComputer<float>& computer) const {
+                       DocIdFilter& filter, const DocValueComputer<float>& computer) const {
         auto scores = compute_all_distances(q_vec, q_threshold, computer);
         for (size_t i = 0; i < n_rows_internal_; ++i) {
             if ((filter.empty() || !filter.test(i)) && scores[i] != 0) {

I still insist that const guarantees should not be violated and that mutable is a last-resort solution. The code is expected to clearly express the way it works :) And it is the caller's responsibility to use a const cast in this case.

@sparknack sparknack closed this Dec 31, 2024
@sparknack (Contributor Author)

Sorry, I accidentally closed this. I will redo it later.

@sparknack sparknack reopened this Dec 31, 2024
@mergify mergify bot removed the ci-passed label Dec 31, 2024
@sparknack (Contributor Author)

@alexanderguzhva Thanks for your demo change. I will do it in the next PR :)

@mergify mergify bot added the ci-passed label Dec 31, 2024
@sparknack (Contributor Author)

(quoting the diff from @alexanderguzhva's comment above)

@alexanderguzhva
And I still have one question: after your change, knowhere::sparse::InvertedIndex::Search() takes a const BitsetView& bitset while knowhere::sparse::InvertedIndex::search_brute_force() takes a non-const one. So is there an implicit const cast when calling knowhere::sparse::InvertedIndex::search_brute_force()? If so, isn't this implicit conversion dangerous? (I am not familiar with the C++ mechanism; maybe there is a reason.)
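
For reference, a self-contained sketch (not knowhere code) of the C++ rule in question: there is no implicit const cast, because a const reference cannot bind to a non-const reference parameter; the cast has to be made explicitly at the call site:

    struct Filter {
        bool test(int) { return false; }  // non-const member, like the patched test()
    };

    void search_brute_force(Filter& filter) { filter.test(0); }

    void Search(const Filter& filter) {
        // search_brute_force(filter);                    // compile error: const& cannot bind to &
        search_brute_force(const_cast<Filter&>(filter));  // explicit, caller-visible cast
    }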

@zhengbuqian (Collaborator) left a comment

thanks a lot for the effort! this is awesome! this PR basically looks good to me, with some nit comments.

@@ -59,6 +60,31 @@ GetDocValueBM25Computer(float k1, float b, float avgdl) {
};
}

class DocIdFilterByVector {
Collaborator

add a comment to note that all ids to be tested must be tested exactly once and in order

Contributor Author

added.

@@ -54,14 +53,12 @@ class SparseInvertedIndexNode : public IndexNode {
LOG_KNOWHERE_ERROR_ << Type() << " only support metric_type IP or BM25";
return Status::invalid_metric_type;
}
auto drop_ratio_build = cfg.drop_ratio_build.value_or(0.0f);
Collaborator

add a comment where this is defined noting it is now deprecated.

Contributor Author

added.


virtual expected<DocValueComputer<T>>
GetDocValueComputer(const SparseInvertedIndexConfig& cfg) const = 0;

virtual bool
[[nodiscard]] virtual bool
Collaborator

can this method be removed as well?

Contributor Author

removed.

@@ -156,51 +158,67 @@ class InvertedIndex : public BaseInvertedIndex<T> {
*
* 1. size_t rows
* 2. size_t cols
* 3. T value_threshold_
* 3. T value_threshold_ (deprecated)
* 4. for each row:
* 1. size_t len
* 2. for each non-zero value:
* 1. table_t idx
* 2. T val
Collaborator

2. DType val (when QType is different from DType, the QType value of val is stored as a DType with precision loss)

Contributor Author

added.

@@ -273,7 +250,7 @@ class SparseInvertedIndexNode : public IndexNode {
return index_or.error();
}
index_ = index_or.value();
return index_->Load(reader);
Collaborator

this comment is for "DeserializeFromFile".

now that we don't have raw data in the index, can we munmap the raw index file after loading?
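
A sketch of the suggested lifecycle in plain POSIX (illustrative only; knowhere's actual file-mapping helpers differ, and the Load call here is hypothetical):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    // Map the index file, let the load path copy what it needs into in-memory
    // structures, then unmap: with no raw data cache, nothing points back
    // into the mapping afterwards.
    bool load_then_unmap(const char* path) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return false;
        struct stat st;
        if (fstat(fd, &st) != 0) { close(fd); return false; }
        void* addr = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);  // the mapping stays valid after close
        if (addr == MAP_FAILED) return false;
        // index->Load(addr, st.st_size);  // hypothetical: copies, keeps no pointers into addr
        munmap(addr, st.st_size);  // release the file-backed pages right away
        return true;
    }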

Contributor Author

done.

auto metric = GENERATE(knowhere::metric::IP, knowhere::metric::BM25);

auto [drop_ratio_build, drop_ratio_search] = metric == knowhere::metric::BM25 ? GENERATE(table<float, float>({
Collaborator

we can remove all drop_ratio_build from the tests?

Contributor Author

done.

@mergify mergify bot added the ci-passed label Jan 6, 2025
@zhengbuqian (Collaborator)

/lgtm
/approve


template <typename U>
using Vector = std::conditional_t<mmapped, GrowableVectorView<U>, std::vector<U>>;
std::unordered_map<uint32_t, table_t> dim_map_reverse_;
Collaborator

we want to avoid having this if possible; it basically doubles the memory usage of dim_map_.

instead of storing it as a member, can we reconstruct dim_map_reverse_ in Save() from dim_map_?
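
A minimal sketch of that suggestion, assuming Save() can afford a transient map (generic helper; the actual key/value types follow dim_map_'s declaration):

    #include <unordered_map>

    // Rebuild the reverse mapping on demand instead of keeping a second
    // member map alive for the index's whole lifetime; the extra memory is
    // paid only while Save() runs.
    template <typename K, typename V>
    std::unordered_map<V, K>
    build_reverse_map(const std::unordered_map<K, V>& forward) {
        std::unordered_map<V, K> reverse;
        reverse.reserve(forward.size());
        for (const auto& [key, value] : forward) {
            reverse.emplace(value, key);
        }
        return reverse;
    }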

To reduce memory usage, remove the raw data cache in the sparse inverted index.
Note that the config param `drop_ratio_build` and GetVectorByIds() are also
removed.

Signed-off-by: Shawn Wang <[email protected]>
To reduce the memory usage of the inverted index, when the BM25 metric is used,
quantize the term frequency from float to uint16_t. All values that exceed the
maximum of uint16_t are quantized to the maximum of uint16_t.

Signed-off-by: Shawn Wang <[email protected]>
@zhengbuqian (Collaborator)

/lgtm
/approve

@sre-ci-robot (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sparknack, zhengbuqian

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zhengbuqian (Collaborator)

/kind feature

mergify bot commented Jan 7, 2025

@sparknack Please remove redundant kind/xxx labels and make sure that there is only one kind/xxx label on your Pull Request. (e.g. “/remove-kind improvement”)

@sre-ci-robot (Collaborator)

@zhengbuqian: Those labels are not set on the issue: kind/improvement

In response to this:

/remove-kind improvement

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zhengbuqian (Collaborator)

/remove-kind enhancement

@zhengbuqian (Collaborator)

/remove-kind feature
/kind improvement
