
memory usage optimization for sparse vector #1011

Merged
merged 2 commits into main from sparse-mem-opt on Jan 7, 2025

Conversation

@sparknack (Contributor) commented Dec 30, 2024

This patch series adds two memory optimizations:

  1. Remove the raw data cache in the sparse inverted index. Note that the config param drop_ratio_build and GetVectorByIds() are also removed.
  2. Force-quantize float to uint16_t for the BM25 inverted index (see the sketch below).
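
A minimal sketch of the saturating quantization described in item 2 (the helper name quantize_tf is hypothetical; the actual implementation lives in the sparse inverted index code):

    #include <algorithm>
    #include <cstdint>

    // BM25 term frequencies are non-negative floats; anything above
    // uint16_t's maximum saturates to that maximum instead of wrapping.
    inline uint16_t quantize_tf(float val) {
        float clamped = std::min(val, static_cast<float>(UINT16_MAX));
        return static_cast<uint16_t>(clamped);
    }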

mergify bot commented Dec 30, 2024

@sparknack 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

codecov bot commented Dec 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.91%. Comparing base (3c46f4c) to head (683f40a).
Report is 283 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##           main    #1011       +/-   ##
=========================================
+ Coverage      0   73.91%   +73.91%     
=========================================
  Files         0       82       +82     
  Lines         0     6981     +6981     
=========================================
+ Hits          0     5160     +5160     
- Misses        0     1821     +1821     

see 82 files with indirect coverage changes

@sparknack (Contributor Author)

/kind enhancement

@sparknack sparknack force-pushed the sparse-mem-opt branch 2 times, most recently from e21d565 to cb64ab4 on December 30, 2024 10:07
@sparknack (Contributor Author)

issue: #967


private:
std::vector<table_t> docids_;
mutable size_t pos_ = 0;
Collaborator

this is a relatively atypical solution. The more traditional solution would be to make pos_ non-mutable and remove the const-ness from test() in order to remove the confusion. Maybe even change the test() name, because it is not really a test(). Or one may remove the const-ness in a Cursor (see a comment below).

Would you please elaborate on the reason for such an unusual implementation?

size_t loc_ = 0;
size_t total_num_vec_ = 0;
float max_score_ = 0.0f;
float q_value_ = 0.0f;
const BitsetView bitset_;
const DocIdFilter filter_;
Collaborator

what's the reason for making this const?

Contributor Author

class DocIdFilterByVector is effectively another form of BitsetView, so the template typename DocIdFilter can be either BitsetView or DocIdFilterByVector.

knowhere::sparse::InvertedIndex::Search() takes a const BitsetView& bitset as an argument, and Cursor uses this argument to iterate through the entire posting list.

To keep things consistent, the DocIdFilter member also needs to be const.
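
A minimal sketch of the shared interface this relies on (illustrative, not knowhere code; it reflects the original const design that the diff below then changes):

    #include <cstddef>
    #include <cstdint>

    using table_t = uint32_t;  // assumption: knowhere's internal doc id type

    // BitsetView and DocIdFilterByVector both provide empty() and test(id),
    // which is why one template parameter covers both filter types.
    template <typename DocIdFilter>
    size_t count_unfiltered(const DocIdFilter& filter, size_t n_rows) {
        size_t kept = 0;
        for (table_t id = 0; id < n_rows; ++id) {
            if (filter.empty() || !filter.test(id)) {  // same pattern as search_brute_force
                ++kept;
            }
        }
        return kept;
    }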

@alexanderguzhva (Collaborator) commented Dec 31, 2024

@sparknack Please consider the following change (I was able to compile knowhere with unit tests):

diff --git a/include/knowhere/sparse_utils.h b/include/knowhere/sparse_utils.h
index ddb7c3cd..56fbfc70 100644
--- a/include/knowhere/sparse_utils.h
+++ b/include/knowhere/sparse_utils.h
@@ -67,7 +67,7 @@ class DocIdFilterByVector {
     }
 
     [[nodiscard]] bool
-    test(const table_t id) const {
+    test(const table_t id) {
         // find the first id that is greater than or equal to the specific id
         while (pos_ < docids_.size() && docids_[pos_] < id) {
             ++pos_;
@@ -82,7 +82,7 @@ class DocIdFilterByVector {
 
  private:
     std::vector<table_t> docids_;
-    mutable size_t pos_ = 0;
+    size_t pos_ = 0;
 };
 
 template <typename T>
diff --git a/src/index/sparse/sparse_inverted_index.h b/src/index/sparse/sparse_inverted_index.h
index d470cc65..a5a9975a 100644
--- a/src/index/sparse/sparse_inverted_index.h
+++ b/src/index/sparse/sparse_inverted_index.h
@@ -608,7 +608,7 @@ class InvertedIndex : public BaseInvertedIndex<DType> {
         size_t total_num_vec_ = 0;
         float max_score_ = 0.0f;
         float q_value_ = 0.0f;
-        const DocIdFilter filter_;
+        DocIdFilter filter_;
         table_t cur_vec_id_ = 0;
 
      private:
@@ -631,7 +631,7 @@ class InvertedIndex : public BaseInvertedIndex<DType> {
     template <typename DocIdFilter>
     void
     search_brute_force(const SparseRow<DType>& q_vec, DType q_threshold, MaxMinHeap<float>& heap,
-                       const DocIdFilter& filter, const DocValueComputer<float>& computer) const {
+                       DocIdFilter& filter, const DocValueComputer<float>& computer) const {
         auto scores = compute_all_distances(q_vec, q_threshold, computer);
         for (size_t i = 0; i < n_rows_internal_; ++i) {
             if ((filter.empty() || !filter.test(i)) && scores[i] != 0) {

I still insist that const guarantees should not be violated and that mutable is a last-resort solution. The code is expected to clearly express the way it works :) And it is the caller's responsibility to use a const cast in this case.

@sparknack sparknack closed this Dec 31, 2024
@sparknack (Contributor Author)

Sorry, I accidentally closed this. I will redo it later.

@sparknack sparknack reopened this Dec 31, 2024
@mergify mergify bot removed the ci-passed label Dec 31, 2024
@sparknack (Contributor Author)

@alexanderguzhva Thanks for your demo change. I will do it in the next PR :)

@mergify mergify bot added the ci-passed label Dec 31, 2024
@sparknack (Contributor Author)

(quoting the diff from @alexanderguzhva's comment above)

@alexanderguzhva
And I still have one question: after your change, knowhere::sparse::InvertedIndex::Search() takes a const BitsetView& bitset while knowhere::sparse::InvertedIndex::search_brute_force() takes a non-const one. So is there an implicit const cast when calling knowhere::sparse::InvertedIndex::search_brute_force()? If so, isn't this implicit conversion dangerous? (I am not familiar with the C++ mechanism; maybe there is a reason.)
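
For reference, a self-contained sketch (not knowhere code) of the C++ rule in question: there is no implicit const cast, because a const reference cannot bind to a non-const reference parameter; the cast has to be made explicitly at the call site:

    struct Filter {
        bool test(int) { return false; }  // non-const member, like the patched test()
    };

    void search_brute_force(Filter& filter) { filter.test(0); }

    void Search(const Filter& filter) {
        // search_brute_force(filter);                    // compile error: const& cannot bind to &
        search_brute_force(const_cast<Filter&>(filter));  // explicit, caller-visible cast
    }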

@zhengbuqian (Collaborator) left a comment

thanks a lot for the effort! this is awesome! this PR basically looks good to me, with some nit comments.

@@ -59,6 +60,31 @@ GetDocValueBM25Computer(float k1, float b, float avgdl) {
};
}

class DocIdFilterByVector {
Collaborator

add a comment to note that all ids to be tested must be tested exactly once and in order

Contributor Author

added.

@@ -54,14 +53,12 @@ class SparseInvertedIndexNode : public IndexNode {
LOG_KNOWHERE_ERROR_ << Type() << " only support metric_type IP or BM25";
return Status::invalid_metric_type;
}
auto drop_ratio_build = cfg.drop_ratio_build.value_or(0.0f);
Collaborator

add a comment where this is defined noting it is now deprecated.

Contributor Author

added.


virtual expected<DocValueComputer<T>>
GetDocValueComputer(const SparseInvertedIndexConfig& cfg) const = 0;

virtual bool
[[nodiscard]] virtual bool
Collaborator

can this method be removed as well?

Contributor Author

removed.

@@ -156,51 +158,67 @@ class InvertedIndex : public BaseInvertedIndex<T> {
*
* 1. size_t rows
* 2. size_t cols
* 3. T value_threshold_
* 3. T value_threshold_ (deprecated)
* 4. for each row:
* 1. size_t len
* 2. for each non-zero value:
* 1. table_t idx
* 2. T val
Collaborator

2. DType val (when QType is different from DType, the QType value of val is stored as a DType with precision loss)

Contributor Author

added.

@@ -273,7 +250,7 @@ class SparseInvertedIndexNode : public IndexNode {
return index_or.error();
}
index_ = index_or.value();
return index_->Load(reader);
Collaborator

this comment is for "DeserializeFromFile".

now that we don't have raw data in the index, can we munmap the raw index file after loading?
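
A sketch of the suggested lifecycle in plain POSIX (illustrative only; knowhere's actual file-mapping helpers differ, and the Load call here is hypothetical):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    // Map the index file, let the load path copy what it needs into in-memory
    // structures, then unmap: with no raw data cache, nothing points back
    // into the mapping afterwards.
    bool load_then_unmap(const char* path) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return false;
        struct stat st;
        if (fstat(fd, &st) != 0) { close(fd); return false; }
        void* addr = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);  // the mapping stays valid after close
        if (addr == MAP_FAILED) return false;
        // index->Load(addr, st.st_size);  // hypothetical: copies, keeps no pointers into addr
        munmap(addr, st.st_size);  // release the file-backed pages right away
        return true;
    }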

Contributor Author

done.

auto metric = GENERATE(knowhere::metric::IP, knowhere::metric::BM25);

auto [drop_ratio_build, drop_ratio_search] = metric == knowhere::metric::BM25 ? GENERATE(table<float, float>({
Collaborator

we can remove all drop_ratio_build from the tests?

Contributor Author

done.

@mergify mergify bot added the ci-passed label Jan 6, 2025
@zhengbuqian (Collaborator)

/lgtm
/approve


template <typename U>
using Vector = std::conditional_t<mmapped, GrowableVectorView<U>, std::vector<U>>;
std::unordered_map<uint32_t, table_t> dim_map_reverse_;
Collaborator

we want to avoid having this if possible; it basically doubles the memory usage of dim_map_.

instead of storing it as a member, can we reconstruct dim_map_reverse_ in Save() from dim_map_?
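
A minimal sketch of that suggestion, assuming Save() can afford a transient map (generic helper; the actual key/value types follow dim_map_'s declaration):

    #include <unordered_map>

    // Rebuild the reverse mapping on demand instead of keeping a second
    // member map alive for the index's whole lifetime; the extra memory is
    // paid only while Save() runs.
    template <typename K, typename V>
    std::unordered_map<V, K>
    build_reverse_map(const std::unordered_map<K, V>& forward) {
        std::unordered_map<V, K> reverse;
        reverse.reserve(forward.size());
        for (const auto& [key, value] : forward) {
            reverse.emplace(value, key);
        }
        return reverse;
    }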

To reduce memory usage, remove the raw data cache in the sparse inverted index.
Note that the config param `drop_ratio_build` and GetVectorByIds() are also
removed.

Signed-off-by: Shawn Wang <[email protected]>
To reduce the memory usage of the inverted index, when the BM25 metric is used,
quantize the term frequency from float to uint16_t. All values that exceed the
maximum of uint16_t are quantized to the maximum of uint16_t.

Signed-off-by: Shawn Wang <[email protected]>
@zhengbuqian (Collaborator)

/lgtm
/approve

@sre-ci-robot (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sparknack, zhengbuqian

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zhengbuqian (Collaborator)

/kind feature

mergify bot commented Jan 7, 2025

@sparknack Please remove redundant kind/xxx labels and make sure that there is only one kind/xxx label on your Pull Request. (e.g. “/remove-kind improvement”)

@sre-ci-robot (Collaborator)

@zhengbuqian: Those labels are not set on the issue: kind/improvement

In response to this:

/remove-kind improvement

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zhengbuqian (Collaborator)

/remove-kind enhancement

@zhengbuqian (Collaborator)

/remove-kind feature
/kind improvement
