add DAAT MaxScore support for sparse vector #1015

Open
wants to merge 2 commits into
base: main
from

Conversation

sparknack
Contributor

@sparknack sparknack commented Jan 7, 2025

  1. Add DAAT MaxScore support for sparse vectors and make it the default search algorithm.
  2. Remove use_wand and introduce a new enum, InvertedIndexAlgo, to select among the supported search algorithms.
  3. Refactor the cursor operations shared by WAND and MaxScore.
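
For readers unfamiliar with the technique named in item 1, here is a minimal sketch of textbook document-at-a-time (DAAT) MaxScore top-k retrieval. It is not the code in this PR: `Posting`, `TermCursor`, `make_cursor`, and `maxscore_topk` are hypothetical names, and the sketch assumes non-negative weights and posting lists sorted by document id. Lists are ordered by their score upper bound; once the running top-k threshold exceeds the combined bound of the weakest lists, those lists stop generating candidates and are only probed for documents found elsewhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical names, not the PR's API

struct Posting { std::uint32_t doc; float val; };

// One query term: posting list sorted by doc id, query weight, cursor position.
struct TermCursor {
    const std::vector<Posting>* list;
    float weight;     // assumed non-negative
    float max_score;  // upper bound: weight * largest val in the list
    std::size_t pos;
    bool done() const { return pos >= list->size(); }
    std::uint32_t doc() const { return (*list)[pos].doc; }
    float score() const { return weight * (*list)[pos].val; }
    void seek(std::uint32_t target) { while (!done() && doc() < target) ++pos; }
};

inline TermCursor make_cursor(const std::vector<Posting>& list, float weight) {
    float mx = 0.0f;
    for (const Posting& p : list) mx = std::max(mx, p.val);
    return TermCursor{&list, weight, weight * mx, 0};
}

using Hit = std::pair<float, std::uint32_t>;  // (score, doc)

// Returns up to k hits, best first.
inline std::vector<Hit> maxscore_topk(std::vector<TermCursor> cursors, std::size_t k) {
    std::vector<Hit> out;
    if (k == 0 || cursors.empty()) return out;
    // Ascending upper bound: the weakest lists become non-essential first.
    std::sort(cursors.begin(), cursors.end(),
              [](const TermCursor& a, const TermCursor& b) { return a.max_score < b.max_score; });
    const std::size_t n = cursors.size();
    std::vector<float> prefix(n);  // prefix[i] = sum of max_score over lists 0..i
    float run = 0.0f;
    for (std::size_t i = 0; i < n; ++i) { run += cursors[i].max_score; prefix[i] = run; }

    std::priority_queue<Hit, std::vector<Hit>, std::greater<Hit>> heap;  // min-heap
    float threshold = -std::numeric_limits<float>::infinity();
    std::size_t first_essential = 0;  // lists [0, first_essential) are non-essential

    while (first_essential < n) {
        // Next candidate: smallest current doc id among essential lists.
        bool found = false;
        std::uint32_t cand = 0;
        for (std::size_t i = first_essential; i < n; ++i)
            if (!cursors[i].done() && (!found || cursors[i].doc() < cand)) {
                cand = cursors[i].doc();
                found = true;
            }
        if (!found) break;
        float score = 0.0f;
        for (std::size_t i = first_essential; i < n; ++i)
            if (!cursors[i].done() && cursors[i].doc() == cand) {
                score += cursors[i].score();
                ++cursors[i].pos;
            }
        // Probe non-essential lists, strongest first; stop once the bound fails.
        for (std::size_t i = first_essential; i-- > 0;) {
            if (score + prefix[i] <= threshold) break;
            cursors[i].seek(cand);
            if (!cursors[i].done() && cursors[i].doc() == cand) score += cursors[i].score();
        }
        if (score > threshold) {
            heap.emplace(score, cand);
            if (heap.size() > k) heap.pop();
            if (heap.size() == k) {
                threshold = heap.top().first;
                while (first_essential < n && prefix[first_essential] <= threshold) ++first_essential;
            }
        }
    }
    while (!heap.empty()) { out.push_back(heap.top()); heap.pop(); }
    std::reverse(out.begin(), out.end());
    return out;
}

}  // namespace sketch
```

Ties with the current threshold are dropped in this sketch (strict `>`), which keeps the pruning test and the acceptance test consistent with each other.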

@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sparknack
To complete the pull request process, please assign zhengbuqian after the PR has been reviewed.
You can assign the PR to them by writing /assign @zhengbuqian in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


mergify bot commented Jan 7, 2025

@sparknack 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!


codecov bot commented Jan 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.08%. Comparing base (3c46f4c) to head (7f5f76e).
Report is 284 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##           main    #1015       +/-   ##
=========================================
+ Coverage      0   74.08%   +74.08%     
=========================================
  Files         0       82       +82     
  Lines         0     7042     +7042     
=========================================
+ Hits          0     5217     +5217     
- Misses        0     1825     +1825     

see 82 files with indirect coverage changes

@mergify mergify bot added the ci-passed label Jan 7, 2025
@mergify mergify bot removed the ci-passed label Jan 8, 2025
@mergify mergify bot added the ci-passed label Jan 8, 2025
@@ -34,6 +34,13 @@
#include "knowhere/utils.h"

namespace knowhere::sparse {

enum InvertedIndexAlgo {
Collaborator

I recommend using enum class here.

Contributor Author

added.
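
To illustrate what the reviewer's suggestion buys, here is a minimal sketch of a scoped enumeration. Only the name `InvertedIndexAlgo` comes from the diff; the enumerator names and the `algo_name` helper are assumptions made for the example, not the PR's actual definitions.

```cpp
#include <cassert>
#include <string_view>
#include <type_traits>

namespace sketch {  // hypothetical namespace for the example

// Scoped enum: enumerators must be qualified (InvertedIndexAlgo::DAAT_WAND)
// and do not convert implicitly to int. Enumerator names are assumed.
enum class InvertedIndexAlgo {
    TAAT_NAIVE,
    DAAT_WAND,
    DAAT_MAXSCORE,
};

// With a plain enum this conversion would be implicit and silent.
static_assert(!std::is_convertible_v<InvertedIndexAlgo, int>,
              "enum class values need an explicit cast to become integers");

// Hypothetical helper: map an enumerator to a display name.
constexpr std::string_view
algo_name(InvertedIndexAlgo algo) {
    switch (algo) {
        case InvertedIndexAlgo::TAAT_NAIVE:
            return "TAAT_NAIVE";
        case InvertedIndexAlgo::DAAT_WAND:
            return "DAAT_WAND";
        case InvertedIndexAlgo::DAAT_MAXSCORE:
            return "DAAT_MAXSCORE";
    }
    return "UNKNOWN";
}

}  // namespace sketch
```

Because the enumerators are scoped, names such as `DAAT_WAND` cannot collide with other identifiers in `knowhere::sparse`, and accidental arithmetic or comparison against raw integers fails to compile.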

template <typename R>
static void
readBinaryString(R& in, std::string& str) {
in.read((char*)str.data(), str.size());
Collaborator

Please reserve a size field: write the size first, then the data, and have deserialization read them back in the same order. Relying on the size of the input string is error-prone.

Contributor Author

This code has been removed, since we want to redesign the file header later.
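
For reference, the length-prefixed layout the reviewer describes could look roughly like the sketch below. It is an illustration only, since the code in question was dropped from the PR: `write_sized_string` and `read_sized_string` are hypothetical names, the example uses standard streams rather than the PR's reader type `R`, and it writes the length in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical helpers, not part of the PR

// Write a fixed-width length first, then the raw bytes.
inline void
write_sized_string(std::ostream& out, const std::string& str) {
    const std::uint64_t size = str.size();
    out.write(reinterpret_cast<const char*>(&size), sizeof(size));
    out.write(str.data(), static_cast<std::streamsize>(size));
}

// Read the length back, size the string from it, then read the payload.
// The caller no longer has to pre-size the string correctly.
inline std::string
read_sized_string(std::istream& in) {
    std::uint64_t size = 0;
    in.read(reinterpret_cast<char*>(&size), sizeof(size));
    if (!in) {
        throw std::runtime_error("failed to read string length");
    }
    std::string str(static_cast<std::size_t>(size), '\0');
    in.read(str.data(), static_cast<std::streamsize>(size));
    if (!in) {
        throw std::runtime_error("failed to read string payload");
    }
    return str;
}

}  // namespace sketch
```

A production version would likely also bound `size` before allocating and fix the byte order of the length field, so that a corrupt or foreign-endian file cannot trigger a huge allocation.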

@mergify mergify bot removed the ci-passed label Jan 8, 2025
@sparknack sparknack force-pushed the sparse-new-algo branch 2 times, most recently from fbb19ae to 5ad6bde Compare January 9, 2025 07:45

3 participants