Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optimized distiance functions for PowerPC #894

Merged
merged 1 commit into from
Nov 14, 2024

Conversation

carll99
Copy link
Contributor

@carll99 carll99 commented Oct 15, 2024

Files src/simd/distances_powerpc.cc, src/simd/distances_powerpc.h are added with powerpc optimized versions of the various functions.

File src/simd/hook.cc is updated to point to the new PowerPC versions of the functions.

@sre-ci-robot sre-ci-robot requested review from hhy3 and PwzXxm October 15, 2024 16:59
@sre-ci-robot
Copy link
Collaborator

Welcome @carll99! It looks like this is your first PR to zilliztech/knowhere 🎉

Copy link

mergify bot commented Oct 15, 2024

@carll99 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!.

@carll99
Copy link
Contributor Author

carll99 commented Oct 15, 2024

Based on the comments I got from generating the PR, it sounds like I am supposed to set the kind label? Not sure how that should have been done. Secondly, this adds specific PowerPC support which I believe makes it kind/feature?

@@ -0,0 +1,540 @@
/**
* Copyright (c) Facebook, Inc. and its affiliates.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Facebook, not Zilliz in a license header?

namespace faiss {

float
fvec_L2sqr_ref_ppc(const float* x, const float* y, size_t d) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please call the function fvec_L2sqr_ppc, not fvec_L2sqr_ref_ppc.
Same for all other functions.


base = (d / FLOAT_VEC_SIZE) * FLOAT_VEC_SIZE;

for (size_t i = 0; i < base; i = i + FLOAT_VEC_SIZE) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (size_t i = 0; i < base; i += FLOAT_VEC_SIZE) {
same comment applies to loops in other functions

/* Vector implmentaion uses vector size of FLOAT_VEC_SIZE. If the input
array size is not a power of FLOAT_VEC_SIZE, do the remaining elements
in scalar mode. */
size_t i;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please get rid of this local variable.
same comment applies to other functions

}

/* Handle any remaining data elements */
for (i = base; i < d; i++) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (size_t i = base; i < d; i++) {
same comment applies to loops in other functions

* LICENSE file in the root directory of this source tree.
*/

#if defined(__powerpc__)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to confirm that all knowhere unit tests passed, correct?

@carll99
Copy link
Contributor Author

carll99 commented Oct 16, 2024 via email

@alexanderguzhva
Copy link
Collaborator

@carll99 just force push the new changes to override this PR with a better version

Copy link

codecov bot commented Oct 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.19%. Comparing base (3c46f4c) to head (609931d).
Report is 253 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##           main     #894       +/-   ##
=========================================
+ Coverage      0   74.19%   +74.19%     
=========================================
  Files         0       82       +82     
  Lines         0     6600     +6600     
=========================================
+ Hits          0     4897     +4897     
- Misses        0     1703     +1703     

see 82 files with indirect coverage changes

@liliu-z
Copy link
Collaborator

liliu-z commented Oct 19, 2024

Hi @carll99
Due to the limited resource, we actually doesn't officially support PowerPC, which means the official release and new features will not take care of the PowerPC specific scenario.
However, we are welcome for the contribution and maintain from the community. Do you mind to share with us more about the use case to help us know more about the importance of PowerPC?

@alexanderguzhva
Copy link
Collaborator

@carll99 please accept DCO and run clang-format and I'll lgtm the PR. Thanks.

@carll99
Copy link
Contributor Author

carll99 commented Oct 21, 2024

I updated the patch and the team is working on retesting it.

I forwarded the comment from liliu-z to the PowerPC and we will work on the response.

The DCO, looks like the description for DCO says the project wants a sign-off line. I will include that with the next version.

I will check how to run the clang-format tool. I did a quick google to see what it is.

Thanks for the feedback.

@carll99
Copy link
Contributor Author

carll99 commented Nov 11, 2024 via email

@alexanderguzhva
Copy link
Collaborator

@carll99 Please carefully read the error message https://github.com/zilliztech/knowhere/pull/894/checks?check_run_id=32817538304, all instructions are there
Also, would you please update a PR to include your changes?
Thanks

Added the PowerPC vector functions in src/simd/distances_powerpc.cc,
src/simd/distances_powerpc.h.  The hooks to the PowerPC functions are
added in src/simd/hook.cc.

Signed-off-by: Carl Love <[email protected]>
@carll99
Copy link
Contributor Author

carll99 commented Nov 13, 2024

I have updated my changes for PowerPC. They have been tested on a Power10 to verify they compile and the results are correct.

Per the questions earlier about the IBM use case for this code.

The IBM Watsonx product uses Milvus.  Milvus uses the Knowhere code.  So IBM is very interested in ensuring the performance of the Knowhere code is optimized for PowerPC.  We have a number of people at IBM working on improving the various AI code bases like Knowhere and FAISS. This is an ongoing project. IBM is putting a number of resources into these projects. They are all fundamental to the continued development of the IBM Watsonx product. I hope that helps answer your question.

Finally, sorry about the slow response getting the updated code. We had a number of our lab machines fail. They were down for a couple of weeks. Also, the testing and verification of the patch took a lot longer than expected. My apologies for the slow turn around on the changes.

@mergify mergify bot added the ci-passed label Nov 13, 2024
@alexanderguzhva
Copy link
Collaborator

/lgtm

@foxspy
Copy link
Collaborator

foxspy commented Nov 14, 2024

/kind improvement

@sre-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carll99, foxspy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit d3605fb into zilliztech:main Nov 14, 2024
13 of 14 checks passed
foxspy pushed a commit to foxspy/knowhere that referenced this pull request Nov 18, 2024
Added the PowerPC vector functions in src/simd/distances_powerpc.cc,
src/simd/distances_powerpc.h.  The hooks to the PowerPC functions are
added in src/simd/hook.cc.

Signed-off-by: Carl Love <[email protected]>
Co-authored-by: Carl Love <[email protected]>
foxspy pushed a commit to foxspy/knowhere that referenced this pull request Nov 18, 2024
Added the PowerPC vector functions in src/simd/distances_powerpc.cc,
src/simd/distances_powerpc.h.  The hooks to the PowerPC functions are
added in src/simd/hook.cc.

Signed-off-by: Carl Love <[email protected]>
Co-authored-by: Carl Love <[email protected]>
Signed-off-by: xianliang.li <[email protected]>
sre-ci-robot pushed a commit that referenced this pull request Nov 18, 2024
* update raft to 24.10 (#914)

Signed-off-by: yusheng.ma <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* fix Index parameters handling and anniterator (#913)

Signed-off-by: Alexandr Guzhva <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* add range check (#915)

Signed-off-by: xianliang.li <[email protected]>

* fix knowhere ut (#918)

Signed-off-by: xianliang.li <[email protected]>

* raft index supports cosine similarity by normalizing the input data. (#924)

Signed-off-by: yusheng.ma <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* compensate for the missing acceleration functions in ARM NEON. (#922)

Signed-off-by: yusheng.ma <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* improve sparse vector index mmap: to mmap almost everything (#928)

Signed-off-by: Buqian Zheng <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* move sparse index Add to build pool (#933)

Add SparseInvertedIndexNodeCC to allow being thread safe growing index

Signed-off-by: Buqian Zheng <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* sparse mmap on disk (#935)

Signed-off-by: Buqian Zheng <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* use MAP_PRIVATE for mmapped file (#938)

Signed-off-by: Buqian Zheng <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* sparse RangeSearch/AnnIterator to return raw distance (#944)

* sparse: make the distance returned by RangeSearch and AnnIterator to be the raw instead of the quantized distance

Signed-off-by: Buqian Zheng <[email protected]>

* sparse: remove mutex in the index: we now use CC index if concurrent read/write is needed

Signed-off-by: Buqian Zheng <[email protected]>

---------

Signed-off-by: Buqian Zheng <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* Add optimized distance functions for PowerPC (#894)

Added the PowerPC vector functions in src/simd/distances_powerpc.cc,
src/simd/distances_powerpc.h.  The hooks to the PowerPC functions are
added in src/simd/hook.cc.

Signed-off-by: Carl Love <[email protected]>
Co-authored-by: Carl Love <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

* enhance: optimize get norms function (#950)

Signed-off-by: cqy123456 <[email protected]>
Signed-off-by: xianliang.li <[email protected]>

---------

Signed-off-by: yusheng.ma <[email protected]>
Signed-off-by: xianliang.li <[email protected]>
Signed-off-by: Alexandr Guzhva <[email protected]>
Signed-off-by: Buqian Zheng <[email protected]>
Signed-off-by: Carl Love <[email protected]>
Signed-off-by: cqy123456 <[email protected]>
Co-authored-by: presburger <[email protected]>
Co-authored-by: Alexander Guzhva <[email protected]>
Co-authored-by: Buqian Zheng <[email protected]>
Co-authored-by: carll99 <[email protected]>
Co-authored-by: Carl Love <[email protected]>
Co-authored-by: cqy123456 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants