feat(c++): label filtering API, benchmarks, and examples #654
@lixueclaire please review this PR for label filtering.
cpp/benchmarks/benchmark_util.h
Uses the LDBC graph to benchmark label filtering efficiency.
Five benchmark cases:
- single-label filtering (1. GraphAr, 2. Acero)
- multi-label filtering (1. GraphAr, 2. Acero)
- label filtering applied to an earlier filtering result
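The cases listed above could each be timed with a small harness along the following lines. This is only a sketch using `std::chrono`; `TimeAverageMicros` is a hypothetical helper and not part of GraphAr or of this PR's benchmark code, which may well use a dedicated benchmarking framework instead.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from the PR): runs `fn` `repeats` times and
// returns the mean wall-clock time per run in microseconds.
template <typename Fn>
double TimeAverageMicros(Fn&& fn, int repeats) {
  assert(repeats > 0);
  using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
  const auto start = Clock::now();
  for (int i = 0; i < repeats; ++i) {
    fn();
  }
  const auto elapsed = Clock::now() - start;
  const double total_us =
      std::chrono::duration<double, std::micro>(elapsed).count();
  return total_us / repeats;
}
```

Passing each filtering variant as a lambda keeps the GraphAr and Acero cases comparable, since both go through the same timing loop.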
This example shows how to use label filtering and checks its correctness.
@@ -94,6 +100,290 @@ Vertex::Vertex(IdType id,
}
}

Result<bool> VertexIter::label(const std::string& label) noexcept {
Determines whether this vertex has the specified label.
" does not exist in the vertex.");
}

Result<std::vector<std::string>> VertexIter::label() noexcept {
Gets all labels of this vertex.
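To illustrate the two `label` overloads reviewed here, the following is a rough, self-contained sketch (C++17) of the same two lookups. It assumes labels are stored as one boolean column per label, indexed by vertex id. `LabelColumns`, `HasLabel`, and `LabelsOf` are hypothetical names for illustration and are not the GraphAr API; the PR itself returns `Result<...>` and reports an error when the label does not exist, which is modeled here with `std::nullopt`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for label chunks: one boolean column per
// label, indexed by vertex id.
using LabelColumns = std::map<std::string, std::vector<bool>>;

// Returns whether vertex `id` carries `label`, or std::nullopt when the
// label is not defined at all (the PR reports an error in that case).
std::optional<bool> HasLabel(const LabelColumns& columns, std::size_t id,
                             const std::string& label) {
  const auto it = columns.find(label);
  if (it == columns.end()) {
    return std::nullopt;
  }
  assert(id < it->second.size());
  return it->second[id];
}

// Collects every label whose column is set for vertex `id`.
// Labels come back in the map's (alphabetical) key order.
std::vector<std::string> LabelsOf(const LabelColumns& columns,
                                  std::size_t id) {
  std::vector<std::string> result;
  for (const auto& [name, column] : columns) {
    assert(id < column.size());
    if (column[id]) {
      result.push_back(name);
    }
  }
  return result;
}
```

Distinguishing "label absent from the schema" from "vertex does not have the label" is what motivates returning an optional (or a `Result`) rather than a plain `bool`.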
}

Result<std::shared_ptr<VerticesCollection>>
VerticesCollection::verticesWithMultipleLabelsbyAcero(
Multi label filtering API (Acero)
}

Result<std::shared_ptr<VerticesCollection>>
VerticesCollection::verticesWithMultipleLabels(
Multi-label filtering API; the input is a VerticesCollection.
const std::vector<std::string>& labels,
const bool& is_filtered = false,
const std::vector<IdType>& filtered_ids = {}) noexcept {
if (!labels.empty()) {
label_reader is used for reading label chunks
for (const auto& pg : vertex_info->GetPropertyGroups()) {
readers_.emplace_back(vertex_info, pg, prefix);
}
is_filtered_ = is_filtered;
is_filtered_ marks whether the collection/iterator is the result of a filtering operation.
for (const auto& pg : vertex_info->GetPropertyGroups()) {
readers_.emplace_back(vertex_info, pg, prefix);
}
is_filtered_ = is_filtered;
filtered_ids_ = filtered_ids;
The vector of vertex ids that remain after filtering.
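How an `is_filtered` flag and a list of filtered ids can support filtering on top of an earlier filtering result is sketched below. This is a simplified model, not the PR's implementation: `FilteredVertices`, `WithAllLabels`, and `LabelColumns` are hypothetical names, labels are assumed to be stored as one boolean column per label, and "multiple labels" is assumed to mean the vertex carries all of the requested labels, which the discussion here does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using IdType = std::int64_t;
// Hypothetical stand-in for label chunks: one boolean column per label.
using LabelColumns = std::map<std::string, std::vector<bool>>;

// Hypothetical simplified collection: when `is_filtered` is true, only the
// ids in `filtered_ids` belong to it; otherwise all ids in
// [0, vertex_count) do.
struct FilteredVertices {
  IdType vertex_count = 0;
  bool is_filtered = false;
  std::vector<IdType> filtered_ids;
};

// Keeps the vertices of `input` that carry every label in `labels`
// (an assumed AND semantics). Unknown labels match no vertex.
// Assumes each column holds `vertex_count` entries.
FilteredVertices WithAllLabels(const LabelColumns& columns,
                               const FilteredVertices& input,
                               const std::vector<std::string>& labels) {
  // Candidates: the previous filtering result, or the full id range.
  std::vector<IdType> candidates;
  if (input.is_filtered) {
    candidates = input.filtered_ids;
  } else {
    for (IdType id = 0; id < input.vertex_count; ++id) {
      candidates.push_back(id);
    }
  }

  FilteredVertices output{input.vertex_count, true, {}};
  for (IdType id : candidates) {
    assert(id >= 0 && id < input.vertex_count);
    bool matches = true;
    for (const auto& label : labels) {
      const auto it = columns.find(label);
      if (it == columns.end() ||
          !it->second[static_cast<std::size_t>(id)]) {
        matches = false;
        break;
      }
    }
    if (matches) {
      output.filtered_ids.push_back(id);
    }
  }
  return output;
}
```

Because the output is itself marked as filtered, it can be passed back in as `input`, so a second filter only scans the ids that survived the first one.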
LGTM! Thank you for your contribution~
Reason for this PR
Provides an API for label filtering.
What changes are included in this PR?
New label features: filtering vertices by specific labels.
Are these changes tested?
Yes. Efficiency is shown in the benchmark, and correctness is shown in the example.
Are there any user-facing changes?
Yes, the feature described above.