[multi-label] add parameter 'label' for createinfos #634

Merged: 1 commit, Oct 28, 2024

Conversation

Contributor

@Elssky Elssky commented Sep 24, 2024

No description provided.

@Elssky

This comment was marked as resolved.

@Elssky Elssky changed the title feat(c++): write label chunks feat(c++): add paramater 'label' for createinfos Oct 23, 2024
@Elssky Elssky changed the title feat(c++): add paramater 'label' for createinfos feat(c++): add parameter 'label' for createinfos Oct 23, 2024
Contributor Author

Modified the createinfo function in the examples to add the label parameter. If there is no label, an empty array is used as a placeholder.
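
A minimal sketch of the idea in the comment above: a create function that takes a list of labels, where callers without labels pass an empty list. `VertexInfoSketch` and `CreateVertexInfoSketch` are hypothetical names invented for illustration; they are not the GraphAr API, and the real signature may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vertex info object; not the GraphAr class.
struct VertexInfoSketch {
  std::string type;
  std::int64_t chunk_size;
  std::vector<std::string> labels;  // empty when the vertex type has no labels
};

// Hypothetical factory that mirrors the idea of an explicit 'labels' parameter.
VertexInfoSketch CreateVertexInfoSketch(std::string type,
                                        std::int64_t chunk_size,
                                        std::vector<std::string> labels) {
  assert(chunk_size > 0 && "chunk size must be positive");
  return VertexInfoSketch{std::move(type), chunk_size, std::move(labels)};
}
```

A caller with labels might pass `{"university", "company"}`, while a caller without labels passes `{}` as the placeholder.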

Contributor Author

Same as above

Contributor Author

Same as above

Contributor Author

Added a test file for multi-label.

Contributor Author

RLE encoding, to reduce the space occupied
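
To illustrate why run-length encoding helps with boolean label columns, here is a simplified sketch. It is not Parquet's actual RLE format (which, as far as I know, is a hybrid of run-length encoding and bit-packing); `Run`, `RunLengthEncode`, and `RunLengthDecode` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One run: a boolean value and the number of consecutive rows that carry it.
using Run = std::pair<bool, std::size_t>;

// Collapse consecutive equal values into (value, count) runs.
std::vector<Run> RunLengthEncode(const std::vector<bool>& column) {
  std::vector<Run> runs;
  for (bool value : column) {
    if (!runs.empty() && runs.back().first == value) {
      ++runs.back().second;  // extend the current run
    } else {
      runs.emplace_back(value, 1);  // start a new run
    }
  }
  return runs;
}

// Expand the runs back into the original column.
std::vector<bool> RunLengthDecode(const std::vector<Run>& runs) {
  std::vector<bool> column;
  for (const Run& run : runs) {
    assert(run.second > 0 && "a run must cover at least one row");
    column.insert(column.end(), run.second, run.first);
  }
  return column;
}
```

A label column where most vertices share the same value produces few runs, so the encoded form stays small regardless of the row count.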

Contributor Author

label parameter

Contributor Author

process arrow::boolean data

Contributor Author

same as the example files

@Elssky Elssky changed the title feat(c++): add parameter 'label' for createinfos feat(c++): [multi-label]add parameter 'label' for createinfos Oct 25, 2024
@Elssky Elssky changed the title feat(c++): [multi-label]add parameter 'label' for createinfos [multi-label] add parameter 'label' for createinfos Oct 25, 2024
@Elssky
Contributor Author

Elssky commented Oct 25, 2024

@lixueclaire Please review this pull request, which adds labels to graph/vertex infos :)

parquet::WriterProperties::Builder builder;
builder.compression(arrow::Compression::type::ZSTD); // enable compression
for (int i = 0; i < column_num; ++i) {
Contributor

It appears that RLE encoding is currently applied to all columns. Could we adjust this so that RLE is used only on the label columns?

Contributor Author

for sure!
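
One way the requested change could look, as a sketch only: decide the encoding per column and pick RLE only for the label columns. `EncodingChoice` and `ChooseEncodings` are hypothetical names, not part of Parquet or GraphAr; applying the resulting plan to `parquet::WriterProperties::Builder` is left out and would need to be checked against the Parquet C++ API.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical encoding choices for this sketch; not the parquet::Encoding enum.
enum class EncodingChoice { kDefault, kRle };

// Map every column to an encoding: RLE for label columns, default otherwise.
std::map<std::string, EncodingChoice> ChooseEncodings(
    const std::vector<std::string>& column_names,
    const std::set<std::string>& label_names) {
  std::map<std::string, EncodingChoice> plan;
  for (const std::string& name : column_names) {
    assert(!name.empty() && "column names must be non-empty");
    plan[name] = label_names.count(name) > 0 ? EncodingChoice::kRle
                                             : EncodingChoice::kDefault;
  }
  return plan;
}
```

With columns `id`, `name`, `university`, `company` and labels `university`, `company`, only the last two columns would be marked for RLE.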

@@ -27,6 +27,7 @@ struct GeneralParams {
static constexpr const char* kDstIndexCol = "_graphArDstIndex";
static constexpr const char* kOffsetCol = "_graphArOffset";
static constexpr const char* kPrimaryCol = "_graphArPrimary";
static constexpr const char* kLabelCol = ":LABEL";
Contributor

Could we consider renaming the column to start with "_graphAr"?

Contributor Author

Yes. We used ":LABEL" to keep the column name consistent with the CSV, but we can of course change the name in the Arrow table.

Contributor Author

The usage example will be committed in the next PR.

Contributor

@lixueclaire lixueclaire left a comment

LGTM now. Thanks for your contribution!

@Elssky
Contributor Author

Elssky commented Oct 28, 2024

LGTM now. Thanks for your contribution!

Thanks! Please help merge this PR; I currently don’t have write access 😂

@lixueclaire lixueclaire merged commit 0961b64 into apache:main Oct 28, 2024
4 checks passed
2 participants