Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DNM #5632

Closed
wants to merge 20 commits into from
Closed

DNM #5632

wants to merge 20 commits into from

Conversation

acvictor
Copy link
Contributor

@acvictor acvictor commented May 7, 2024

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

(Fixes: #ISSUE-ID)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Copy link

github-actions bot commented May 7, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

github-actions bot commented May 7, 2024

Run Gluten Clickhouse CI

@FelixYBW
Copy link
Contributor

FelixYBW commented May 7, 2024

Can you comment a bit how these info are used by Velox? Is it aligned with Spark's behavior?

@gaoyangxiaozhu
Copy link
Contributor

can't we leverage metadatacolumns for this since it already contains fileSize and modifyTime ?

@acvictor
Copy link
Contributor Author

can't we leverage metadatacolumns for this since it already contains fileSize and modifyTime ?

Does metadatacolumns always contain file size and modification time? I understood it as containing these only if they are being queried. In this case I want it to always be passed in the split.

Copy link

github-actions bot commented Jun 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 6, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 6, 2024

Run Gluten Clickhouse CI

@acvictor
Copy link
Contributor Author

acvictor commented Jun 6, 2024

Can you comment a bit how these info are used by Velox? Is it aligned with Spark's behavior?

In Spark, properties are present in the split and obtained during listing. Earlier Velox would make one additional call to remote storage to fetch file length per openFileForRead call on a path even though this information is already present upstream wihle constructing the split. Velox now allows these values to be passed from the caller. By making this change we can eliminate one additional call to remote storage per path (RTT is in the order of 10s of milli seconds).

Copy link

github-actions bot commented Jun 7, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 7, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 7, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 7, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 7, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 8, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 8, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 8, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 8, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jun 9, 2024

Run Gluten Clickhouse CI

@acvictor acvictor changed the title [VL] Pass file size and modification time to split DNM Jun 10, 2024
Copy link

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the stale stale label Jul 26, 2024
Copy link

github-actions bot commented Aug 5, 2024

This PR was auto-closed because it has been stalled for 10 days with no activity. Please feel free to reopen if it is still valid. Thanks.

@github-actions github-actions bot closed this Aug 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants