Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VL] Use conf to control C2R occupied memory #5799

Closed
wants to merge 0 commits into from

Conversation

XinShuoWang
Copy link
Contributor

What changes were proposed in this pull request?

In the current design, the Column2Row operation is completed in one go, which consumes a lot of memory and causes the program OOM. In this commit, I modified the C2R operation into multiple operations, which will greatly reduce the peak memory of the C2R operation. In addition, there should be some performance advantages from memory reuse.

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

Run Gluten Clickhouse CI

@FelixYBW
Copy link
Contributor

@jinchengchenghh

Copy link

Run Gluten Clickhouse CI

1 similar comment
Copy link

Run Gluten Clickhouse CI

@XinShuoWang XinShuoWang marked this pull request as ready for review May 19, 2024 09:17
@XinShuoWang
Copy link
Contributor Author

@jinchengchenghh @zhztheplayer Can you review this pr?

@XinShuoWang XinShuoWang changed the title Use conf to control C2R occupied memory [VL] Use conf to control C2R occupied memory May 19, 2024
Copy link

Run Gluten Clickhouse CI

Copy link
Contributor

@xumingming xumingming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, some comments.

cpp/core/operators/c2r/ColumnarToRow.h Outdated Show resolved Hide resolved
auto columnarToRowConverter = ctx->objectStore()->retrieve<ColumnarToRowConverter>(c2rHandle);
auto cb = ctx->objectStore()->retrieve<ColumnarBatch>(batchHandle);
columnarToRowConverter->convert(cb);

int64_t column2RowMemThreshold = 256 * 1024 * 1024;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you place this config computation to GlutenConfig.cc

Copy link

Run Gluten Clickhouse CI

@FelixYBW
Copy link
Contributor

I met similar issue recently, reports as ArrowContext OOM. Is the root cause that the RowVector too larger? or because the data in each row is too large? By default GLuten configure Velox as 4K batch, but looks many Velox operators may exceed 4K limit.

Does the PR hold on the batch and output part of the batch each time?

@FelixYBW
Copy link
Contributor

We may need couple of UTs to cover this

@XinShuoWang
Copy link
Contributor Author

I met similar issue recently, reports as ArrowContext OOM. Is the root cause that the RowVector too larger? or because the data in each row is too large? By default GLuten configure Velox as 4K batch, but looks many Velox operators may exceed 4K limit.

Does the PR hold on the batch and output part of the batch each time?

  • In my test case, each row is only 20KB in size, but the size of RowVector is very large, which eventually leads to OOM.
  • In my test case the number of output lines exceeds the 4K limit is because of the Generate Operator.
  • This PR is aimed to hold on the batch and output part of the batch each time

@@ -580,17 +583,28 @@ Java_org_apache_gluten_vectorized_NativeColumnarToRowJniWrapper_nativeColumnarTo
JNIEnv* env,
jobject wrapper,
jlong batchHandle,
jlong c2rHandle) {
jlong c2rHandle,
jlong rowId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pass a range [startRowId, endRowId) to naitve ? I think it should be more clear to use the config at java side.

@FelixYBW
Copy link
Contributor

FelixYBW commented May 20, 2024

I met similar issue recently, reports as ArrowContext OOM. Is the root cause that the RowVector too larger? or because the data in each row is too large? By default GLuten configure Velox as 4K batch, but looks many Velox operators may exceed 4K limit.
Does the PR hold on the batch and output part of the batch each time?

  • In my test case, each row is only 20KB in size, but the size of RowVector is very large, which eventually leads to OOM.
  • In my test case the number of output lines exceeds the 4K limit is because of the Generate Operator.
  • This PR is aimed to hold on the batch and output part of the batch each time

Can we use the batch size (4K row) by default as the config? We can still add the threshold but if customer doesn't config threshold, let's convert batch size (4k by default) instead of 256M memory.

We can still reuse the memory allocated for row but I don't expect much perf gain here. It's too large for L1/L2 cache, very likely to be evicted from L3 as well when Spark process this row.

Copy link

Run Gluten Clickhouse CI

@FelixYBW
Copy link
Contributor

why the PR is closed?

@XinShuoWang
Copy link
Contributor Author

why the PR is closed?

Sorry, I accidentally forced push the main branch, which caused the PR to be closed. The new PR is here: #5952.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants