-
Notifications
You must be signed in to change notification settings - Fork 447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VL] Use conf to control C2R occupied memory #5952
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/apache/incubator-gluten/issues Then could you also rename commit message and pull request title in the following format?
See also: |
Run Gluten Clickhouse CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current design, the Column2Row operation is completed in one go, which consumes a lot of memory and causes the program OOM
do you mean current implementation will convert all columnar batches into unsafe rows in one run? it's seems unbelievable, could you double check it and give a case which c2r conversion's peak memory is extremely high?
virtual void | ||
convert(std::shared_ptr<ColumnarBatch> cb = nullptr, int64_t rowId = 0, int64_t memoryThreshold = INT64_MAX) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we configure memoryThreshold
for ColumnarToRowConverter when initializing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhztheplayer can you fix it?
shims/common/src/main/scala/org/apache/gluten/GlutenConfig.scala
Outdated
Show resolved
Hide resolved
public native NativeColumnarToRowInfo nativeColumnarToRowConvert(long batchHandle, long c2rHandle) | ||
throws RuntimeException; | ||
public native NativeColumnarToRowInfo nativeColumnarToRowConvert( | ||
long batchHandle, long c2rHandle, long rowId) throws RuntimeException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pass memoryThreshold to jni ? It should be easy to track the config at java side.
@XinShuoWang Can you update? |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
2 similar comments
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
@XinShuoWang can you fix the UT? |
Run Gluten Clickhouse CI |
@XinShuoWang The UT fails as the default buffer size is 64MB and it's too large when task memory is small. You may can change the threshold value for these UTs. |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
@XinShuoWang can you fix the UT? the bug triggered many queries failed in our test. |
Run Gluten Clickhouse CI |
@XinShuoWang Would you continue to fix the issue? If not we can pickup and fix it. The bug blocked our adoption |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Error still occurs at case
|
Run Gluten Clickhouse CI |
I fixed a bit. Now if the batch is small, we needn't alloate 64M memory, but the accurate memory |
Is it related? I can't see |
Run Gluten Clickhouse CI |
What changes were proposed in this pull request?
In the current design, the Column2Row operation is completed in one go, which consumes a lot of memory and causes the program OOM. In this commit, I modified the C2R operation into multiple operations, which will greatly reduce the peak memory of the C2R operation. In addition, there should be some performance advantages from memory reuse.
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)