
[GLUTEN-7750][VL] Move ColumnarBuildSideRelation's memory occupation to Spark off-heap #8127

Merged: 3 commits into apache:main on Jan 2, 2025

Conversation

@zjuwangg (Contributor) commented Dec 3, 2024

What changes were proposed in this pull request?

Implement feature: #7750

How was this patch tested?

Added a unit test and manually ran TPC-DS on my dev box.

@zjuwangg zjuwangg marked this pull request as draft December 3, 2024 02:28
@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Dec 3, 2024

github-actions bot commented Dec 3, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and the pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:


github-actions bot commented Dec 3, 2024

Run Gluten Clickhouse CI on x86



@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from 6a0e31f to 17c8b5a Compare December 11, 2024 10:16

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from 640cf4e to d23b542 Compare December 12, 2024 12:07
@zjuwangg zjuwangg marked this pull request as ready for review December 12, 2024 12:07
@zjuwangg zjuwangg changed the title [Draft][Gluten-7750][VL] Move ColumnarBuildSideRelation's memory occupation to Spark off-heap [Gluten-7750][VL] Move ColumnarBuildSideRelation's memory occupation to Spark off-heap Dec 12, 2024

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from d23b542 to b3e118f Compare December 12, 2024 13:09

@zhztheplayer zhztheplayer self-requested a review December 17, 2024 02:29
@zhztheplayer zhztheplayer changed the title [Gluten-7750][VL] Move ColumnarBuildSideRelation's memory occupation to Spark off-heap [GLUTEN-7750][VL] Move ColumnarBuildSideRelation's memory occupation to Spark off-heap Dec 18, 2024

@zhouyuan (Contributor):

CC @surnaik

@zhztheplayer (Member) left a comment:

Just took a rough pass through the PR. It seems to be in good shape already.

Saw the CI is failing, by the way.

Comment on lines 127 to 141
val taskMemoryManager = new TaskMemoryManager(
new UnifiedMemoryManager(SparkEnv.get.conf, Long.MaxValue, Long.MaxValue / 2, 1),
0)
@zhztheplayer (Member):

Would you like to explain this code a bit? Thanks!

@zjuwangg (Contributor, Author):

This relation is used in a broadcast and is shared by multiple tasks, so a new standalone TaskMemoryManager is introduced here instead of using the task-bound TaskMemoryManager. This is similar to UnsafeHashedRelation; see https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala#L389-L410

@zhztheplayer (Member) commented Dec 23, 2024:

Thanks for the explanation. And I need to apologize: my previous comment, #7750 (comment), was wrong, because we are apparently facing the same issue the Gazelle project hit, here again.

I see finalize() is used, and I assume it works in your test, right? Given that invocations of finalize() rely on JVM GC, how do we make sure the off-heap memory is released, to avoid off-heap overhead, even when JVM GC is never triggered?

This code (lines 127-129) looks feasible, but the allocation is still not managed by the default memory manager. Do we have a better approach? Otherwise it will lead to YARN-kill issues.

Let's revisit the topic and the code a little, and hopefully we can come up with an optimal solution. My description in #7750 was indeed a bit misleading; this issue doesn't look to have any kind of "simple" solution so far.

@yikf (Contributor) commented Dec 26, 2024:

@zhztheplayer Spark uses the Cleaner GC mechanism to clean up broadcasts, so finalize() should also be a feasible solution, but it cannot clean up proactively when off-heap memory is insufficient. As for needing a unified memory manager to manage this memory, we can directly use SparkEnv.get.memoryManager; see https://github.com/apache/incubator-gluten/pull/8349/files#diff-f4bdaf17dd276a8e42e2a1d9b2b865ea8d68c23b05523fae56e900fd78f972e5R151
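The trade-off discussed above, GC-driven release versus an explicit release path, can be sketched with plain JVM facilities. The snippet below is a hypothetical stand-in, not Gluten code: java.lang.ref.Cleaner behaves like finalize() in that the registered action runs only after GC makes the object unreachable, but its Cleanable handle also allows an eager, idempotent clean() call when off-heap memory must be freed immediately.

```java
import java.lang.ref.Cleaner;
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for an off-heap-backed relation; a direct
// ByteBuffer plays the role of the native memory that the real
// UnsafeColumnarBuildSideRelation manages.
class OffHeapRelation implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The release action must NOT reference the owning OffHeapRelation,
    // or the object would never become phantom-reachable.
    static final class ReleaseAction implements Runnable {
        final AtomicBoolean released = new AtomicBoolean(false);
        @Override public void run() {
            // Idempotent release: runs when close() is called explicitly,
            // or eventually after GC if the owner forgot to close.
            released.compareAndSet(false, true);
        }
    }

    private final ByteBuffer offHeap;          // the resource being guarded
    final ReleaseAction action = new ReleaseAction();
    private final Cleaner.Cleanable cleanable;

    OffHeapRelation(int bytes) {
        this.offHeap = ByteBuffer.allocateDirect(bytes);
        this.cleanable = CLEANER.register(this, action);
    }

    // Explicit, proactive release path: callable when off-heap memory is
    // insufficient, no GC required. Safe to call more than once.
    @Override public void close() { cleanable.clean(); }

    boolean isReleased() { return action.released.get(); }
}
```

With finalize() alone, only the GC-driven path exists; Cleanable.clean() adds the eager one, which is what proactive cleanup under off-heap pressure needs.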

@yikf (Contributor):

Edit:

Broadcast variables can usually be assumed to be small enough, so the finalize() approach is roughly acceptable (maybe I'm wrong). Later we can look for a more suitable solution, such as implementing spill support for UnsafeBytesBufferArray.

@zhztheplayer (Member):

@yikf Thank you for continuing the work. I was on vacation today as well, so sorry for the late reply.

> @zhztheplayer Hi, Terry is on vacation recently. What's your opinion on this idea? I can continue to follow up.

I was thinking about using storage memory, but the implementation is complicated? If so, let's put that idea on hold.

> we can directly use SparkEnv.get.memoryManager.

👍 Sounds good to me if it can utilize Spark's off-heap memory, since at least the memory is then tracked. The release could still be a problem, though, because Spark doesn't call System.gc() on off-heap OOM.

If the issue does apply, maybe we can consider one of the following:

Option 1. Hook into Spark to make sure System.gc() is triggered on OOM (if there's an approach that doesn't require modifying vanilla Spark's code).
Option 2. Set a fixed capacity (e.g., 15% of off-heap memory) for the standalone task memory manager used in this PR. When that part of memory runs out, trigger System.gc(); if the allocation is still unsatisfied, throw an OOM.

The above is just based on my assumptions. Let me know if there are other possible solutions.
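Option 2 can be sketched as a small quota tracker. Everything below is hypothetical and invented for illustration (the class name, the injectable GC hook); it only demonstrates the acquire / System.gc() / retry / OOM control flow, not any real Spark or Gluten API.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of Option 2: a fixed slice of off-heap capacity
// reserved for broadcast relations. When an allocation doesn't fit, we
// trigger GC once (hoping Cleaner/finalize releases run), then retry;
// if it still doesn't fit, fail with an OOM instead of overcommitting.
class BroadcastOffHeapQuota {
    private final long capacity;      // e.g. 15% of spark.memory.offHeap.size
    private final AtomicLong used = new AtomicLong(0);
    private final Runnable gcHook;    // injectable for testing; System::gc in production

    BroadcastOffHeapQuota(long capacity, Runnable gcHook) {
        this.capacity = capacity;
        this.gcHook = gcHook;
    }

    // Lock-free reservation: succeed only if the quota is not exceeded.
    private boolean tryReserve(long bytes) {
        while (true) {
            long cur = used.get();
            if (cur + bytes > capacity) return false;
            if (used.compareAndSet(cur, cur + bytes)) return true;
        }
    }

    long acquire(long bytes) {
        if (tryReserve(bytes)) return bytes;
        gcHook.run();                 // last-ditch attempt to free garbage-held memory
        if (tryReserve(bytes)) return bytes;
        throw new OutOfMemoryError("Broadcast off-heap quota exhausted: requested "
            + bytes + ", used " + used.get() + " of " + capacity);
    }

    void release(long bytes) { used.addAndGet(-bytes); }

    long used() { return used.get(); }
}
```

The GC hook is a parameter so the OOM path can be exercised deterministically in tests; the production wiring would pass System::gc.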

@yikf (Contributor) commented Dec 27, 2024:

@zhztheplayer There is another idea, though its feasibility is uncertain.

Spark uses the GC mechanism on the driver side to clean broadcast variables through ContextCleaner, and ContextCleaner supports attaching listeners (attachListener). We could fully utilize this mechanism.

Prerequisites:

  1. Implement our own set of endpoints for off-heap broadcast cleanup.
  2. Set up the endpoint when off-heap broadcast is used.
  3. Have UnsafeColumnarBuildSideRelation record the broadcast id.
  4. Implement a ContextCleaner listener.
  5. When deserializing UnsafeColumnarBuildSideRelation on the executor side, use a singleton class to record the broadcast id and the UnsafeColumnarBuildSideRelation (readExternal).

Then, when the driver triggers broadcast cleanup, the listener catches the event and sends a cleanup request to the executor side through the endpoint. The executor's endpoint uses the broadcastId to find the corresponding relation and actively invokes its release method to free the off-heap memory, with no finalize() involved.

The idea is similar to how broadcast blocks are cleaned on the executor side.

Not sure if it will work, but if it does, this idea seems to be a final solution that fits the current Spark mechanics.
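The executor-side bookkeeping in steps 3 and 5 can be sketched as a tiny registry. All names below are invented for illustration; the RPC endpoint and the ContextCleaner listener that would call onBroadcastCleaned are omitted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical executor-side registry: deserialization registers each
// off-heap relation under its broadcast id, and the cleanup endpoint
// releases it explicitly when the driver reports the broadcast as dead.
final class OffHeapBroadcastRegistry {
    interface Releasable { void release(); }

    private static final Map<Long, Releasable> RELATIONS = new ConcurrentHashMap<>();

    private OffHeapBroadcastRegistry() {}

    // Called from readExternal() on the executor: remember which
    // broadcast id owns this off-heap relation.
    static void register(long broadcastId, Releasable relation) {
        RELATIONS.put(broadcastId, relation);
    }

    // Called by the executor-side cleanup endpoint when the driver's
    // listener forwards a broadcast-cleaned event: free the off-heap
    // memory explicitly instead of waiting for finalize().
    static boolean onBroadcastCleaned(long broadcastId) {
        Releasable r = RELATIONS.remove(broadcastId);
        if (r == null) return false; // already cleaned, or never registered here
        r.release();
        return true;
    }

    static int size() { return RELATIONS.size(); }
}
```

remove() before release() makes the cleanup idempotent under concurrent events, which matters because the driver may broadcast the cleaning request to every executor.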

@zhztheplayer (Member) commented Dec 27, 2024:

Thank you for the thoughtful insights.

More or less, the idea looks similar to the solution the CH backend is using. In case you happened to miss it, you could refer to the code starting from here.

It looks feasible to me, though two limitations come to mind with this solution:

  1. The GC timing of the driver-side variable is still uncertain.
  2. What if the driver-side variable is GC-ed while the executor-side variable is still in use? (Perhaps set the driver-side variable to null manually, immediately after broadcasting, to test this.)

(Contributor):

#2 does not seem to happen: at least until the end of the current stage, functions on the driver side will keep referring to the broadcast variable, though I'm not sure whether that is always the case.

The GC timing on the driver side is also not controllable. But let's revisit the topic: we need to free off-heap memory when the broadcast variable is no longer in use, yet there doesn't seem to be any definite signal on the executor side that recognizes the broadcast is no longer in use, so we would have to trigger the release via GC, which is somewhat strange in nature, whether the GC is triggered actively or by the JVM. And actively triggering GC is not a good solution in my opinion; it affects a lot of things.

@zjuwangg (Contributor, Author):

> I was thinking about using storage memory but the implementation is complicated? If so let's put that idea on hold.

Using storage memory is a bit more complex, and Spark's broadcast mechanism is itself basically implemented on top of Spark's storage memory. We can try that route later to see if it works.

@yikf (Contributor) left a comment:

LGTM overall, with a few minor comments. I think it is in good shape to merge once CI passes.

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from b34bc8c to 18f1c01 Compare December 23, 2024 09:56

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from 3845bf4 to d05429b Compare December 25, 2024 17:24

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from d05429b to 1a3d339 Compare December 25, 2024 17:43

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from 1a3d339 to 1811c06 Compare December 25, 2024 17:54

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from 1811c06 to e526095 Compare December 27, 2024 06:01
@lwz9103 (Contributor) commented Dec 27, 2024:

Run Gluten Clickhouse CI on x86

@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from e526095 to c69940f Compare December 27, 2024 10:23

@zhztheplayer (Member) left a comment:

Basically only the release issue remains, from my perspective. Perhaps we can find a temporary solution and then merge this next week. Thanks, everyone.


@zjuwangg (Contributor, Author):

Just made offHeapBroadcastBuildRelation default to false and addressed the comments!
Thanks for your review @zhztheplayer @yikf

@zhztheplayer (Member) left a comment:

This looks feasible to merge, since the feature is gated behind experimental flags.

Thank you for working on this, @zjuwangg. Also, would you like to open an issue ticket in advance to track the OOM risk we mentioned? Thanks.

@zjuwangg (Contributor, Author) commented Jan 1, 2025 via email


@zjuwangg zjuwangg force-pushed the m_move_broadcast_2_offheap branch from 1c534ce to efff600 Compare January 2, 2025 10:31

@zhztheplayer zhztheplayer merged commit dda601b into apache:main Jan 2, 2025
48 checks passed
@zjuwangg zjuwangg deleted the m_move_broadcast_2_offheap branch January 10, 2025 06:19