[GLUTEN-6876] Support Spark-352 #7138

Merged
merged 20 commits into from
Oct 15, 2024

@zhouyuan (Contributor) commented Sep 6, 2024

What changes were proposed in this pull request?

Support Spark 3.5.2

fixes #6876

How was this patch tested?

Passes GitHub Actions (GHA) CI

@github-actions bot added the CORE (works for Gluten Core), INFRA, and TOOLS labels Sep 6, 2024

github-actions bot commented Sep 6, 2024

#6876

github-actions bot commented Sep 6, 2024

Run Gluten Clickhouse CI

@@ -107,7 +107,7 @@ class ColumnarShuffleManager(conf: SparkConf) extends ShuffleManager with Loggin
     metrics,
     shuffleExecutorComponents)
   case other: BaseShuffleHandle[K @unchecked, V @unchecked, _] =>
-    new SortShuffleWriter(other, mapId, context, shuffleExecutorComponents)
+    new SortShuffleWriter(other, mapId, context, _, shuffleExecutorComponents)
Member

Does this mean that Gluten will not guarantee compatibility across Spark minor versions?

Contributor Author

Hi @wForget,
Indeed, compatibility is not guaranteed for all minor versions. It's actually the same for Spark 3.2, 3.3, and 3.4: each line is only tested with its latest minor release.
This is mostly due to the shim layer design, and there are also not enough CI/CD resources to cover so many small versions.
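
For readers unfamiliar with the shim approach, the trade-off described here can be sketched roughly as follows. This is a minimal, self-contained illustration and not Gluten's actual loader: `SparkVersion`, `Compatibility`, and `ShimCheck` are hypothetical names, and the idea that a shim is keyed to one exact tested release is an assumption drawn from the comment above.

```scala
// Hypothetical sketch: none of these names come from Gluten itself.
final case class SparkVersion(major: Int, minor: Int, patch: Int)

object SparkVersion {
  private val Pattern = """(\d+)\.(\d+)\.(\d+).*""".r

  // Parses "3.5.2" or "3.5.2-SNAPSHOT"; returns None for anything else.
  def parse(s: String): Option[SparkVersion] = s match {
    case Pattern(a, b, c) => Some(SparkVersion(a.toInt, b.toInt, c.toInt))
    case _                => None
  }
}

sealed trait Compatibility
case object Tested extends Compatibility        // exact release the shim was tested against
case object UntestedPatch extends Compatibility // same major.minor line, different release
case object Unsupported extends Compatibility   // no shim exists for this line

object ShimCheck {
  // Classifies the running Spark version against the set of tested releases.
  def check(tested: Set[SparkVersion], running: SparkVersion): Compatibility =
    if (tested.contains(running)) Tested
    else if (tested.exists(t => t.major == running.major && t.minor == running.minor))
      UntestedPatch
    else Unsupported
}
```

With `tested = Set(SparkVersion(3, 5, 2))`, a cluster running 3.5.1 would fall into `UntestedPatch`: it may work, but a change to an internal API between releases, such as the `SortShuffleWriter` constructor in the diff above, can break it.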

Member

I see, thank you.

@zhouyuan zhouyuan marked this pull request as draft September 6, 2024 06:11

p: SparkPlan,
reason: String,
fallbackNodeToReason: mutable.HashMap[String, String]): Unit = {
p.getTagValue(QueryPlan.OP_ID_TAG).foreach {

https://issues.apache.org/jira/browse/SPARK-48610

QueryPlan.OP_ID_TAG doesn't exist in Spark 3.5.2+ / 4.0.0.

Contributor Author

Thanks. There's also a minor change in SortShuffleWriter; I will move these changes to the shim layer.
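
As a rough illustration of what moving such a change into the shim layer means, here is a minimal sketch. It uses toy stand-ins rather than Spark classes: `Node`, `OperatorIdShim`, `TagBasedShim`, `SideMapShim`, and `FallbackReporter` are all hypothetical, and the side-map storage in the second shim is only an assumption about how a version without the tag might track operator ids.

```scala
import java.util.IdentityHashMap
import scala.collection.mutable

// Toy plan node standing in for SparkPlan; it carries mutable tags.
final class Node(val name: String) {
  val tags: mutable.Map[String, Int] = mutable.Map.empty
}

// Version-neutral interface that common code calls instead of a Spark internal.
trait OperatorIdShim {
  def setOperatorId(node: Node, id: Int): Unit
  def getOperatorId(node: Node): Option[Int]
}

// Models versions where the operator id is stored as a tag on the node.
object TagBasedShim extends OperatorIdShim {
  private val OpIdTag = "operatorId"
  def setOperatorId(node: Node, id: Int): Unit = node.tags(OpIdTag) = id
  def getOperatorId(node: Node): Option[Int] = node.tags.get(OpIdTag)
}

// Models versions where the tag is gone (assumption: ids live in a side table
// keyed by node identity).
final class SideMapShim extends OperatorIdShim {
  private val ids = new IdentityHashMap[Node, Integer]()
  def setOperatorId(node: Node, id: Int): Unit = ids.put(node, Integer.valueOf(id))
  def getOperatorId(node: Node): Option[Int] = Option(ids.get(node)).map(_.intValue())
}

// Common code depends only on the shim, so it compiles against every version.
object FallbackReporter {
  def describe(shim: OperatorIdShim, node: Node): String =
    shim.getOperatorId(node).map(id => s"($id) ${node.name}").getOrElse(node.name)
}
```

The point of the design is that only the per-version shim module references the API that changed; callers such as the fallback-reason code stay identical across Spark versions.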

github-actions bot commented Sep 9, 2024

Run Gluten Clickhouse CI

github-actions bot commented Oct 4, 2024

Run Gluten Clickhouse CI

github-actions bot commented Oct 9, 2024

Run Gluten Clickhouse CI

@zhouyuan zhouyuan marked this pull request as ready for review October 9, 2024 05:35
@leoluan2009
Contributor

Spark 3.5.3 is out https://spark.apache.org/news/spark-3-5-3-released.html, should we upgrade to 3.5.3?

@zhouyuan
Contributor Author

zhouyuan commented Oct 9, 2024

> Spark 3.5.3 is out https://spark.apache.org/news/spark-3-5-3-released.html, should we upgrade to 3.5.3?

Yes, I plan to submit a new patch for Spark 3.5.3 after this one. There is actually not much change from 3.5.2 to 3.5.3.
Here's the discussion from before the PRC holiday:
#7336

Thanks, -yuan

@zhouyuan
Contributor Author

@baibaichen @zhztheplayer

@baibaichen
Contributor

Let me test the performance with the CH (ClickHouse) backend.

Contributor

@baibaichen left a comment

LGTM

@zhouyuan zhouyuan merged commit 32cd1dc into apache:main Oct 15, 2024
45 checks passed
Successfully merging this pull request may close these issues.

Support Spark 352
7 participants