Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CORE] Consider the cost when applying stage fallback policy #3569

Merged
merged 5 commits into from
Nov 2, 2023

Conversation

PHILO-HE
Copy link
Contributor

What changes were proposed in this pull request?

When making a stage fallback, an extra ColumnarToRow may be required to adapt to the columnar output of last query stage(s). So this cost may need to be considered to get a net fallback number.

How was this patch tested?

An existing UT covers it.

@github-actions
Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@github-actions
Copy link

Run Gluten Clickhouse CI

@PHILO-HE
Copy link
Contributor Author

PHILO-HE commented Oct 31, 2023

Hi @ulysses-you, motivated by your previous work, this PR is proposed to evaluate the cost when applying stage fallback policy. Could you please help review to see if it makes sense in practice? Thanks!

@PHILO-HE PHILO-HE force-pushed the optimize-fallback-policy branch from 8dfd6cf to b7af6de Compare October 31, 2023 01:00
@github-actions
Copy link

Run Gluten Clickhouse CI

@PHILO-HE PHILO-HE force-pushed the optimize-fallback-policy branch from b7af6de to df8f871 Compare October 31, 2023 01:02
@github-actions
Copy link

Run Gluten Clickhouse CI

@@ -175,11 +190,15 @@ case class ExpandFallbackPolicy(isAdaptiveContext: Boolean, originalPlan: SparkP
return None
}

val numFallback = countFallbacks(plan)
if (numFallback >= fallbackThreshold) {
val netFallbackNum = if (isAdaptiveContext) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the net do you mean final ? Does final sound more friendly ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the net means "净值" to reflect stage fallback's benefit with cost deducted. I am afraid final may look like the fallback num at last after policy is applied. Do you think so?

Copy link
Contributor

@ulysses-you ulysses-you Nov 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the explanation, I'm fine with net

@github-actions
Copy link

Run Gluten Clickhouse CI

Copy link
Contributor

@ulysses-you ulysses-you left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @PHILO-HE , looks good to me except one comment

.foreach {
case stage: QueryStageExec
if stage.plan.isInstanceOf[GlutenPlan] ||
InMemoryTableScanHelper.isGlutenTableCache(stage) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also check it at line 143

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment! At line 143, we only find gluten plan whose one or more child is last stage. It seems this gluten plan cannot be gluten table cache since it's a leaf node without child. Right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just consider such case if the leaf node is Gluten TableCache, for example:

    GlutenTableCache
           |
HashAggregateTrasnformer
           |
HashAggregateTrasnformer   
           |
     ColumnarToRow 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this example, as GlutenTableCache cannot be connected to last stage, we don't need to consider two-stage's transition cost (ColumnarToRow) when falling back the current stage. Am I missing something?

Copy link

github-actions bot commented Nov 1, 2023

Run Gluten Clickhouse CI

@PHILO-HE PHILO-HE force-pushed the optimize-fallback-policy branch from 37969e9 to 9fd398a Compare November 1, 2023 07:35
Copy link

github-actions bot commented Nov 1, 2023

Run Gluten Clickhouse CI

@PHILO-HE PHILO-HE merged commit 81fae1e into apache:main Nov 2, 2023
16 checks passed
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3569_time.csv log/native_master_11_01_2023_697b2bda2_time.csv difference percentage
q1 34.42 34.18 -0.245 99.29%
q2 24.69 24.84 0.154 100.62%
q3 40.61 40.06 -0.551 98.64%
q4 35.45 37.29 1.833 105.17%
q5 70.32 71.36 1.043 101.48%
q6 8.89 8.95 0.054 100.60%
q7 87.93 86.23 -1.702 98.07%
q8 86.57 84.56 -2.011 97.68%
q9 121.76 121.30 -0.463 99.62%
q10 51.71 50.93 -0.778 98.50%
q11 19.92 20.30 0.387 101.94%
q12 27.41 26.50 -0.906 96.69%
q13 49.62 48.85 -0.771 98.45%
q14 17.97 18.13 0.162 100.90%
q15 35.13 35.06 -0.068 99.81%
q16 16.39 16.16 -0.230 98.60%
q17 102.74 102.07 -0.671 99.35%
q18 146.13 149.11 2.976 102.04%
q19 16.69 18.26 1.565 109.37%
q20 32.33 31.06 -1.270 96.07%
q21 226.28 223.74 -2.543 98.88%
q22 13.33 13.21 -0.129 99.03%
total 1266.32 1262.16 -4.166 99.67%

ulysses-you pushed a commit to ulysses-you/gluten that referenced this pull request Dec 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants