[VL] Set s.g.s.c.b.v.coalesceBatchesBeforeShuffle=true by default #6056

Merged
merged 10 commits into from
Jun 18, 2024

Conversation

zhztheplayer
Member

@zhztheplayer zhztheplayer commented Jun 12, 2024

Related to #6009

Set spark.gluten.sql.columnar.backend.velox.coalesceBatchesBeforeShuffle=true by default so that shuffle performance is never impacted by the issue.

The default batch size is changed to 0.8 * GLUTEN_MAX_BATCH_SIZE. This leaves some headroom, so buffers that are already large enough are not unexpectedly combined.
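
For illustration only, here is a minimal sketch of why a threshold slightly below the maximum helps. It models batches as plain row counts and is not Gluten's actual implementation; the function name `coalesce`, the value 4096 for GLUTEN_MAX_BATCH_SIZE, and the pass-through rule are assumptions made for this example.

```python
# Sketch only: batches are modeled as row counts, not real columnar data.
GLUTEN_MAX_BATCH_SIZE = 4096                       # assumed value for illustration
MIN_BATCH_SIZE = int(0.8 * GLUTEN_MAX_BATCH_SIZE)  # 3276 rows

# The configuration key named in this PR, shown as a plain key/value pair.
conf = {
    "spark.gluten.sql.columnar.backend.velox.coalesceBatchesBeforeShuffle": "true",
}


def coalesce(batch_rows, min_rows=MIN_BATCH_SIZE):
    """Append consecutive small batches until the buffer holds min_rows rows."""
    out, buffered = [], 0
    for rows in batch_rows:
        if buffered == 0 and rows >= min_rows:
            out.append(rows)       # already large enough: pass through untouched
            continue
        buffered += rows           # otherwise keep appending to the buffer
        if buffered >= min_rows:
            out.append(buffered)
            buffered = 0
    if buffered:
        out.append(buffered)       # flush whatever is left at end of input
    return out


# With the threshold equal to the maximum, two 4000-row batches get merged.
print(coalesce([4000, 4000], min_rows=GLUTEN_MAX_BATCH_SIZE))
# With the 0.8 threshold, each 4000-row batch is left alone.
print(coalesce([4000, 4000]))
```

In this model, a batch that is close to the maximum but not quite at it counts as "small" when the threshold equals the maximum, so it is merged with its neighbor into an oversized batch. Lowering the threshold to 0.8 of the maximum avoids that case.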

@apache apache deleted a comment from github-actions bot Jun 14, 2024
@apache apache deleted a comment from github-actions bot Jun 14, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@zhztheplayer zhztheplayer force-pushed the wip-coalesce-default branch from df1138e to 36ecb49 Compare June 14, 2024 06:24

Run Gluten Clickhouse CI

@zhztheplayer
Member Author

/Benchmark TPCDS

Run Gluten Clickhouse CI

@zhztheplayer
Member Author

/Benchmark Velox TPCDS

@zhztheplayer zhztheplayer marked this pull request as ready for review June 17, 2024 00:48
@GlutenPerfBot
Contributor

===== Performance report for TPCDS SF2000 with Velox backend, for reference only ====

query log/native_6056_time.csv log/native_master_06_16_2024_a08a57c61f_time.csv difference percentage
q1 14.66 14.69 0.024 100.17%
q2 16.46 15.18 -1.281 92.22%
q3 4.33 4.91 0.575 113.27%
q4 63.86 66.82 2.963 104.64%
q5 6.62 8.33 1.709 125.80%
q6 3.45 2.16 -1.283 62.77%
q7 6.02 6.30 0.273 104.53%
q8 5.63 5.96 0.329 105.85%
q9 19.22 17.57 -1.654 91.40%
q10 11.33 11.03 -0.298 97.37%
q11 38.91 35.51 -3.394 91.28%
q12 1.50 1.44 -0.058 96.16%
q13 5.46 5.31 -0.145 97.35%
q14a 45.55 43.42 -2.136 95.31%
q14b 41.52 38.81 -2.713 93.47%
q15 2.71 3.85 1.145 142.30%
q16 39.45 42.00 2.559 106.49%
q17 4.86 4.73 -0.126 97.41%
q18 6.33 6.16 -0.165 97.39%
q19 2.20 3.59 1.399 163.72%
q20 1.47 2.71 1.242 184.47%
q21 1.03 1.35 0.320 131.01%
q22 10.91 8.13 -2.778 74.54%
q23a 80.98 82.95 1.969 102.43%
q23b 103.32 100.23 -3.089 97.01%
q24a 75.42 70.78 -4.636 93.85%
q24b 69.77 76.15 6.372 109.13%
q25 4.29 6.22 1.926 144.90%
q26 4.29 4.83 0.538 112.53%
q27 3.40 3.25 -0.149 95.63%
q28 20.92 21.89 0.975 104.66%
q29 8.29 7.91 -0.383 95.38%
q30 8.33 4.35 -3.980 52.22%
q31 6.45 6.50 0.054 100.84%
q32 1.31 1.11 -0.200 84.74%
q33 7.02 7.95 0.929 113.23%
q34 4.42 5.74 1.322 129.89%
q35 7.51 7.07 -0.433 94.23%
q36 3.82 3.56 -0.267 93.01%
q37 5.11 4.21 -0.894 82.51%
q38 11.75 12.83 1.073 109.13%
q39a 3.28 3.18 -0.099 96.98%
q39b 2.75 2.83 0.083 103.02%
q40 8.61 6.35 -2.259 73.75%
q41 0.70 0.61 -0.092 86.97%
q42 2.75 1.10 -1.648 40.03%
q43 3.87 3.90 0.022 100.58%
q44 10.42 13.59 3.170 130.42%
q45 5.63 4.16 -1.467 73.92%
q46 3.31 3.39 0.080 102.43%
q47 14.32 14.38 0.061 100.43%
q48 4.63 4.63 -0.001 99.99%
q49 9.50 9.65 0.152 101.60%
q50 27.59 24.71 -2.881 89.56%
q51 8.73 8.97 0.237 102.72%
q52 0.99 1.11 0.120 112.16%
q53 1.98 2.99 1.014 151.24%
q54 3.28 3.36 0.077 102.34%
q55 1.04 1.17 0.122 111.72%
q56 4.36 4.57 0.209 104.79%
q57 8.68 9.20 0.514 105.92%
q58 2.47 2.66 0.187 107.58%
q59 15.86 14.06 -1.803 88.64%
q60 4.68 4.64 -0.036 99.22%
q61 5.44 5.22 -0.218 95.99%
q62 4.41 6.60 2.195 149.81%
q63 2.23 2.10 -0.122 94.54%
q64 53.49 53.36 -0.129 99.76%
q65 13.83 16.93 3.096 122.38%
q66 3.45 3.16 -0.289 91.61%
q67 377.08 363.54 -13.536 96.41%
q68 3.54 3.72 0.181 105.11%
q69 7.48 9.96 2.480 133.16%
q70 8.95 8.93 -0.029 99.68%
q71 2.63 2.57 -0.059 97.74%
q72 188.52 192.42 3.896 102.07%
q73 2.20 2.26 0.052 102.38%
q74 21.44 21.15 -0.293 98.63%
q75 27.37 24.94 -2.434 91.11%
q76 10.33 12.49 2.165 120.96%
q77 2.22 2.33 0.113 105.07%
q78 43.34 42.10 -1.239 97.14%
q79 3.54 3.64 0.099 102.79%
q80 12.91 15.73 2.819 121.84%
q81 5.38 5.14 -0.235 95.63%
q82 7.71 7.76 0.051 100.67%
q83 1.48 2.36 0.879 159.51%
q84 2.71 2.82 0.110 104.06%
q85 7.91 8.11 0.199 102.52%
q86 3.41 3.26 -0.149 95.64%
q87 12.33 12.39 0.061 100.50%
q88 25.13 28.73 3.599 114.32%
q89 3.44 3.43 -0.006 99.82%
q90 4.29 4.35 0.057 101.33%
q91 2.62 2.62 -0.001 99.97%
q92 1.27 1.25 -0.017 98.66%
q93 32.35 30.88 -1.473 95.45%
q94 22.77 22.55 -0.218 99.04%
q95 89.00 87.33 -1.679 98.11%
q96 3.86 3.78 -0.079 97.94%
q97 12.29 11.99 -0.298 97.57%
q98 2.07 2.02 -0.048 97.69%
q99 8.51 13.03 4.522 153.14%
total 1972.25 1969.68 -2.575 99.87%
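
The report does not define its last two columns. The values are consistent, up to rounding of the displayed times, with difference = master time minus PR time and percentage = master time divided by PR time, so a percentage above 100% means the PR run was faster. The sketch below recomputes both under that inferred reading; the function name `compare` is hypothetical.

```python
def compare(pr_time, master_time):
    """Recompute the report's last two columns under the inferred definitions."""
    difference = master_time - pr_time        # negative: the PR run was slower
    percentage = master_time / pr_time * 100  # above 100: the PR run was faster
    return round(difference, 3), round(percentage, 2)


# q30 row from the report: 8.33 (PR) vs 4.35 (master)
print(compare(8.33, 4.35))
# total row from the report: 1972.25 (PR) vs 1969.68 (master)
print(compare(1972.25, 1969.68))
```

Small deviations from the printed figures (for example in the q1 row) are expected, because the report appears to compute these columns from unrounded timings.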

@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_6056_time.csv log/native_master_06_16_2024_a08a57c61_time.csv difference percentage
q1 37.54 37.13 -0.404 98.92%
q2 23.89 24.00 0.101 100.42%
q3 40.70 37.23 -3.469 91.48%
q4 32.45 32.75 0.298 100.92%
q5 73.14 71.56 -1.577 97.84%
q6 9.81 8.16 -1.647 83.20%
q7 80.72 80.98 0.265 100.33%
q8 85.24 86.78 1.538 101.80%
q9 118.38 121.57 3.190 102.69%
q10 45.07 44.17 -0.906 97.99%
q11 20.44 21.42 0.977 104.78%
q12 26.35 25.32 -1.029 96.09%
q13 38.73 39.84 1.106 102.86%
q14 19.73 19.23 -0.501 97.46%
q15 32.28 33.05 0.771 102.39%
q16 14.37 14.54 0.170 101.18%
q17 103.29 102.23 -1.057 98.98%
q18 149.08 142.93 -6.143 95.88%
q19 13.81 15.44 1.630 111.80%
q20 28.16 27.99 -0.170 99.40%
q21 262.81 261.41 -1.401 99.47%
q22 12.39 12.41 0.021 100.17%
total 1268.39 1260.15 -8.235 99.35%


Run Gluten Clickhouse CI

@zhztheplayer zhztheplayer force-pushed the wip-coalesce-default branch from bac26ca to 46cde2c Compare June 17, 2024 06:39

Run Gluten Clickhouse CI

@FelixYBW
Contributor

@zhztheplayer @marin-ma Similarly, if the batch size is too large and each column exceeds the L2 cache size, performance should be very bad as well. Can you submit a similar PR to fix this? Split the large batches into smaller ones during Split.

@marin-ma
Contributor

@zhztheplayer @marin-ma Similarly, if the batch size is too large and each column exceeds the L2 cache size, performance should be very bad as well. Can you submit a similar PR to fix this? Split the large batches into smaller ones during Split.

@FelixYBW Do we need a new operator for it? If it's only for split, we already have that in #5536.

@zhztheplayer
Member Author

Perhaps we can tweak VeloxAppendBatches, and rename it, so that it can do both appending and slicing given a target batch size range. That way the two operations would be applied more consistently.
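
As a rough sketch of that idea, the function below both appends small batches and slices large ones so that output sizes fall into a target range where possible. It works on row counts only. The name `resize_batches` and its parameters are hypothetical, and nothing here reflects how VeloxAppendBatches is actually written.

```python
def resize_batches(batch_rows, min_rows, max_rows):
    """Return output batch sizes that fall within [min_rows, max_rows] where possible.

    Small inputs are appended to a buffer; once the buffer grows past
    max_rows it is sliced into max_rows-sized pieces.
    """
    if min_rows > max_rows:
        raise ValueError("min_rows must not exceed max_rows")
    out, buffered = [], 0
    for rows in batch_rows:
        buffered += rows                 # append the incoming batch
        while buffered > max_rows:       # slice off full-size pieces
            out.append(max_rows)
            buffered -= max_rows
        if buffered >= min_rows:         # remainder is already in range
            out.append(buffered)
            buffered = 0
    if buffered:
        out.append(buffered)             # the final remainder may be undersized
    return out


print(resize_batches([1000, 1000, 10000, 200], min_rows=3276, max_rows=4096))
```

Only the last output batch can fall below the lower bound in this model, since there is no further input to append to it.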

@zzcclp
Contributor

zzcclp commented Jun 18, 2024

Run Gluten Clickhouse CI


Run Gluten Clickhouse CI

Contributor

@marin-ma marin-ma left a comment

LGTM. Thanks!

@zhztheplayer zhztheplayer merged commit ffdc64a into apache:main Jun 18, 2024
38 checks passed
@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_6056_time.csv log/native_master_06_17_2024_5b87efa56_time.csv difference percentage
q1 34.11 34.50 0.390 101.14%
q2 23.28 23.69 0.419 101.80%
q3 40.46 39.94 -0.515 98.73%
q4 30.62 32.62 2.004 106.54%
q5 67.97 67.62 -0.346 99.49%
q6 8.05 6.63 -1.416 82.41%
q7 81.43 81.08 -0.347 99.57%
q8 83.95 85.44 1.489 101.77%
q9 121.42 120.83 -0.588 99.52%
q10 48.86 46.34 -2.524 94.83%
q11 19.99 19.96 -0.029 99.85%
q12 29.07 25.57 -3.503 87.95%
q13 38.26 38.36 0.099 100.26%
q14 22.05 19.32 -2.726 87.64%
q15 30.34 32.79 2.446 108.06%
q16 14.20 13.93 -0.269 98.11%
q17 102.44 103.20 0.758 100.74%
q18 147.36 147.38 0.023 100.02%
q19 14.75 14.12 -0.632 95.71%
q20 29.34 28.78 -0.567 98.07%
q21 261.33 258.28 -3.044 98.84%
q22 12.22 12.23 0.010 100.09%
total 1261.49 1252.62 -8.870 99.30%
