[VL] Disable columnar table cache by default #3488

gaoyangxiaozhu
Contributor

Disable the columnar table cache by default.

Related issue: 3456

How was this patch tested?

The unit tests pass.

@github-actions

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@github-actions

Run Gluten Clickhouse CI

@PHILO-HE
Contributor

@ulysses-you, could you please review this PR?

@zhztheplayer zhztheplayer changed the title disable columnar table cache in default [VL] Disable columnar table cache by default Oct 23, 2023

ulysses-you previously approved these changes Oct 24, 2023
Contributor

LGTM if the tests pass.

@gaoyangxiaozhu
Contributor Author

gaoyangxiaozhu commented Oct 24, 2023

Not sure why, but the "GLUTEN - InMemoryRelation statistics" test fails in CI:

[image]

The test passed in my local verification, though.

@ulysses-you
Contributor

@gaoyangxiaozhu it seems you only changed the UT for Spark 3.3. We should also change the UT for Spark 3.2.

@gaoyangxiaozhu
Contributor Author

> @gaoyangxiaozhu it seems you only change the ut of Spark-3.3. We should also change the ut of Spark-3.2

Hey @ulysses-you, I think the "GLUTEN - InMemoryRelation statistics" test was only added for Spark 3.3, right?

@ulysses-you
Contributor

I mean we lose the test coverage for Spark 3.2, since this PR does not change the columnar cache config for Spark 3.2.

Yes, the failed test is for Spark 3.3, and it also passed locally for me. I'm not sure why the GitHub Action failed. Can you rebase the PR and push it again? Thank you.

@gaoyangxiaozhu gaoyangxiaozhu force-pushed the gayangya/disable_columnar_table_cache_default branch from 092913c to a52f7a3 Compare October 24, 2023 09:37

@@ -27,6 +29,7 @@ class GlutenCachedTableSuite

Contributor

One idea is to add the config to the system properties:
sys.props.put(GlutenConfig.COLUMNAR_TABLE_CACHE_ENABLED.key, "true")

Contributor Author

Can anyone from Gluten help check why the test case fails in CI even though we already enable the conf?

Contributor

The reason is that Spark keeps a global cache for the cache serializer. So within one JVM, if any table cache test suite that uses the default cache serializer runs before this test, the default cache serializer has already been cached. Therefore, we cannot switch the cache serializer to the columnar one for this specific test.

We can work around it by setting this config with sys.props.put(GlutenConfig.COLUMNAR_TABLE_CACHE_ENABLED.key, "true").
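
To make the "first caller wins" behaviour described above concrete, here is a minimal, self-contained sketch. It does not use Spark or Gluten: `SerializerCacheSketch`, `serializerFor`, `reset`, and the key `sketch.columnarTableCache.enabled` are all hypothetical names invented for illustration, and the real lookup logic in Spark and Gluten may differ. It only assumes that the serializer choice is resolved once per JVM and then reused.

```scala
// Hypothetical stand-in for a JVM-wide, lazily resolved cache serializer.
// None of these names come from Spark or Gluten.
object SerializerCacheSketch {
  // Illustrative key only; not the real Gluten config key.
  val Key = "sketch.columnarTableCache.enabled"

  // Resolved at most once per JVM (until reset() is called).
  private var cached: Option[String] = None

  // The first caller decides which serializer every later caller gets.
  def serializerFor(sessionConf: Map[String, String]): String = synchronized {
    if (cached.isEmpty) {
      // Session conf wins; otherwise fall back to the JVM system property.
      val enabled =
        sessionConf.get(Key).orElse(sys.props.get(Key)).contains("true")
      cached = Some(if (enabled) "columnar" else "default")
    }
    cached.get
  }

  // Test-only helper so the sketch can replay both scenarios.
  def reset(): Unit = synchronized { cached = None }

  def main(args: Array[String]): Unit = {
    sys.props.remove(Key)

    // An earlier suite runs with defaults, so "default" is cached.
    println(serializerFor(Map.empty))            // default
    // A later suite enables the flag in its own conf, but it is too late.
    println(serializerFor(Map(Key -> "true")))   // default

    // With the property set JVM-wide before the first resolution,
    // every suite sees the columnar serializer.
    reset()
    sys.props.put(Key, "true")
    println(serializerFor(Map.empty))            // columnar

    sys.props.remove(Key)
  }
}
```

In this sketch the per-suite setting is ignored once a value has been cached, which mirrors the CI failure discussed here; the system property helps only because it is visible to whichever caller resolves the serializer first.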

Contributor Author

I have some issues running the full UT suite locally to verify.
@ulysses-you, do you mean a change like the one below?
[image]

Contributor

yes

@@ -24,9 +26,11 @@ class GlutenCachedTableSuite
extends CachedTableSuite
with GlutenSQLTestsTrait
with AdaptiveSparkPlanHelper {

// Temporarily enable the columnar table cache globally, since it is now disabled by default.
sys.props.put(GlutenConfig.COLUMNAR_TABLE_CACHE_ENABLED.key, "true")

Contributor

We should also do this for Spark 3.4, which was merged recently.
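
Because the same system property ends up being set in several version-specific suites, one option is a small helper that sets it for the duration of a block and then restores the previous value. This is only a sketch: `ScopedSysProp` and `withSystemProperty` are hypothetical names, not part of Gluten or Spark, and the PR itself sets the property directly in the suite body. Note that restoring the property would not undo a serializer choice that has already been cached in the JVM.

```scala
// Hypothetical helper; not part of Gluten or Spark.
object ScopedSysProp {
  // Sets a JVM system property while `body` runs, then restores the old state.
  def withSystemProperty[T](key: String, value: String)(body: => T): T = {
    // put returns the previous value, if any.
    val previous: Option[String] = sys.props.put(key, value)
    try {
      body
    } finally {
      previous match {
        case Some(old) => sys.props.put(key, old) // restore the old value
        case None      => sys.props.remove(key)   // property was unset before
      }
    }
  }
}
```

A caller could wrap test setup in `ScopedSysProp.withSystemProperty(someKey, "true") { ... }`, where `someKey` stands for whichever config key the suite needs.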

@ulysses-you ulysses-you merged commit 1788834 into apache:main Nov 3, 2023
16 checks passed
@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

| query | log/native_3488_time.csv | log/native_master_11_02_2023_78104be3e_time.csv | difference | percentage |
|---|---|---|---|---|
| q1 | 34.58 | 34.98 | 0.392 | 101.13% |
| q2 | 25.28 | 24.86 | -0.421 | 98.34% |
| q3 | 39.85 | 38.38 | -1.468 | 96.32% |
| q4 | 38.41 | 37.54 | -0.872 | 97.73% |
| q5 | 72.01 | 71.16 | -0.851 | 98.82% |
| q6 | 8.91 | 7.29 | -1.617 | 81.84% |
| q7 | 85.61 | 88.90 | 3.295 | 103.85% |
| q8 | 87.37 | 87.43 | 0.056 | 100.06% |
| q9 | 121.21 | 120.87 | -0.338 | 99.72% |
| q10 | 52.10 | 51.83 | -0.274 | 99.47% |
| q11 | 20.03 | 19.61 | -0.420 | 97.90% |
| q12 | 27.84 | 26.28 | -1.555 | 94.41% |
| q13 | 49.73 | 48.19 | -1.536 | 96.91% |
| q14 | 17.03 | 18.42 | 1.386 | 108.14% |
| q15 | 34.30 | 32.90 | -1.405 | 95.90% |
| q16 | 16.03 | 15.99 | -0.042 | 99.74% |
| q17 | 100.93 | 101.55 | 0.621 | 100.62% |
| q18 | 148.32 | 147.25 | -1.074 | 99.28% |
| q19 | 16.84 | 17.00 | 0.161 | 100.95% |
| q20 | 32.74 | 31.82 | -0.914 | 97.21% |
| q21 | 226.43 | 226.44 | 0.012 | 100.01% |
| q22 | 13.17 | 13.39 | 0.220 | 101.67% |
| total | 1268.72 | 1262.07 | -6.645 | 99.48% |
