Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VL] Fix: add validate check for Generate #3682

Merged
merged 11 commits into from
Nov 14, 2023

Conversation

jackylee-ch
Copy link
Contributor

What changes were proposed in this pull request?

  1. Fix the validate check for Generate operator while parsing substrait to velox;
  2. Fix the missing validate check for ifthen, such as select explode(case a is null then array('test') else array(a) end) from test_string;
  3. Fix the validate failed while passing a Alias to Generate, such as select all, a from (select split(a, '2') as al, a from test_string) lateral view explode(al) as all;;
  4. [mirror] add from_json to black_list since it is not support now;

How was this patch tested?

unit and manul tests

Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@JkSelf
Copy link
Contributor

JkSelf commented Nov 13, 2023

@jackylee-ch Can we add some unit test to convert this change? Thanks.

@jackylee-ch
Copy link
Contributor Author

@jackylee-ch Can we add some unit test to convert this change? Thanks.

Yeap, I have not finish it, will add it later.

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@jackylee-ch
Copy link
Contributor Author

cc @zhouyuan @JkSelf @PHILO-HE This pr is ready for review now.

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just some trivial comments. Thanks!

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link
Contributor

@JkSelf JkSelf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Do we need adding another test case that involves an Alias?

@jackylee-ch
Copy link
Contributor Author

LGTM. Do we need adding another test case that involves an Alias?

The test for alias is already added here.

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@jackylee-ch
Copy link
Contributor Author

@JkSelf @PHILO-HE any more suggestions?

Copy link
Contributor

@JkSelf JkSelf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for your fix!

@PHILO-HE PHILO-HE merged commit 0f9f196 into apache:main Nov 14, 2023
16 checks passed
@jackylee-ch
Copy link
Contributor Author

Thanks for your review. @JkSelf @PHILO-HE

@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3682_time.csv log/native_master_11_13_2023_7cf715b8a_time.csv difference percentage
q1 34.26 34.59 0.330 100.96%
q2 25.16 24.33 -0.836 96.68%
q3 37.34 37.92 0.584 101.56%
q4 35.37 36.03 0.655 101.85%
q5 71.13 71.38 0.246 100.35%
q6 6.87 7.19 0.319 104.64%
q7 85.35 84.44 -0.906 98.94%
q8 86.48 86.27 -0.204 99.76%
q9 120.51 122.37 1.856 101.54%
q10 46.25 47.02 0.769 101.66%
q11 19.29 19.75 0.465 102.41%
q12 24.06 26.95 2.882 111.98%
q13 46.31 46.08 -0.237 99.49%
q14 15.68 14.45 -1.229 92.16%
q15 28.87 28.29 -0.571 98.02%
q16 15.71 15.79 0.079 100.50%
q17 101.14 100.90 -0.247 99.76%
q18 146.34 148.71 2.375 101.62%
q19 12.99 12.92 -0.076 99.41%
q20 28.88 27.47 -1.404 95.14%
q21 224.10 220.71 -3.393 98.49%
q22 12.71 12.86 0.149 101.17%
total 1224.82 1226.42 1.605 100.13%

ulysses-you pushed a commit to ulysses-you/gluten that referenced this pull request Dec 7, 2023
@supermem613
Copy link
Contributor

@PHILO-HE, quick question: why was from_json marked as black-list with this PR? From a cursory glance it seems to run without issue. Are there any issues associated that made it be turned off?

@PHILO-HE
Copy link
Contributor

@PHILO-HE, quick question: why was from_json marked as black-list with this PR? From a cursory glance it seems to run without issue. Are there any issues associated that made it be turned off?

@supermem613, it looks from_json is still not supported in velox. Did you confirm the operator is really offloaded to velox (e.g. by checking DAG in Spark UI)? Gluten has a fallback mechanism, for unsupported function, the relevant operator can fall back to vanilla Spark to still make the query run successfully.

@supermem613
Copy link
Contributor

@PHILO-HE, quick question: why was from_json marked as black-list with this PR? From a cursory glance it seems to run without issue. Are there any issues associated that made it be turned off?

@supermem613, it looks from_json is still not supported in velox. Did you confirm the operator is really offloaded to velox (e.g. by checking DAG in Spark UI)? Gluten has a fallback mechanism, for unsupported function, the relevant operator can fall back to vanilla Spark to still make the query run successfully.

Ah, you are right. I made the mistake of running it with a purely in-memory expression and it didn't fallback as it didn't really activate Gluten. Sorry.

@jackylee-ch jackylee-ch deleted the add_validate_check_for_generate branch June 19, 2024 07:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants