Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-3962][VL] Respect parsed attribute name and remove column name validate logic #3963

Merged
merged 1 commit into from
Dec 13, 2023

Conversation

Yohahaha
Copy link
Contributor

@Yohahaha Yohahaha commented Dec 7, 2023

What changes were proposed in this pull request?

Respect parsed attribute name from Spark, sum(a) is a valid attr name in Spark.

Remove column name validate logic safely since SparkTokenizer already added.

(Fixes: #3962)

How was this patch tested?

added UT.

Copy link

github-actions bot commented Dec 7, 2023

#3962

Copy link

github-actions bot commented Dec 7, 2023

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

Yohahaha commented Dec 7, 2023

Spark33 allow read parquet column with special characters but Spark32 does not.... details see apache/spark#35229

Copy link

github-actions bot commented Dec 7, 2023

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Dec 7, 2023

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

Yohahaha commented Dec 8, 2023

@PHILO-HE @rui-mo @ulysses-you please help review, thank you!

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work! Just few comments.

Copy link

github-actions bot commented Dec 8, 2023

Run Gluten Clickhouse CI

1 similar comment
Copy link

github-actions bot commented Dec 8, 2023

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

cc @waitinfuture please help rerun failed ci job, thank you!

Copy link

Run Gluten Clickhouse CI

fix

fix format

fix ut

fix
Copy link

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Contributor Author

@PHILO-HE @ulysses-you @rui-mo any more comments?

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks!

Copy link
Contributor

@waitinfuture waitinfuture left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@waitinfuture waitinfuture merged commit 8edcb72 into apache:main Dec 13, 2023
17 checks passed
@Yohahaha Yohahaha deleted the attr-name branch December 13, 2023 05:45
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3963_time.csv log/native_master_12_12_2023_8c6f1643e_time.csv difference percentage
q1 33.89 33.75 -0.140 99.59%
q2 27.48 24.88 -2.595 90.56%
q3 37.58 37.71 0.124 100.33%
q4 39.51 37.86 -1.656 95.81%
q5 72.23 72.83 0.599 100.83%
q6 7.02 7.05 0.029 100.41%
q7 83.96 86.50 2.538 103.02%
q8 87.97 86.24 -1.726 98.04%
q9 126.84 125.50 -1.339 98.94%
q10 45.71 44.68 -1.026 97.76%
q11 20.94 20.21 -0.726 96.53%
q12 27.39 25.06 -2.326 91.51%
q13 46.81 46.58 -0.234 99.50%
q14 15.61 19.99 4.382 128.08%
q15 29.11 27.55 -1.566 94.62%
q16 15.49 15.86 0.367 102.37%
q17 103.03 103.27 0.238 100.23%
q18 150.83 151.33 0.502 100.33%
q19 12.83 13.89 1.060 108.26%
q20 26.96 28.68 1.729 106.41%
q21 226.18 229.12 2.941 101.30%
q22 14.05 13.95 -0.093 99.34%
total 1251.41 1252.49 1.083 100.09%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CORE] Do not modify attribute name in transform scan operator
7 participants