Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VL] Add plan validation util for debugging validate process #3972

Merged
merged 3 commits into from
Dec 13, 2023

Conversation

Yohahaha
Copy link
Contributor

@Yohahaha Yohahaha commented Dec 7, 2023

What changes were proposed in this pull request?

Add plan_validation_util executable for debugging validate process in native side, and refine docs.

Copy link

github-actions bot commented Dec 7, 2023

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@Yohahaha Yohahaha changed the title [WIP] Add plan validation test [VL] Add plan validation util for debugging validate process Dec 11, 2023
@Yohahaha
Copy link
Contributor Author

@marin-ma @PHILO-HE please help to review this patch, thank you!

auto jsonPlan = gluten::substraitFromPbToJson("Plan", planData, planSize);
LOG(INFO) << std::string(50, '#') << " received substrait::Plan: for validation";
} catch (const std::exception& e) {
LOG(WARNING) << "Error converting Substrait plan to JSON: " << e.what();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably "Error converting Substrait plan for validation to JSON:"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


```
Run the query by spark-shell. Get the Substrait plan from executor's stdout.
Run the query with config `spark.gluten.sql.debug=true` and get the Substrait json plan from log, then save it as a JSON file, suppose the name is "plan.json".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need config spark.gluten.sql.columnar.backend.velox.glogSeverityLevel=0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, btw could we open glog when debug mode enabled?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, btw could we open glog when debug mode enabled?

Sounds good!

@marin-ma
Copy link
Contributor

marin-ma commented Dec 12, 2023

Thanks for adding this feature! Could you provide some output after this change?
I'm not sure that all validation plans can be successfully logged. Could you check if we need to apply this fix to the validation plan on java side?
https://github.com/oap-project/gluten/pull/3969/files#diff-a7356fa0629d859807e18644ff6556e6babc825fc25c95a967ce28abe6d100d9R577-R578

@Yohahaha
Copy link
Contributor Author

Thanks for adding this feature! Could you provide some output after this change? I'm not sure that all validation plans can be successfully logged. Could you check if we need to apply this fix to the validation plan on java side? https://github.com/oap-project/gluten/pull/3969/files#diff-a7356fa0629d859807e18644ff6556e6babc825fc25c95a967ce28abe6d100d9R577-R578

json plan can be printed if has no agg. your patch is great, I will revert doc modification.

@Yohahaha
Copy link
Contributor Author

         > select count(*) from test;
I1212 10:35:08.637563 3619404 VeloxJniWrapper.cc:98] ################################################## received substrait::Plan: for validation
I1212 10:35:08.637595 3619404 VeloxJniWrapper.cc:99] {"relations":[{"root":{"input":{"read":{"common":{"direct":{}},"baseSchema":{"struct":{}}}}}}]}
[libprotobuf WARNING google/protobuf/util/internal/json_objectwriter.cc:52] JsonObjectWriter was not fully closed.
W1212 10:35:08.784191 3619404 VeloxJniWrapper.cc:101] Error converting Substrait plan for validation to JSON: BinaryToJsonStream returned INTERNAL:Invalid type URL, type URLs must be of the form '/<typename>', got: type.googleapis.com/google.protobuf.StringValue



         > select * from test;
I1212 10:35:41.984709 3619404 VeloxJniWrapper.cc:98] ################################################## received substrait::Plan: for validation
I1212 10:35:41.984735 3619404 VeloxJniWrapper.cc:99] {"relations":[{"root":{"input":{"read":{"common":{"direct":{}},"baseSchema":{"names":["id","b"],"struct":{"types":[{"i32":{"nullability":"NULLABILITY_NULLABLE"}},{"bool":{"nullability":"NULLABILITY_NULLABLE"}}]},"columnTypes":["NORMAL_COL","NORMAL_COL"]}}}}}]}

@Yohahaha
Copy link
Contributor Author

@marin-ma could we merge or any more comments?

@PHILO-HE
Copy link
Contributor

Hi @Yohahaha, could you document how to use it? Maybe, in HowTo.md. Thanks!

@Yohahaha
Copy link
Contributor Author

Hi @Yohahaha, could you document how to use it? Maybe, in HowTo.md. Thanks!

Added.

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work! Will merge the PR soon (Rong is taking leave, if any more comment after that, let's fix it in follow-up PR).

@PHILO-HE PHILO-HE merged commit 0f5d9e1 into apache:main Dec 13, 2023
14 of 15 checks passed
@Yohahaha Yohahaha deleted the planvalidator-test branch December 13, 2023 08:03
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3972_time.csv log/native_master_12_12_2023_8c6f1643e_time.csv difference percentage
q1 33.04 33.75 0.702 102.13%
q2 24.85 24.88 0.030 100.12%
q3 37.45 37.71 0.256 100.68%
q4 40.24 37.86 -2.383 94.08%
q5 72.22 72.83 0.609 100.84%
q6 7.06 7.05 -0.006 99.91%
q7 85.77 86.50 0.726 100.85%
q8 86.62 86.24 -0.379 99.56%
q9 128.02 125.50 -2.513 98.04%
q10 43.70 44.68 0.978 102.24%
q11 20.65 20.21 -0.433 97.90%
q12 25.10 25.06 -0.043 99.83%
q13 46.43 46.58 0.150 100.32%
q14 15.61 19.99 4.373 128.01%
q15 27.90 27.55 -0.356 98.73%
q16 15.76 15.86 0.098 100.62%
q17 104.90 103.27 -1.628 98.45%
q18 150.40 151.33 0.936 100.62%
q19 12.81 13.89 1.080 108.43%
q20 28.71 28.68 -0.026 99.91%
q21 228.53 229.12 0.593 100.26%
q22 13.84 13.95 0.111 100.80%
total 1249.62 1252.49 2.878 100.23%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants