[UT] Add a tool to validate any unary expression with all its accepted types #6392

Merged
merged 5 commits into from
Jul 12, 2024

Contributor

@PHILO-HE PHILO-HE commented Jul 10, 2024

What changes were proposed in this pull request?

A few functions are only partially supported; for example, some of their input types are not supported. This PR adds a tool that helps us identify those gaps.

This PR gets the expression list and the registered expression builders from Spark. During validation, the expression's input data does not matter, but its input type does, so we can build an expression instance from a literal NULL child of a specific type. We then construct a Gluten plan with a dummy child node and run it through GlutenPlan's validation code path.

This is the first patch for project function validation. More PRs will follow for other functions, such as binary expressions and aggregate functions.
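As a rough illustration of the approach described above, the Spark-side half can be sketched as follows. It builds a typed NULL literal as the child and asks Spark whether the resulting cast type-checks. The names `NullChildCastSketch`, `nullChild`, and `sparkAcceptsCast` are hypothetical and not taken from this PR. The Gluten plan construction and the `doValidate` call are deliberately left out because they depend on Gluten internals. The sketch assumes Spark Catalyst is on the classpath.

```scala
import org.apache.spark.sql.catalyst.expressions.{Cast, Expression, Literal}
import org.apache.spark.sql.types._

// Hypothetical sketch, not the code from this PR.
object NullChildCastSketch {
  // A typed NULL literal: the value is irrelevant for validation, only the type matters.
  def nullChild(dataType: DataType): Expression = Literal.create(null, dataType)

  // Spark-side type check only. The Gluten-side validation step is not shown here.
  def sparkAcceptsCast(from: DataType, to: DataType): Boolean =
    Cast(nullChild(from), to).checkInputDataTypes().isSuccess

  def main(args: Array[String]): Unit = {
    val types = Seq[DataType](ByteType, IntegerType, StringType, DateType, TimestampType)
    // Enumerate every ordered pair of distinct types.
    for (from <- types; to <- types if from != to) {
      println(s"$from -> $to: Spark type check passes = ${sparkAcceptsCast(from, to)}")
    }
  }
}
```

Only casts that pass this Spark-side check are worth sending on to backend validation, which is where the gaps reported by the tool come from.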

See partial output:

## cast validation passes: cast from ByteType to ShortType
## cast validation passes: cast from ByteType to IntegerType
## cast validation passes: cast from ByteType to LongType
## cast validation passes: cast from ByteType to FloatType
## cast validation passes: cast from ByteType to DoubleType
## cast validation passes: cast from ByteType to DecimalType(5,1)
## cast validation passes: cast from ByteType to StringType
## cast validation passes: cast from ByteType to BinaryType
!! cast validation fails: cast from ByteType to TimestampType
## cast validation passes: cast from ShortType to ByteType
## cast validation passes: cast from ShortType to IntegerType
## cast validation passes: cast from ShortType to LongType
## cast validation passes: cast from ShortType to FloatType
## cast validation passes: cast from ShortType to DoubleType
## cast validation passes: cast from ShortType to DecimalType(5,1)
## cast validation passes: cast from ShortType to StringType
## cast validation passes: cast from ShortType to BinaryType
!! cast validation fails: cast from ShortType to TimestampType
## cast validation passes: cast from IntegerType to ByteType
## cast validation passes: cast from IntegerType to ShortType
## cast validation passes: cast from IntegerType to LongType
## cast validation passes: cast from IntegerType to FloatType
## cast validation passes: cast from IntegerType to DoubleType
## cast validation passes: cast from IntegerType to DecimalType(5,1)
## cast validation passes: cast from IntegerType to StringType
## cast validation passes: cast from IntegerType to BinaryType
!! cast validation fails: cast from IntegerType to TimestampType
## cast validation passes: cast from LongType to ByteType
## cast validation passes: cast from LongType to ShortType
## cast validation passes: cast from LongType to IntegerType
## cast validation passes: cast from LongType to FloatType
## cast validation passes: cast from LongType to DoubleType
## cast validation passes: cast from LongType to DecimalType(5,1)
## cast validation passes: cast from LongType to StringType
## cast validation passes: cast from LongType to BinaryType
!! cast validation fails: cast from LongType to TimestampType
## cast validation passes: cast from FloatType to ByteType
## cast validation passes: cast from FloatType to ShortType
## cast validation passes: cast from FloatType to IntegerType
## cast validation passes: cast from FloatType to LongType
## cast validation passes: cast from FloatType to DoubleType
## cast validation passes: cast from FloatType to DecimalType(5,1)
## cast validation passes: cast from FloatType to StringType
!! cast validation fails: cast from FloatType to TimestampType
## cast validation passes: cast from DoubleType to ByteType
## cast validation passes: cast from DoubleType to ShortType
## cast validation passes: cast from DoubleType to IntegerType
## cast validation passes: cast from DoubleType to LongType
## cast validation passes: cast from DoubleType to FloatType
## cast validation passes: cast from DoubleType to DecimalType(5,1)
## cast validation passes: cast from DoubleType to StringType
!! cast validation fails: cast from DoubleType to TimestampType
## cast validation passes: cast from DecimalType(5,1) to ByteType
## cast validation passes: cast from DecimalType(5,1) to ShortType
## cast validation passes: cast from DecimalType(5,1) to IntegerType
## cast validation passes: cast from DecimalType(5,1) to LongType
## cast validation passes: cast from DecimalType(5,1) to FloatType
## cast validation passes: cast from DecimalType(5,1) to DoubleType
## cast validation passes: cast from DecimalType(5,1) to StringType
!! cast validation fails: cast from DecimalType(5,1) to TimestampType
## cast validation passes: cast from StringType to ByteType
## cast validation passes: cast from StringType to ShortType
## cast validation passes: cast from StringType to IntegerType
## cast validation passes: cast from StringType to LongType
## cast validation passes: cast from StringType to FloatType
## cast validation passes: cast from StringType to DoubleType
## cast validation passes: cast from StringType to DecimalType(5,1)
## cast validation passes: cast from StringType to BinaryType
## cast validation passes: cast from StringType to DateType
!! cast validation fails: cast from StringType to TimestampType
## cast validation passes: cast from BinaryType to ByteType
## cast validation passes: cast from BinaryType to ShortType
## cast validation passes: cast from BinaryType to IntegerType
## cast validation passes: cast from BinaryType to LongType
## cast validation passes: cast from BinaryType to FloatType
## cast validation passes: cast from BinaryType to DoubleType
## cast validation passes: cast from BinaryType to DecimalType(5,1)
## cast validation passes: cast from BinaryType to StringType
## cast validation passes: cast from BinaryType to DateType
!! cast validation fails: cast from BinaryType to TimestampType
!! cast validation fails: cast from DateType to ByteType
!! cast validation fails: cast from DateType to ShortType
!! cast validation fails: cast from DateType to IntegerType
!! cast validation fails: cast from DateType to LongType
!! cast validation fails: cast from DateType to FloatType
!! cast validation fails: cast from DateType to DoubleType
!! cast validation fails: cast from DateType to DecimalType(5,1)
## cast validation passes: cast from DateType to StringType
## cast validation passes: cast from DateType to TimestampType
!! cast validation fails: cast from TimestampType to ByteType
!! cast validation fails: cast from TimestampType to ShortType
!! cast validation fails: cast from TimestampType to IntegerType
!! cast validation fails: cast from TimestampType to LongType
!! cast validation fails: cast from TimestampType to FloatType
!! cast validation fails: cast from TimestampType to DoubleType
!! cast validation fails: cast from TimestampType to DecimalType(5,1)
!! cast validation fails: cast from TimestampType to StringType
!! cast validation fails: cast from TimestampType to DateType
!! cast validation fails: cast from ArrayType(IntegerType,true) to StringType
!! cast validation fails: cast from MapType(StringType,IntegerType,true) to StringType
!! cast validation fails: cast from StructType(StructField(a,StringType,true), StructField(b,IntegerType,true)) to StringType
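
To turn output like the above into a list of gaps per source type, a small parser is enough. The following is a minimal sketch that assumes the exact line format shown in the partial output; `CastGapSummary`, `Result`, `parse`, and `failuresBySource` are hypothetical names and not part of this PR.

```scala
// Hypothetical helper for summarizing the tool's log output; not part of this PR.
object CastGapSummary {
  // Matches "## cast validation passes: cast from A to B" and the "!!" failure variant.
  private val Line =
    """^(##|!!) cast validation (?:passes|fails): cast from (.+) to (\S+)$""".r

  final case class Result(from: String, to: String, passed: Boolean)

  // Returns None for lines that are not validation results.
  def parse(line: String): Option[Result] = line.trim match {
    case Line(marker, from, to) => Some(Result(from, to, marker == "##"))
    case _ => None
  }

  // Groups the failing target types by source type, keeping input order within each group.
  def failuresBySource(lines: Seq[String]): Map[String, Seq[String]] =
    lines.flatMap(parse).filterNot(_.passed).groupBy(_.from).map {
      case (from, results) => from -> results.map(_.to)
    }

  def main(args: Array[String]): Unit = {
    val sample = Seq(
      "## cast validation passes: cast from ByteType to ShortType",
      "!! cast validation fails: cast from ByteType to TimestampType",
      "!! cast validation fails: cast from DateType to ByteType"
    )
    failuresBySource(sample).foreach { case (from, targets) =>
      println(s"$from: unsupported targets = ${targets.mkString(", ")}")
    }
  }
}
```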

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and pull request title to the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}


Run Gluten Clickhouse CI

@PHILO-HE PHILO-HE force-pushed the func-type-support-state branch from 247b6c6 to f1f4ab6 Compare July 10, 2024 07:42
Run Gluten Clickhouse CI

@PHILO-HE
Contributor Author

@rui-mo, do you have any comments?

@PHILO-HE PHILO-HE changed the title [UT] Add a test to validate any unary expression with its accepted types [UT] Add a tool to validate any unary expression with all its accepted types Jul 11, 2024
@zhztheplayer
Member

Does the change apply to CH backend as well?

@PHILO-HE
Contributor Author

Does the change apply to CH backend as well?

The CH backend will also run this. However, because it doesn't implement the doValidate API for native validation, the tool doesn't produce correct results there. I will disable it for the CH backend. Thanks!

Comment on lines 112 to 113
// scalastyle:off
logInfo("!! cast validation fails: cast from " + from + " to " + to)
Member

Should every // scalastyle:off require a corresponding // scalastyle:on? I am not sure, but I haven't seen standalone usages of // scalastyle:off before.

Contributor Author

This was left in by mistake. I will remove it. Thanks!
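
For reference, the paired form the reviewer has in mind usually looks like the sketch below, where the check is switched off only around the offending statement and switched back on right after. This assumes the project's scalastyle configuration defines a rule with the id `println`, as Spark's configuration does; `ScalastyleToggleSketch` and `render` are hypothetical names.

```scala
// Hypothetical sketch of a paired scalastyle toggle; not code from this PR.
object ScalastyleToggleSketch {
  // Builds the same message shape as the log line under review.
  def render(from: String, to: String): String =
    s"!! cast validation fails: cast from $from to $to"

  def main(args: Array[String]): Unit = {
    // scalastyle:off println
    println(render("DateType", "ByteType"))
    // scalastyle:on println
  }
}
```

Since the line under review calls logInfo rather than println, no toggle is needed there at all, which matches the decision to remove the comment.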

Run Gluten Clickhouse CI

@PHILO-HE
Contributor Author

During expression conversion, it can go into CH's expression validation path in the Scala code, so maybe we can keep it running for CH. cc @zzcclp

@PHILO-HE
Contributor Author

@zhztheplayer, do you have any other comments? Thanks!

Contributor

@zzcclp zzcclp left a comment

LGTM

@zhztheplayer zhztheplayer merged commit 86e5f9d into apache:main Jul 12, 2024
44 checks passed
val castExpr = Cast(generateChildExpression(from), to)
if (castExpr.checkInputDataTypes().isSuccess) {
  val glutenProject = generateGlutenProjectPlan(castExpr)
  if (glutenProject.doValidate().isValid) {
Contributor

It seems the API of ValidationResult has already changed?

Contributor Author

@zzcclp, yes, I will fix it soon. Thanks!

Contributor Author

FYI, I just noticed hongze's fix: #6418

yma11 added a commit to yma11/gluten that referenced this pull request Jul 15, 2024
3 participants