ColumnSumCheck should treat an unexpected type as a test failure, not exception #46
Comments
@colindean @samratmitra-0812 I think handling as an exception in |
I can see it either way. Think of the failure mode when DV encounters an unsupported column type:
DV expects the user to inspect failures in the report at the end of a run. Should DV expect the user to know the type of the data in a column, especially when DV's config cannot necessarily reflect that type? That is, YAML only knows JSON numeric types, which are integer and double (I think they're called "fractions" technically in JSON) and exponential.
Also, is it more user-friendly to allow a validation run to complete, or should it die immediately at runtime? For normally long-running processes, immediate exits should be concentrated at the start of the run. If there's a way we can detect supported types in
One of the things I liked about my expressions approach to columnSumCheck was pushing the type handling to the Spark/Hive level, so we didn't need to maintain a list of supported data types that generally ends up being an exhaustive list of subtypes of |
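To illustrate the point about config not reflecting column types, here is a small Python sketch using the standard `json` module as a stand-in for a config parser. The threshold strings are made-up examples, not values from DV's config format; the only claim is that a parsed number carries an integer-or-float distinction and nothing about the column type it will be compared against.

```python
import json

# Hypothetical threshold values as they might be written in a check's config.
raw_values = ["10", "10.5", "1e3"]

# A JSON-style parser only distinguishes integers from floating-point numbers.
parsed = [json.loads(value) for value in raw_values]
parsed_types = [type(value).__name__ for value in parsed]

print(parsed_types)  # ['int', 'float', 'float']
```

Nothing in the parsed values says whether the target column is a short, a long, or a decimal, so the config alone cannot tell the check which column types to expect.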
@colindean This is a good point. This may be too general a statement, but at the config check level we can throw exceptions, which gives the fast failure you describe. At deeper levels (like quick check) we just log the error so that other checks still have the chance to complete.
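A minimal sketch of that split, written in Python for brevity rather than in the project's own language. Every name here (`ColumnSumCheck`, `ConfigError`, `config_check`, `quick_check`, `SUPPORTED_TYPES`) is hypothetical and chosen for illustration; this is not DV's actual API, and the set of supported types is an assumption.

```python
from dataclasses import dataclass, field

# Assumed set of summable column types; the real list is not given in this thread.
SUPPORTED_TYPES = {"int", "long", "double"}


class ConfigError(Exception):
    """Raised at startup when a check is misconfigured (fail fast)."""


@dataclass
class ColumnSumCheck:
    column: str
    min_value: float
    max_value: float
    failed: bool = False
    events: list = field(default_factory=list)

    def config_check(self):
        # Startup-time validation: a bad config aborts the run immediately.
        if self.min_value > self.max_value:
            raise ConfigError(f"{self.column}: min_value exceeds max_value")

    def quick_check(self, column_type, column_sum):
        # Run-time validation: record a failure instead of raising,
        # so the remaining checks still get to run.
        if column_type not in SUPPORTED_TYPES:
            self.failed = True
            self.events.append(f"{self.column}: unsupported type {column_type}")
            return self.failed
        if not (self.min_value <= column_sum <= self.max_value):
            self.failed = True
            self.events.append(f"{self.column}: sum {column_sum} out of range")
        return self.failed


checks = [ColumnSumCheck("price", 0, 100), ColumnSumCheck("name", 0, 100)]
for check in checks:
    check.config_check()  # would raise ConfigError before any data is read

results = [
    checks[0].quick_check("double", 42.0),
    checks[1].quick_check("string", None),
]
print(results)  # [False, True]
```

In this sketch the unsupported `string` column shows up as a failed check with a logged event in the final report, while a misconfigured range would stop the run before any data is processed.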
@colindean @phpisciuneri Datatype verification in |
@samratmitra-0812 pointed out: