Add Column for Column-Level Visibility in Data Quality Framework Result Table #123

Open
sudeep7978 opened this issue Dec 19, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@sudeep7978
Contributor

Add Column for Column-Level Visibility in Data Quality Framework Result Table

Details

1. Purpose of the Enhancement

  • To improve data observability by explicitly linking results to individual columns.
  • To provide better traceability and aid in diagnosing data quality issues at a granular level.
  • To support detailed reporting and actionable insights for data quality assessments.

2. Key Changes to be Made

  • Schema Update:
    • Modify the schema of the result table to include a new column, e.g., affected_column_name.
    • Ensure backward compatibility by verifying that the update does not affect existing workflows.
  • Code Updates:
    • Adjust the data validation logic to capture and populate the affected_column_name field for each rule check.
  • Schema Evolution with AutoMerge:
    • Enable Delta Lake's spark.databricks.delta.schema.autoMerge.enabled configuration to allow schema evolution during write operations.
    • Modify the data quality framework to include the affected_column_name field dynamically if not already present.
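
A rough sketch of how the changes listed above could fit together. This is not the framework's actual code: `run_rule`, `with_affected_column`, `append_results`, and the result fields `rule`, `status`, and `failed_row_count` are hypothetical names chosen for illustration. Only the `affected_column_name` column and the `spark.databricks.delta.schema.autoMerge.enabled` setting come from this issue. The Spark part assumes a session with Delta Lake available and is not exercised here.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

# Column name proposed in this issue.
RESULT_COLUMN = "affected_column_name"


def run_rule(
    rule_name: str,
    column: Optional[str],
    values: Iterable[Any],
    predicate: Callable[[Any], bool],
) -> Dict[str, Any]:
    """Hypothetical rule check: count failing values and record the column checked."""
    failed = sum(1 for value in values if not predicate(value))
    return {
        "rule": rule_name,
        "status": "fail" if failed else "pass",
        "failed_row_count": failed,
        RESULT_COLUMN: column,
    }


def with_affected_column(row: Dict[str, Any]) -> Dict[str, Any]:
    """Add affected_column_name as None when an older result row lacks it."""
    return {**row, RESULT_COLUMN: row.get(RESULT_COLUMN)}


def append_results(spark: Any, rows: List[Dict[str, Any]], table: str) -> None:
    """Append result rows to a Delta table, letting Delta add the new column.

    `spark` is expected to be a SparkSession with Delta Lake configured.
    """
    # Session-level schema evolution, as described in this issue.
    spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

    # Explicit schema so an all-None column still gets a STRING type.
    schema = (
        "rule STRING, status STRING, failed_row_count INT, "
        f"{RESULT_COLUMN} STRING"
    )
    data = [
        (r.get("rule"), r.get("status"), r.get("failed_row_count"), r[RESULT_COLUMN])
        for r in map(with_affected_column, rows)
    ]
    (
        spark.createDataFrame(data, schema=schema)
        .write.format("delta")
        .mode("append")
        .saveAsTable(table)
    )


result = run_rule("age_not_null", "age", [31, None, 45], lambda v: v is not None)
legacy = with_affected_column({"rule": "row_count_check", "status": "pass"})
```

Rules that do not target a single column (for example, a row-count check) would carry `None` in `affected_column_name`, so existing consumers that ignore the new column should see the same rows as before.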

sudeep7978 added the enhancement label Dec 19, 2024
@sudeep7978
Contributor Author

PR associated with this enhancement: https://github.com/Nike-Inc/spark-expectations/pull/125
