data quality report #68

Open
pmayd opened this issue Sep 26, 2023 · 2 comments

Comments

@pmayd
Collaborator

pmayd commented Sep 26, 2023

With what we know about Looker Studio, it should be feasible to create a data quality dashboard/report that lists important statistics for all columns of interest, which at the moment means the patient data.

The output could be a dashboard with a table and/or graph for each column in the patient data, listing statistics such as the number of non-empty entries, the number of missing entries, min, max, and avg for continuous variables, and the number of distinct values and the most common values for categorical data.
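To make the per-column statistics concrete, here is a minimal sketch in plain Python of what such a report could compute before it is shown in a dashboard. The function names `profile_column` and `profile_table`, the column names, and the sample records are all made up for illustration and are not part of the project; the rule that `None` and empty strings count as missing is also an assumption.

```python
from collections import Counter
from statistics import mean


def profile_column(values, top_n=3):
    """Summarize one column: counts, plus numeric or categorical statistics."""
    # Assumption: None and empty/whitespace-only strings count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    report = {"non_empty": len(present), "missing": len(values) - len(present)}
    if not present:
        return report
    # Treat the column as continuous if every present value is a number.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        report.update(min=min(present), max=max(present), avg=mean(present))
    else:
        counts = Counter(present)
        report.update(distinct=len(counts), most_common=counts.most_common(top_n))
    return report


def profile_table(rows, top_n=3):
    """Profile every column of a table given as a list of dicts."""
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    return {name: profile_column([row.get(name) for row in rows], top_n) for name in columns}


# Hypothetical patient records, for illustration only.
patients = [
    {"age": 12, "province": "A", "hba1c": 7.5},
    {"age": 15, "province": "B", "hba1c": None},
    {"age": None, "province": "A", "hba1c": 9.0},
    {"age": 9, "province": "", "hba1c": 8.1},
]

report = profile_table(patients)
for column, stats in report.items():
    print(column, stats)
```

The resulting dictionary has one entry per column, so it could be flattened into a table that a reporting tool reads, with one row per column and one field per statistic.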

@lboel
Collaborator

lboel commented Oct 25, 2023

A quick win might be an R Shiny app based on this: https://appsilon.com/automated-r-data-quality-reporting/. Not the best option, but at least it's a quick approach that doesn't take much effort.

@pmayd
Collaborator Author

pmayd commented Oct 25, 2023

Absolutely for it. If it is easy to use and applicable, why not? We don't have to reinvent the wheel. There are also tools for GCP, of course, so we should also investigate Google services that analyze the data and, for example, show its lineage!
