Document known data issues #3788

Open
4 of 8 tasks
lauraherring opened this issue May 1, 2024 · 9 comments
Labels
content Tickets that need review/work by Content Designer eng

Comments

@lauraherring
Contributor

lauraherring commented May 1, 2024

The original intent behind this ticket should be rolled into a larger scope of work around refreshing all of our Data/API documentation. Until we're able to do some more planning, let's use this ticket to track known issues or other anomalies so they don't get lost.


As a user, I need a resource page that outlines the known issues in the FAC data and how to work around those limitations so that I can make the most use of the data in my daily work.

This page will live under the data resources section and the URL will be fac.gov/data-resources/data-status

Tasks


This ticket won't be marked as complete until it meets the following acceptance criteria:

Tasks

@lauraherring lauraherring added content Tickets that need review/work by Content Designer eng design Layout, UI, etc labels May 1, 2024
@lauraherring lauraherring added this to FAC May 1, 2024
@github-project-automation github-project-automation bot moved this to Triage in FAC May 1, 2024
@jadudm jadudm removed this from FAC May 22, 2024
@jadudm jadudm moved this to Next in FAC Epic Board May 22, 2024
@jadudm jadudm removed this from FAC Epic Board May 22, 2024
@jadudm jadudm added this to FAC May 22, 2024
@github-project-automation github-project-automation bot moved this to Triage in FAC May 22, 2024
@jadudm jadudm removed the status in FAC May 22, 2024
@Leighdiddy Leighdiddy moved this to Backlog in FAC May 22, 2024
@lauraherring lauraherring moved this from Backlog to In Progress in FAC May 28, 2024
@lauraherring lauraherring self-assigned this May 28, 2024
@lauraherring
Contributor Author

A draft of this page is built.

@danswick
Contributor

danswick commented Jun 25, 2024

Need some clarity re: start/end date off-by-one errors. cc @sambodeme

@lauraherring
Contributor Author

@danswick @Leighdiddy I've updated the page with some additional examples and screenshots of what these issues will look like to users and would love feedback.

@lauraherring lauraherring moved this from In Progress to Blocked in FAC Jul 9, 2024
@gsa-suk
Contributor

gsa-suk commented Oct 17, 2024

Historic audits for 2016 through 2022 were migrated from Census to GSA FAC. Of the 275K+ audits, two could not be migrated because of severe data quality issues. The audit year and dbkey of these audits are:

  1. audit_year = 2019, dbkey = 243753
  2. audit_year = 2022, dbkey = 247680

Associated errors during migration:

  1. audit_year = 2019, dbkey = 243753 - Validation error - 'FAC OVERRIDE' is not valid under any of the given schemas
  2. audit_year = 2022, dbkey = 247680 - Cross validation error - "According to Uniform Guidance section 200.520(a), biennial audits cannot be considered 'low-risk.' Please make the necessary changes to the 'Audit Period Covered' or indicate that the auditee did NOT qualify as low-risk."

Epic references:
#3364
#3994

@danswick danswick changed the title Create known data issues documentation page Document known data issues Oct 25, 2024
@gsa-suk
Contributor

gsa-suk commented Oct 25, 2024

While investigating raw historic Census data for a helpdesk ticket (audit years 2016 through 2022), the following three audits were found to have unusual dates that span multiple years.

[Image: Screenshot 2024-10-24 at 11 41 27 AM]

These audits were migrated from Census to GSA FAC. Their report_ids contain the audit year from the raw Census data shown above, but GSA FAC derives audit_year from FYENDDATE. As a result, the year in the report_id does not match the value displayed for audit_year.

report_id, audit_year for the above mentioned audits:

2021-07-CENSUS-0000182173, 2022
2019-12-CENSUS-0000193687, 2021
2016-12-CENSUS-0000244335, 2018
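
For anyone who wants to flag this kind of mismatch in their own extracts, here is a minimal sketch. It assumes the report_id layout is `YYYY-MM-SOURCE-NUMBER`, which is inferred only from the three examples above, and the function names are hypothetical rather than anything in the FAC codebase.

```python
import re

# Assumed report_id layout, inferred from the three examples in this comment:
# YYYY-MM-SOURCE-NUMBER (for example 2021-07-CENSUS-0000182173).
REPORT_ID_PATTERN = re.compile(r"^(\d{4})-(\d{2})-([A-Z]+)-(\d+)$")


def report_id_year(report_id: str) -> int:
    """Return the four-digit year embedded at the start of a report_id."""
    match = REPORT_ID_PATTERN.match(report_id)
    if match is None:
        raise ValueError(f"unexpected report_id format: {report_id!r}")
    return int(match.group(1))


def year_mismatches(rows):
    """Yield (report_id, embedded_year, audit_year) where the two years differ."""
    for report_id, audit_year in rows:
        embedded = report_id_year(report_id)
        if embedded != audit_year:
            yield report_id, embedded, audit_year


# The three audits listed in this comment.
known_rows = [
    ("2021-07-CENSUS-0000182173", 2022),
    ("2019-12-CENSUS-0000193687", 2021),
    ("2016-12-CENSUS-0000244335", 2018),
]

for report_id, embedded, audit_year in year_mismatches(known_rows):
    print(f"{report_id}: report_id year {embedded}, audit_year {audit_year}")
```

The same check could be run over a larger extract by passing any iterable of `(report_id, audit_year)` pairs to `year_mismatches`.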

@danswick
Contributor

I'm going to move this back into the backlog for now. It's something we still want to do and isn't technically blocked, but isn't a top priority at the moment. Let's continue documenting issues here in the meantime.

@danswick
Contributor

Based on some HD convos with @gsa-suk and @James-Paul-Mason, we think there are quite a few PDFs missing from the bulk PDF transfer we got from Census. I think we should create an updated list of these instances by looking for null entries in the renaming.sqlite table in this folder on Drive (access limited) or possibly some other more definitive record. @gsa-suk, would you be able to create a quick export of those records? It would be great to see:

  • Original filename
  • Audit Year
  • DBKEY
  • Destination REPORT_ID (I'm pretty sure this is included in the table we looked at the other day)
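
An export along these lines could be sketched as follows. The real schema of renaming.sqlite is not described in this thread, so the table name `renaming`, the four column names, and the choice of which column is null are all assumptions. The demo rows are made-up placeholders in an in-memory database, not real audit records.

```python
import csv
import sqlite3
import sys

# Hypothetical schema: the actual table and column names in renaming.sqlite
# are not given in this thread.
TABLE = "renaming"
EXPORT_COLUMNS = ["original_filename", "audit_year", "dbkey", "report_id"]


def rows_with_null(conn, null_column):
    """Return the export columns for rows where null_column IS NULL."""
    # Column names cannot be bound as SQL parameters, so check them explicitly.
    if null_column not in EXPORT_COLUMNS:
        raise ValueError(f"unknown column: {null_column}")
    columns = ", ".join(EXPORT_COLUMNS)
    query = (
        f"SELECT {columns} FROM {TABLE} "
        f"WHERE {null_column} IS NULL ORDER BY audit_year, dbkey"
    )
    return conn.execute(query).fetchall()


def export_csv(rows, out):
    """Write a header row followed by the data rows as CSV."""
    writer = csv.writer(out)
    writer.writerow(EXPORT_COLUMNS)
    writer.writerows(rows)


# Demo against an in-memory database with placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE renaming (original_filename TEXT, audit_year INTEGER, "
    "dbkey INTEGER, report_id TEXT)"
)
conn.executemany(
    "INSERT INTO renaming VALUES (?, ?, ?, ?)",
    [
        ("example-a.pdf", 2019, 1, "2019-06-CENSUS-0000000001"),
        (None, 2020, 2, "2020-06-CENSUS-0000000002"),
    ],
)
missing = rows_with_null(conn, "original_filename")
export_csv(missing, sys.stdout)
```

To run it against the real file, `sqlite3.connect(":memory:")` would be replaced with a connection to the local copy of renaming.sqlite, and the table and column names adjusted to match whatever the file actually contains.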

@gsa-suk
Contributor

gsa-suk commented Nov 20, 2024

@danswick I pulled the null entries from renaming.sqlite. This is the list of historical audits that are missing PDF reports: https://docs.google.com/spreadsheets/d/1i2oaDPvgdJKHwdL10M1BgHBrUKR6aqV0vW9S4aBVzag/edit?gid=14296428#gid=14296428

@gsa-suk
Contributor

gsa-suk commented Nov 21, 2024

Some historical audits have only PDF reports and no SF_SAC data. The list is here: https://docs.google.com/spreadsheets/d/1L1F-kY60vA4zxzaeM1ACD1LyLOQFd3iyGCFpzqWc2cs/edit?gid=2106877303#gid=2106877303

@Leighdiddy Leighdiddy removed the design Layout, UI, etc label Dec 11, 2024
Projects
Status: Backlog
Development

No branches or pull requests

4 participants