Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify Socrata API script and allow multiple assets to be updated #745

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

wrridgeway
Copy link
Member

@wrridgeway wrridgeway commented Feb 26, 2025

This started as trying to allow this script to accept multiple assets, but working with it was a chore and it was clear it could be simplified and improved.

This PR really just shifts a lot of the looping and year handling to the query building and upload functions and removes a lot of overwrite parameter handling involving a count variable by only allowing overwrite to be true the first time it's handled.

@@ -36,6 +36,8 @@ def parse_years(years):
Make sure the years environmental variable is formatted correctly.
"""

if years == "":
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to add a bit more error handling. Probably won't ever be needed.

Comment on lines -224 to -233
session.get(
(
"https://datacatalog.cookcountyil.gov/resource/"
+ asset_id
+ ".json?$limit=1"
),
headers={"X-App-Token": app_token},
).raise_for_status()

session.get(url=url, headers={"X-App-Token": app_token}).raise_for_status()
Copy link
Member Author

@wrridgeway wrridgeway Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were checking for a bad URL status twice.

Comment on lines +240 to +243
overwrite = False
print(response.content)

# Return the updated count so that if this function is called in a loop
# the updated count persists.
return count
overwrite = False
Copy link
Member Author

@wrridgeway wrridgeway Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensuring overwrite can never be true more than once per asset.

Comment on lines 132 to 136
conditional if `years` is non-empty. Many of the CCAO's open data assets are
Build a dictionary of Athena compatible SQL queries and their associated years. Many of the CCAO's open data assets are
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the meat of what's changing here.

@wrridgeway wrridgeway marked this pull request as ready for review February 26, 2025 21:09
@wrridgeway wrridgeway requested a review from a team as a code owner February 26, 2025 21:09
@wrridgeway wrridgeway changed the title Allow multiple socrata asset uploads Further simplify socrata API script Feb 26, 2025
@@ -199,7 +199,7 @@ exposures:
name: Data Department
meta:
test_row_count: true
asset_id: vgzx-68gb
asset_id: 3sp4-s4qg
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be reverted before merging.

@wrridgeway wrridgeway marked this pull request as draft February 27, 2025 15:16
@wrridgeway wrridgeway changed the title Further simplify socrata API script Further simplify socrata API script and allow multiple assets to be updated Feb 27, 2025
@wrridgeway wrridgeway changed the title Further simplify socrata API script and allow multiple assets to be updated Simplify socrata API script and allow multiple assets to be updated Feb 27, 2025
@wrridgeway wrridgeway changed the title Simplify socrata API script and allow multiple assets to be updated Simplify Socrata API script and allow multiple assets to be updated Feb 27, 2025
Comment on lines +9 to +21
Comma-separated Socrata assets to update. Options are:
Appeals |
Assessed Values |
Parcel Addresses |
Parcel Proximity |
Parcel Sales |
Parcel Status |
Permits |
Property Tax-Exempt Parcels |
Parcel Universe (Historic) |
Parcel Universe (Current Year) |
Residential Condominium Unit Characteristics |
Single and Multi-Family Improvement Characteristics
Copy link
Member Author

@wrridgeway wrridgeway Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is gross, but I can't think of a better way to inform users of their choices since multiple choice isn't an input type and the github actions drop down menu doesn't seem to respect yaml line break formatting.

@@ -88,6 +101,9 @@ def get_asset_info(socrata_asset):
Simple helper function to retrieve asset-specific information from dbt.
"""

if not os.path.isdir("./dbt"):
Copy link
Member Author

@wrridgeway wrridgeway Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only helpful locally, unnecessary for github deployment.

# Comma separated list of years
type: string
description: Years to update or overwrite (comma-separated)
default: 'all'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the default and explained user's choices.

@wrridgeway wrridgeway marked this pull request as ready for review February 28, 2025 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant