Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: refactor nzshm_common.location.location.py around reading from CSV files #39

Open
DrCuriosity opened this issue Apr 12, 2024 · 1 comment

Comments

@DrCuriosity
Copy link
Collaborator

There is work going in for get_locations(locations, resolution) on issue #36 (PR #37) that checks incoming values against the filesystem to see if they might be a CSV file to load from.

This might be better off split into a separate get_locations_csv(file, resolution) function for a few reasons:

  • Speculatively calling Path(location_id).exists() on every incoming location ID might cause performance issues when a large number of locations are being requested
  • There's the possibility that files could exist that have the same name as a location ID string, leading to unexpected errors or erroneous results
  • Security-wise, it's good to guard against people probing a system via unexpected avenues. If for example I was calling this library function with input passed through from a web form, errors from something like getLocations(["CHC", "AKL, ".dockerenv"]) could tell an attacker whether the code was running in a Docker container.

A separate get_locations_csv(file_location, resolution) function would make interacting with the host file system more intentional. It also provides us with an opportunity to take in either a Path or a file-like object (with a type under typing.TextIO) to read from, which would allow users to read CSV data in from non-traditional sources (e.g. S3 buckets) as well as from their local filesystems.

@DrCuriosity DrCuriosity changed the title Refactor nzshm_common.location.location.py around reading from CSV files Proposal: refactor nzshm_common.location.location.py around reading from CSV files Apr 12, 2024
@DrCuriosity
Copy link
Collaborator Author

Consider as a nice-to-have: overriding lat/lon column names on ingest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants