Create Functionality to Download Datasets #3

Open
5 tasks
TheCedarPrince opened this issue Apr 26, 2024 · 7 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed

Comments

@TheCedarPrince

TheCedarPrince commented Apr 26, 2024

Issue Description

Difficulty: Beginner

Time: 10 hours

Description: This issue aims to create a Julia function within our geospatial package that can download shapefiles from the TIGER database based on pre-defined URLs. The function should allow users to control various aspects of the downloading process, such as refreshing cached data and displaying progress bars.

Requirements

  • Download shapefiles from the TIGER database using the provided URL.
    • Consider using DataDeps.jl for this
    • Create DataDeps registration method
  • Include a refresh option to force a re-download of cached data
  • Include a progress_bar option to display download progress
  • Return the path where the data was downloaded
    • It can then be read into another function to interpret the geospatial files
  • Function tests
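
A minimal sketch of what the DataDeps registration method from the requirements could look like. The helper name register_tigerfile and the message text are hypothetical, and no checksum is supplied, so DataDeps.jl would warn about the missing hash when the data is first fetched. No post-fetch unpacking is configured, so the archive stays a .zip file on disk.

```julia
using DataDeps

# Hypothetical helper: register one TIGER archive under a caller-chosen name.
# Registering only records the dependency; nothing is downloaded until the
# data dependency is first resolved.
function register_tigerfile(name::String, url::String)
    register(DataDep(
        name,
        "TIGER/Line shapefile archive fetched from $url",  # message shown before download
        url,
        # No checksum in this sketch; DataDeps warns when one is missing.
    ))
    return name
end
```

Once registered, the data would be resolved by name through the datadep string macro that DataDeps.jl provides, which triggers the download on first use and returns the local directory afterwards.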

Expected Outcomes

The created Julia function should:

  1. Download shapefiles from the TIGER database based on the provided URL and filter options.
  2. Return the path to the shapefile
  3. Allow for controlling download behavior through additional options (progress bar, refreshing, etc.).

The API could look something like this:

function download_tigerfile(
	URL::String; 
	progress_bar::Bool = true, 
	refresh::Bool = false
)

#= 

Your awesome code here!

=#

	return download_path

end

Running it would then look like this:

julia> download_tigerfile(my_url)
100% complete!
@info Files downloaded to /home/datadeps/shapefiles
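
For illustration, here is one way the skeleton above could be filled in using only the Downloads standard library rather than DataDeps.jl. The cache_dir keyword, its default location, and the cached_path helper are assumptions made for this sketch and are not part of the proposed API.

```julia
using Downloads

# Hypothetical helper: name the local file after the last URL segment,
# which keeps the .zip extension of the archive.
cached_path(url::AbstractString, cache_dir::AbstractString) =
    joinpath(cache_dir, basename(url))

function download_tigerfile(
    url::String;
    progress_bar::Bool = true,
    refresh::Bool = false,
    cache_dir::String = joinpath(tempdir(), "tiger_shapefiles"),  # assumed location
)
    download_path = cached_path(url, cache_dir)

    # Reuse the cached copy unless a refresh was requested.
    if isfile(download_path) && !refresh
        return download_path
    end

    mkpath(cache_dir)

    # Callback with the (total, now) signature expected by Downloads.download.
    report = (total, now) -> begin
        total > 0 && print("\r", round(Int, 100 * now / total), "% complete")
        nothing
    end

    Downloads.download(url, download_path; progress = progress_bar ? report : nothing)
    progress_bar && println()
    @info "Files downloaded to $download_path"
    return download_path
end
```

With refresh = false, a second call for the same URL returns the cached path without touching the network.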
@TheCedarPrince TheCedarPrince added good first issue Good for newcomers help wanted Extra attention is needed labels Apr 26, 2024
@asinghvi17
Member

That sounds good! This would be the basic functionality, right? And then some layer on top which does geometry corrections etc.

@TheCedarPrince
Author

Exactly. I am breaking these tasks out very granularly such that they could be worked on somewhat independently. But yea, they'll compose together at some point to do exactly what you are thinking about.

@asinghvi17
Member

Note here that any downloaded zipfiles must have a .zip extension for Shapefile.jl to be able to load them properly after JuliaGeo/Shapefile.jl#113 lands. I can probably also add a dispatch for ::IO objects there, but that would be in the next release after.

@felixcremer
Member

You might want to look at GADM.jl for inspiration. It does something similar for the GADM dataset.

@TheCedarPrince
Author

That's a good call @felixcremer -- thanks to you as well as to @asinghvi17 's note about zip and Rasters.jl.

Quick question @asinghvi17: do you know of any methods for showing a progress bar of a download? After googling a bit, I couldn't really find anything in existence for Julia unfortunately.

@asinghvi17
Member

asinghvi17 commented Apr 29, 2024

I believe Downloads.download has a kwarg progress :: (total::Integer, now::Integer) --> Any, which could be given as a local closure that increments a progress bar...

@asinghvi17
Member

ProgressMeter.jl structs are mutable as well, so you can set n directly in the closure.
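
A rough sketch of how the two suggestions above might combine: a closure with the (total, now) signature sets the meter's n once the total size is known and then advances the bar. The helper names progress_callback and download_with_meter are hypothetical, and the initial target of 1 is only a placeholder until the first callback reports a real total.

```julia
using Downloads
using ProgressMeter

# Hypothetical helper: build a callback matching the (total, now) signature
# that Downloads.download accepts through its `progress` keyword.
function progress_callback(p::Progress)
    return function (total::Integer, now::Integer)
        total > 0 || return nothing  # total size not known yet
        p.n = total                  # Progress is mutable, so the target can be set late
        update!(p, now)              # move the bar to the current byte count
        return nothing
    end
end

# Hypothetical wrapper: download `url` to `output` while showing a meter.
function download_with_meter(url::AbstractString, output::AbstractString)
    p = Progress(1; desc = "Downloading: ")  # placeholder target until total arrives
    return Downloads.download(url, output; progress = progress_callback(p))
end
```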


3 participants