Polygon queries and how to not have to filter out an enormous amount of data #127

Open
n-a-t-e opened this issue Jul 6, 2021 · 1 comment

Comments


n-a-t-e commented Jul 6, 2021

Here is an example of a situation where the downloader would have to download a large amount of data and discard lots of it, because we first download the polygon's bounding box from ERDDAP and then filter the result down to our polygon. So far the only high-frequency dataset with many profiles is the ADCP data. Are there other examples?

[Image: Screen Shot 2021-07-06 at 4 01 51 PM]
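
For reference, the two-step approach described above (fetch the bounding box, then filter to the polygon) can be sketched roughly like this. It is only an illustration, not the downloader's actual code: the function names, the `(longitude, latitude)` tuple layout, and the row dictionaries are all assumptions, and the point-in-polygon test treats coordinates as planar.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude); assumed layout


def bounding_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) for the polygon."""
    lons = [p[0] for p in polygon]
    lats = [p[1] for p in polygon]
    return min(lons), min(lats), max(lons), max(lats)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test, treating lon/lat as planar coordinates."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_rows(rows: List[Dict[str, float]], polygon: List[Point]) -> List[Dict[str, float]]:
    """Keep only rows whose position falls inside the polygon.

    `rows` stands in for whatever the bounding-box request returned;
    the "longitude"/"latitude" keys are hypothetical.
    """
    return [
        row for row in rows
        if point_in_polygon((row["longitude"], row["latitude"]), polygon)
    ]
```

Every row inside the bounding box but outside the polygon is downloaded and then thrown away, which is the cost this issue is about. The longer and thinner the polygon, the larger that wasted share becomes.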

I looked into whether we could download by passing profile IDs to the downloader instead of the polygon. The API would take the polygon and send a list of matching profile IDs to the downloader. Initially I thought this wouldn't work because of URL length limits, and that turns out to be mostly true: Apache has a ~8k character limit on the request line by default, which translates to ~600 IOS profile IDs max. The exact number would depend entirely on the dataset, as some could have long profile IDs.

So we could limit how many profiles can be downloaded at once, but that gets complicated. E.g. there is one point that has 3356 profiles when no time filter is applied, so the API would have to pass 3356 profile IDs to the downloader, and the downloader would have to split the download into 6 requests. And that's for a single point.
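
To make the splitting concrete, here is a minimal sketch of how a list of profile IDs could be packed into batches that each fit under a URL length budget. Nothing here is existing downloader code: `batch_profile_ids`, the 8000-character limit, the 200 characters reserved for the rest of the URL, and the comma separator are all assumptions chosen for illustration.

```python
from typing import Iterable, List
from urllib.parse import quote


def batch_profile_ids(
    profile_ids: Iterable[str],
    max_url_chars: int = 8000,   # assumed budget, just under Apache's ~8k default
    base_url_chars: int = 200,   # assumed room for host, path and other parameters
    separator: str = ",",        # assumed delimiter between IDs in the query
) -> List[List[str]]:
    """Greedily pack IDs into batches whose URL-encoded length fits the budget."""
    sep_cost = len(quote(separator, safe=""))
    budget = max_url_chars - base_url_chars
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for pid in profile_ids:
        cost = len(quote(pid, safe=""))  # length after percent-encoding
        if cost > budget:
            raise ValueError(f"profile ID too long for a single request: {pid!r}")
        extra = cost + (sep_cost if current else 0)
        if current and used + extra > budget:
            # Current batch is full; start a new one with this ID.
            batches.append(current)
            current = []
            used = 0
            extra = cost
        current.append(pid)
        used += extra
    if current:
        batches.append(current)
    return batches
```

With made-up 10-character IDs and the defaults above, 3356 IDs pack into 6 batches of at most 600 IDs each, in line with the rough numbers in this comment. Real profile IDs vary in length per dataset, so the batch sizes would vary too.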

n-a-t-e changed the title from "polygon search by profile ID" to "polygon download by profile ID" on Jul 6, 2021
n-a-t-e changed the title from "polygon download by profile ID" to "polygon download by profile IDs" on Jul 6, 2021
n-a-t-e changed the title from "polygon download by profile IDs" to "Polygon queries and how to not have to filter out an enormous amount of data" on Jul 6, 2021
n-a-t-e closed this as completed on Jul 6, 2021

n-a-t-e commented Jul 7, 2021

Coincidentally, this was just added to Bob's TODO list for ERDDAP. It would be added as a server-side function.

https://groups.google.com/g/erddap/c/L0mTUFqldmA/m/mP4-3UjOAAAJ?utm_medium=email&utm_source=footer

and

ERDDAP/erddap#55

n-a-t-e reopened this on Jul 7, 2021