feat: par_filter_map_collect #46

claudioap · 2024-07-21T14:26:17Z

I have a use case that consists of bootstrapping a database with a subset of the planet. This subset is somewhat sparse and would not cause memory problems if it were eagerly allocated. I imagine other people have similar requirements: basically, being able to run a filter_map on the data.

One can do this serially with the supplied functions, but not in parallel, which is a requirement for anyone doing frequent runs on big dumps.

I generalized my approach in this PR. Although it is more granular than what #42 offers, it is slightly less elegant, since it assumes that you want your blobs decoded and that you want to run through OsmData, which might not be true.

I could not get flatten() to let errors through, and the workaround is not a lazy iterator. This approach can easily fill someone's RAM if the filters happen to be too broad. Nonetheless, I've been using it reliably for a while now and have only just had the time to upstream it. I hope it serves someone else.
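To illustrate the trade-off described above, here is a minimal sketch of an eager, parallel filter_map that still surfaces errors. It is not the code in this PR: it uses only `std::thread::scope`, and `Block`, `par_filter_map_collect_sketch`, and the `String` error type are hypothetical stand-ins for decoded blobs, the new function, and the crate's error type.

```rust
use std::thread;

/// Hypothetical stand-in for a decoded blob: just a list of element ids.
type Block = Vec<i64>;

/// Runs `f` over every element of every block on up to `workers` threads,
/// keeps the `Some` results, and returns an error if any block failed.
/// Everything is collected eagerly, so memory grows with the number of matches.
fn par_filter_map_collect_sketch<T, F>(
    blocks: &[Result<Block, String>],
    workers: usize,
    f: F,
) -> Result<Vec<T>, String>
where
    T: Send,
    F: Fn(i64) -> Option<T> + Sync,
{
    if blocks.is_empty() {
        return Ok(Vec::new());
    }
    let workers = workers.max(1);
    // Ceiling division: each worker gets one contiguous chunk of blocks.
    let chunk_len = (blocks.len() + workers - 1) / workers;
    let f = &f;

    let partials: Vec<Result<Vec<T>, String>> = thread::scope(|s| {
        let handles: Vec<_> = blocks
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || -> Result<Vec<T>, String> {
                    let mut kept = Vec::new();
                    for block in chunk {
                        // A failed block aborts this worker with its error.
                        let block = block.as_ref().map_err(|e| e.clone())?;
                        kept.extend(block.iter().copied().filter_map(f));
                    }
                    Ok(kept)
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    });

    // Collecting into a Result keeps the first error (in chunk order)
    // instead of silently dropping it, which is what flattening would do.
    let nested: Vec<Vec<T>> = partials.into_iter().collect::<Result<_, _>>()?;
    Ok(nested.into_iter().flatten().collect())
}
```

Called with a closure that returns `Some` only for the elements of interest, the sketch returns the matches in block order, or the error of the first failing chunk. Because nothing is lazy here, a filter that is too broad holds every match in memory at once.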

Cheers

Add par_filter_map_collect function.

d0ca186

claudioap force-pushed the par_filter_map branch from b390922 to d0ca186 · July 21, 2024 14:27
