Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Couple of changes/enhancements #135

Open
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

wouellette
Copy link

@wouellette wouellette commented Mar 19, 2019

Hi Drew,

Sorry for taking so long to get back to this, but in the mean time I have been working on my own head and I came up with more than just the "label_area" improvement suggested in the previous pull request, so decided to create a new one here to discuss the additional changes I came up with, and that I think would be of common interest, especially the support for create labels across country borders. See list of improvements below:

  • country can be provided as list() instead of single string, thus allowing the download and label generation for multiple countries in one go. This stemmed from the need to generate labels for AOIs overlapping two countries.
  • The initial "label_area" suggestion for the ml_type=classification from Add "label_area" attribute to classification.geojson #128 is taken up here.
  • Changed the check for geojson availability, because I didn't understand the initial geojson attribute of providing one's own labelled features. When would that be the case? I interpreted it as an alternative to providing the AOI instead of the bounding_box+country combination. Maybe keeping the original geojson attribute and adding an aoi attribute to cover my intended functionality could make sense.
  • Create labels based on multiple country .mbtile files (related to first point where countries can be provided as list).

Let's discuss on the common ground for this pull request, what could be taken up from what couldn't, and once we agree I can work on the tests.

Cheers,

William

with open(download_file, 'wb') as w:
for line in r:
w.write(line)
for ctr in country:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterating over country list

mbtiles_file_zoomed = op.join(dest_folder, '{}-z{!s}.mbtiles'.format(country, zoom))

if not op.exists(mbtiles_file_zoomed):
filtered_geo = kwargs.get('geojson') or op.join(dest_folder, '{}.geojson'.format(country))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I am not sure I understand why kwargs.get('geojson') is picked up here for the variable. This would assume that the geojson provided in the config.json file contains the label features on which to generate the labelled output for classification, object detection or segmentation.

However, I cannot see a use case where someone would provide their labelled features as a standalone geojson file, because then would they really need to use the label-maker?
If we keep the geojson attribute in its current defintion, I still see the value of an aoi field to provide the AOI in the form of geometries, as an alternative to the country+bbox combination.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wouellette the geojson attribute is provided so users can add a standalone GeoJSON file. label-maker still provides value here in that it will chip up the imagery and GeoJSON labels, rasterize or create object labels, etc. We need to keep this attribute as it's used in a number of existing workflows.

@@ -85,12 +85,11 @@ def cli():

# custom validation for top level keys
# require either: country & bounding_box or geojson
if 'geojson' not in config.keys() and not ('country' in config.keys() and 'bounding_box' in config.keys()):
raise Exception('either "geojson" or "country" and "bounding_box" must be present in the configuration JSON')
if not ('country' in config.keys() and 'geojson' in config.keys()) and not ('country' in config.keys() and 'bounding_box' in config.keys()):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change relates to the change of the filtered_geo variable, where I interpreted the geojson attribute to be an alternative to provide the AOI instead of country+bbox combination.

@wouellette wouellette changed the title Couple of enhancements Couple of changes/enhancements Mar 19, 2019
@drewbo
Copy link
Contributor

drewbo commented Mar 19, 2019

@wouellette 👍 on multiple countries and the label_area addition, see my comments inline for the geojson use case (the implementation is a bit kludgy but works for now)

@wouellette
Copy link
Author

Apology on the clumsy implementation, I was planning on reworking it, but just wanted to discuss the value of the changes first.

About the geojson attribute, it's well understood. Now, would an aoi geometry attribute (provided as geojson or other format) as an alternative to bounding_box make sense? I personally use it to guarantee that the labelled output has complete coverage over the expected geographic extent, and it's more natural to pass a file to configuration than a Bbox if you have a very specific area over which to generate labels.

label_bool2 = label_bool
label_area2 = label_area
if ctr == country[-1]:
json.dump(fc(features), open(op.join(dest_folder, f'classification_{zoom}.geojson'), 'w'))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added zoom to file name to identify the zoom level at which labelled output was generated.

@drewbo
Copy link
Contributor

drewbo commented Mar 20, 2019

Ha, apologies, I meant that my implementation of the geojson key was the issue, not your new code. Thanks again for the contribution and I'll take a look sometime this week

label_maker/label.py Outdated Show resolved Hide resolved
@drewbo
Copy link
Contributor

drewbo commented Apr 3, 2019

@wouellette giving a more thorough look now. Can you add a test case for this also? It breaks all of the tests now because of the country list change and change to the output; I can help with updating those

@wouellette
Copy link
Author

wouellette commented May 17, 2019

Better late than never, I updated the test cases, but the WMS endpoint is still failing because the usgs wms used for the test is not matching the fixture tile. I guess they updated the imagery in usgs to cause this mismatch?

Edit: I am working on the integration tests too FYI.

@drewbo
Copy link
Contributor

drewbo commented May 17, 2019

@wouellette awesome; if you pull in master I updated the WMS test to point to a new endpoint

@wouellette
Copy link
Author

The integration tests were also updated.

They all passed, but I had to update the label tile outputs as the former ones were reflecting the presence/absence with 1s and 0s of a class, whereas now it is the surface area which is represented (values other than 0s and 1s).

@drewbo
Copy link
Contributor

drewbo commented Jun 24, 2019

@wouellette thanks, I will likely get a chance to check this out this week; can you remove your venv files from the PR?

wouellette and others added 2 commits June 24, 2019 19:48
@drewbo
Copy link
Contributor

drewbo commented Jun 25, 2019

@wouellette gave a quick look today and looks really solid. Two other requests:

  • Can you add the ctr/ctr_idx iteration to the object detection and segmentation cases? Right now it looks like those would fail with two countries.
  • Related to the above, we should have test cases for multiple countries for all ml_types. Let me know if you'd like me to create any of those.

@wouellette
Copy link
Author

wouellette commented Jun 26, 2019

Regarding your two requests:

  1. The object detection and segmentation cases were adapted, the only thing was that the syntax for img.paste() was incorrect so I adapted it so if a .png has already been created, it pastes the box or the geometry in rather than overwrite the whole .png.

  2. I adapted the tests for an area across the border between Spain and Portugal for the standard ml_type cases (no parsing, country+bbox combo). Because the OSM QA tiles are buffered beyond the countries' borders, I had to take a larger bounding_box and take a lower zoom level to have a limited amount of outputs (24 tiles at zoom level 10). I kept the single country test at zoom level 17 for portugal for the sparse cases, as well as the geojson and aoi cases, so that both the single and double country cases were tested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants