-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AI-1260][internal] add loading of polygon support for object detection datasets #679
[AI-1260][internal] add loading of polygon support for object detection datasets #679
Conversation
…-of-polygon-support-for-object-detection-datasets
…bject-detection datasets, also added better error msg
AI-1260 Add loading of polygon support for object-detection datasets in darwin-py
The polygon datasets has bounding box annotations but those can not be loaded with obejct-detection dataloades currently in darwin-py. Lets add the feature to load BOTH object-detection and instance-segmentation annoations with the object-detection dataloader. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some changes, and one query
017b1c0
to
99cb219
Compare
…-of-polygon-support-for-object-detection-datasets
…oad from multiple types, local_dataset has the logic to add bonding_boxes
…-of-polygon-support-for-object-detection-datasets
@@ -64,20 +63,6 @@ def __init__( | |||
split_type: str = "random", | |||
release_name: Optional[str] = None, | |||
): | |||
assert dataset_path is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactoring the init function to make ruff happy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this refactoring - thanks ruff!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The major changes is that extract_classes and get_classes has the same default behavior.
FThe logic of loading polygon annotations with bounding_box are placed a level higher in stratified sampling and local_dataset. This is done by enabling loading of a list of annotation_types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks sound to me, I think QA needs to be solid, and Jon should ideally have a glance, because this changes code I'm not as familiar with, that he probably wrote :)
raise ValueError(f"Could not find any {SUPPORTED_IMAGE_EXTENSIONS} file", f" in {images_dir}") | ||
|
||
assert len(self.images_path) == len(self.annotations_path) | ||
def _initial_setup(self, dataset_path, release_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the extraction here, even if it is only to make Ruff happy!
@@ -238,102 +285,97 @@ def get_coco_format_record( | |||
image_id: Optional[Union[str, int]] = None, | |||
classes: Optional[List[str]] = None, | |||
) -> Dict[str, Any]: | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reason behind removing the docblock? Internal function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's strange, its there in my code. Will see if I can get it back on the github as well.
@@ -93,7 +100,10 @@ def extract_classes(annotations_path: Path, annotation_type: str) -> Tuple[Dict[ | |||
continue | |||
|
|||
for annotation in annotation_file.annotations: | |||
if annotation.annotation_class.annotation_type != annotation_type: | |||
if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this may be inconsistent with this logic. Here it seems "complex_polygon" will not be considered in creating the class list (based on how this is called in make_class_lists), while in the linked logic "complex_polygon" is considered for an annotation.annotation_class.annotation_type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure which is better: remove complex_polygon from linked logic or add it in make_class_lists
Unless this is an intentional difference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great input, I have not changed anything considering that logic. So the behavior should be the same as main darwin-py in that regard. BUT, that does not stop us from fixing it if there is an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest, we skip this for now. It might be a bug, but it would be a pre-existing condition and we have the same behavior as before. Besides, I think that complex-polygons are getting deprecated.
We could raise a bug report thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Disclamer: really long PR to review, so I might have missed things. Haven't tested it myself.
@@ -64,20 +63,6 @@ def __init__( | |||
split_type: str = "random", | |||
release_name: Optional[str] = None, | |||
): | |||
assert dataset_path is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this refactoring - thanks ruff!
Problem
In the initial implementation, when loading classes for datasets labeled with the
bounding_box
annotation type, it wasn't taking into consideration that thepolygon
annotations could also be relevant because they contain bounding box information in their annotations.Solution
We modified the load_classes function to incorporate the following logic:
Aka, we now load BOTH bounding_box and polygon classes that has been exported per default.
Changelog