
Add random_transform_generator for augmenting images. #190

Merged (18 commits, Jan 15, 2018)

Conversation

@de-vri-es (Contributor) commented Dec 16, 2017:

This PR adds a random_transform_generator to be used for augmenting data. It replaces the use of keras.preprocessing.image.ImageDataGenerator.

The main reason for doing this is that ImageDataGenerator creates a random transformation internally and applies it directly to the image. That doesn't give us the chance to apply the same transformation to bounding boxes.

The current workaround is to seed the global PRNG used by ImageDataGenerator with a random value and then have it transform first the image data, and then a mask for the bounding box (with the PRNG reset to the same random seed). The new bounding box is then determined from the transformed mask.

This is not only a rather expensive process, it is also fragile to changes in ImageDataGenerator: it relies on the fact that ImageDataGenerator uses the global numpy PRNG, and that the number and order of random draws are the same for the original image and the mask.

In contrast, the new random_transform_generator doesn't apply any transformations; it only generates them. With this PR, the responsibility for applying the transforms to the image and bounding boxes moves into our own Generator.

random_transform_generator supports the following transformations:

  • rotation
  • translation
  • scaling
  • shearing
  • flips

Notably, the one thing from ImageDataGenerator.random_transform that it does not support is channel shifting, since that is not a homogeneous linear transformation in 2D.
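(For illustration, a minimal sketch of the idea: the function names follow the diff context below, but the parameters and the composition order here are assumptions, not necessarily the merged code.)

import numpy as np

DEFAULT_PRNG = np.random  # assumed module-level default PRNG


def rotation(angle):
    """ Construct a homogeneous 2D rotation matrix. """
    return np.array([
        [np.cos(angle), -np.sin(angle), 0],
        [np.sin(angle),  np.cos(angle), 0],
        [0, 0, 1],
    ])


def translation(offset):
    """ Construct a homogeneous 2D translation matrix. """
    return np.array([
        [1, 0, offset[0]],
        [0, 1, offset[1]],
        [0, 0, 1],
    ])


def random_transform_generator(min_rotation=0, max_rotation=0,
                               min_translation=(0, 0), max_translation=(0, 0),
                               prng=DEFAULT_PRNG):
    """ Yield random homogeneous 3x3 transforms; applying them is left to the caller. """
    while True:
        yield np.dot(
            rotation(prng.uniform(min_rotation, max_rotation)),
            translation(prng.uniform(min_translation, max_translation)),
        )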

# Transform the bounding boxes in the annotations.
annotations = annotations.copy()
for index in range(annotations.shape[0]):
    annotations[index, :4] = transform_aabb(transform, annotations[index, :4])
@de-vri-es (author):

One thing I was wondering: What should the generator do if transform_aabb puts the annotation outside of the image canvas? I'm assuming they will automatically be filtered now by compute_input_output. But maybe I overlooked something?

@de-vri-es (author) Dec 20, 2017:

Also, is that desirable? When the aabb ends up outside the image and it is filtered, that means that part of the image becomes background, even though it was actually an annotation.

Should we instead discard the entire image? Or should we try up to X times to get a transformation that doesn't put any annotations outside of the image (and discard the whole image if that failed X times)? Or should we clip the new bounding box with the image canvas?

Contributor:

My first thought is to clip the annotation to the image canvas, to preserve as much of the annotations as possible. What do you think?

@de-vri-es (author):

I'm not sure. I have a slight feeling it might reduce the quality of training in some cases. Maybe some very distinctive part of the object to recognize is clipped off, or maybe the clipped object looks more like a different class (contrived example: a face with a clown nose vs. a face with a normal nose).

@hgaiser (Contributor) Jan 5, 2018:

I think filtering the annotation out entirely is more harmful than clipping. So, in my opinion, either we make sure it doesn't occur, or we clip it.
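(For reference, clipping an AABB to the canvas could be as simple as the following sketch; clip_aabb is a hypothetical helper, not part of this PR.)

def clip_aabb(aabb, width, height):
    """ Clip an [x1, y1, x2, y2] box to the image canvas (hypothetical helper). """
    x1, y1, x2, y2 = aabb
    return [
        min(max(x1, 0), width),
        min(max(y1, 0), height),
        min(max(x2, 0), width),
        min(max(y2, 0), height),
    ]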

@de-vri-es (author):

Note: this should fix #68 and #181, and I think #186 as well.

@de-vri-es (author):

Note: due to the holiday period, reviews are going to be slightly delayed. That also goes for this PR.

@hgaiser (Contributor) left a review:

Looks good! Have a few questions in the comments, but this will be very useful.

P.S. Do you have a timing comparison of this method versus the Keras one?

image = self.preprocess_image(image)

# randomly transform both image and annotations
if self.transform_generator is not None:
Contributor:

Why not use if self.transform_generator: ?

@de-vri-es (author):

No reason really. Do you prefer it like that?

Contributor:

Yeah.

@de-vri-es (author):

Updated.


return [min_corner[0], min_corner[1], max_corner[0], max_corner[1]]


def _random_vector(min, max, prng = DEFAULT_PRNG):
Contributor:

Non-aligned default arguments should be written without spaces around the equals sign (same in some other places).

Another commenter:

pep8 could be used to catch those spaces.

@de-vri-es (author):

One problem with using pytest --pep8 for that is that we do use spaces around the parameter assignment on multi-line function calls for alignment. The pep8 test doesn't distinguish between that and this type of whitespace.
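(For reference: assuming pytest-pep8 is the runner, a specific check such as E251, the pycodestyle code for spaces around parameter equals, could be suppressed in setup.cfg; the downside is that genuine violations of that rule would no longer be caught either.)

[pytest]
pep8ignore = * E251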

@de-vri-es (author):

Anyway, fixed, I think.

])


def random_translation(min, max, prng = DEFAULT_PRNG):
Contributor:

Do we want to distinguish between X/Y translation min/max?

@de-vri-es (author) Jan 4, 2018:

For this module I would say yes. These are generic transform functions. My initial reasoning was that you might want to scale the max and min with the image width and height. That isn't necessary now since the data generator does the scaling itself after generating the transform.

Still, I think it's a valid use case for a generic transform function.

@de-vri-es (author):

Damn you GitHub, this comment thread is not outdated -.-

Contributor:

But this code handles X and Y the same way, right? Using the same min/max?

@de-vri-es (author):

No, min and max are 2D vectors (or a list/tuple/sequence of two elements).

Contributor:

Ah I see, I missed that part then. 👍
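(To make the 2D-vector bounds concrete, a sketch along these lines; the signature follows the diff context, the body is an assumption.)

import numpy as np

DEFAULT_PRNG = np.random

def _random_vector(min, max, prng=DEFAULT_PRNG):
    """ Draw a uniform random vector between the per-axis bounds min and max. """
    min = np.array(min)
    max = np.array(max)
    assert min.shape == max.shape and len(min.shape) == 1
    return prng.uniform(min, max)

# Example: up to 10 pixels of X translation, up to 20 pixels of Y translation.
offset = _random_vector((-10, -20), (10, 20))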

def shear(amount):
    """ Construct a homogeneous 2D shear matrix.
    # Arguments
        amount: the shear amount
Contributor:

Maybe make it a bit more clear what it is. Presumably it is the angle of the shear in radians?

@de-vri-es (author) Jan 4, 2018:

It is an angle, though the matrix generated isn't actually a shear matrix. However, I copied it from Keras, since I figured we should probably use the same definition of shear as Keras does, even if it is technically incorrect.

I believe the Keras definition of shear compensates for the stretching that a real shear operation would cause by reducing the Y component as the X component gets larger. So it's sort of halfway between a shear and a rotation.

Another thing I wasn't really happy about is that this shear function only shears parallel to the X axis. It's not very generic.
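(If memory serves, the Keras-style matrix under discussion looks like this; the cos term on Y is what makes it "halfway between a shear and a rotation" rather than a pure shear. A sketch:)

import numpy as np

def shear(angle):
    """ Construct a homogeneous 2D 'shear' matrix in the Keras sense.
    A pure X-axis shear would leave Y untouched (a 1 instead of cos(angle)). """
    return np.array([
        [1, -np.sin(angle), 0],
        [0,  np.cos(angle), 0],
        [0,  0,             1],
    ])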

def random_shear(min, max, prng = DEFAULT_PRNG):
    """ Construct a random 2D shear matrix with shear angle between -max and max.
    # Arguments
        min: the minimum shear factor.
Contributor:

Same here.

])


def random_scaling(min, max, prng = DEFAULT_PRNG):
Contributor:

Same here, do we want to distinguish between X/Y?

@hgaiser (Contributor) commented Jan 5, 2018:

As far as I can tell we only need to decide on what to do when an annotation is outside of the image canvas after transformation. When does this happen? Shouldn't the annotation always remain in the image canvas (aside from some rounding errors perhaps)?

@de-vri-es (author) commented Jan 5, 2018:

No, rotations can easily put annotations completely or partially out of the canvas. The same is true for shearing and translation. Only flips really don't have this problem.
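(To illustrate why: assuming transform_aabb maps the four corners of the box and re-fits an axis-aligned box around the result, which is what the diff context above suggests, a 45-degree rotation about the origin already produces negative coordinates. A sketch:)

import numpy as np

def transform_aabb(transform, aabb):
    """ Map the four corners of an axis-aligned box [x1, y1, x2, y2] through a
    homogeneous transform and re-fit an axis-aligned box around them (sketch). """
    x1, y1, x2, y2 = aabb
    # Homogeneous coordinates of the four corners, one corner per column.
    points = transform.dot([
        [x1, x2, x1, x2],
        [y1, y2, y2, y1],
        [1,  1,  1,  1],
    ])
    min_corner = points.min(axis=1)
    max_corner = points.max(axis=1)
    return [min_corner[0], min_corner[1], max_corner[0], max_corner[1]]

angle = np.pi / 4  # 45 degrees
rotate = np.array([
    [np.cos(angle), -np.sin(angle), 0],
    [np.sin(angle),  np.cos(angle), 0],
    [0, 0, 1],
])
print(transform_aabb(rotate, [10, 10, 20, 20]))  # x1 is negative: outside the canvas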

@yhenon (Contributor) commented Jan 5, 2018:

Two things:

  • Currently, bounding boxes that overlap an anchor with 0.4 < IoU < 0.5 are tagged as "don't care" and do not contribute to the loss. It might make sense to do something similar for boxes that are clipped by a transformation.
  • Whatever the ultimate decision, we should make the threshold user-configurable.

@hgaiser (Contributor) commented Jan 15, 2018:

Shall we move on to merging this, and worry about that edge case in a later PR?

@ghost commented Jan 15, 2018:

If by the edge case you mean bounding boxes ending up outside the image, I think it should be fixed first. I have been getting that result quite a lot with the current augmentations, and it is a problem. I suppose clipping when possible, and not crashing otherwise, would be my preferred solution.

@hgaiser (Contributor) commented Jan 15, 2018:

Crashing is not supposed to happen. I propose we indeed either clip for now or leave the boxes untouched. Personally, I still believe clipping would be fine.

@ghost commented Jan 15, 2018:

Yeah, I think crashing was actually fixed by a change a while back. I am in favour of merging this soon though! I have been ogling this branch for a while now; it would be awesome to be able to use it!

@de-vri-es (author):

I also don't think it will crash now. I believe it will generate invalid annotations which will be filtered out automatically. For my part, we can make an issue to resolve the discarding/clipping problem and merge as is.

@de-vri-es (author):

I tagged a 0.1 pre-release, since we seemed to be pretty stable.

Rebased this branch on master. Once CI passes I'll merge.

@de-vri-es (author):

CI passed, merging.

@de-vri-es (author):

Discussion on what to do with annotations partially transformed out of the canvas can continue in #223.

kazushi-fa pushed a commit to kazushi-fa/keras-retinanet_rareplanes referencing this pull request (Aug 17, 2021): Add random_transform_generator for augmenting images.