Add random_transform_generator for augmenting images. #190
Conversation
Force-pushed from b0b0ccf to 7c5622e
# Transform the bounding boxes in the annotations.
annotations = annotations.copy()
for index in range(annotations.shape[0]):
    annotations[index, :4] = transform_aabb(transform, annotations[index, :4])
One thing I was wondering: What should the generator do if transform_aabb puts the annotation outside of the image canvas? I'm assuming they will automatically be filtered now by compute_input_output. But maybe I overlooked something?
Also, is that desirable? When the aabb ends up outside the image and it is filtered, that means that part of the image becomes background, even though it was actually an annotation.
Should we instead discard the entire image? Or should we try up to X times to get a transformation that doesn't put any annotations outside of the image (and discard the whole image if that failed X times)? Or should we clip the new bounding box to the image canvas?
My first thought is to clip the annotation to the image canvas, to preserve as much of the annotations as possible. What do you think?
I'm not sure. I have a slight feeling that it might reduce the quality of training in some cases. Maybe some very distinctive part of the thing to recognize is clipped off, or maybe the clipped thing looks more like a different object (contrived example: a face with a clown nose vs a face with a normal nose).
I think filtering it (the annotation) out entirely is more harmful than clipping. So either we make sure it doesn't occur, or we clip it in my opinion.
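A minimal sketch of the clipping option under discussion, assuming boxes are stored as [x1, y1, x2, y2] in pixel coordinates; the helper name clip_aabb is hypothetical here, and returning None stands in for the "filter the annotation out entirely" case:

import numpy as np

def clip_aabb(aabb, image_shape):
    # Clip an axis-aligned box [x1, y1, x2, y2] to the image canvas.
    # Returns None when no area is left inside the image, which
    # corresponds to filtering the annotation out entirely.
    height, width = image_shape[:2]
    x1, x2 = np.clip([aabb[0], aabb[2]], 0, width)
    y1, y2 = np.clip([aabb[1], aabb[3]], 0, height)
    if x1 >= x2 or y1 >= y2:
        return None
    return [x1, y1, x2, y2]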
Force-pushed from 7c5622e to 99c1b7b
Note: due to the holiday period, reviews are going to be slightly delayed. That also goes for this PR.
Looks good! Have a few questions in the comments, but this will be very useful.
PS: do you have a timing comparison of this method w.r.t. the method in Keras?
image = self.preprocess_image(image)

# randomly transform both image and annotations
if self.transform_generator is not None:
Why not use if self.transform_generator:?
No reason really. Do you prefer it like that?
Yeah.
Updated.
keras_retinanet/utils/transform.py (outdated)
    return [min_corner[0], min_corner[1], max_corner[0], max_corner[1]]


def _random_vector(min, max, prng = DEFAULT_PRNG):
Non-aligned default arguments should be written without spaces around the equals sign (same goes for some other places).
pep8 could be used for catching those spaces.
One problem with using pytest --pep8 for that is that we do use spaces around the parameter assignment on multi-line function calls for alignment. The pep8 test doesn't distinguish between that and this type of whitespace.
Anyway, fixed, I think.
keras_retinanet/utils/transform.py (outdated)
    ])


def random_translation(min, max, prng = DEFAULT_PRNG):
Do we want to distinguish between X/Y translation min/max?
For this module I would say yes. These are generic transform functions. My initial reasoning was that you might want to scale the max and min with the image width and height. That isn't necessary now since the data generator does the scaling itself after generating the transform.
Still, I think it's a valid use case for a generic transform function.
Damn you github, this comment thread is not outdated -.-
But this code handles X and Y the same way right? Using the same min/max?
No, min and max are 2D vectors (or a list/tuple/sequence of two elements).
Ah I see, I missed that part then. 👍
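To make that concrete, here is a minimal sketch of the per-axis behaviour under the assumptions above; translation is a hypothetical helper building a homogeneous 3x3 translation matrix, and numpy's uniform samples each axis independently when given two-element bounds:

import numpy as np

DEFAULT_PRNG = np.random

def translation(t):
    # Construct a homogeneous 2D translation matrix for t = (tx, ty).
    return np.array([
        [1, 0, t[0]],
        [0, 1, t[1]],
        [0, 0, 1],
    ])

def random_translation(min, max, prng=DEFAULT_PRNG):
    # min and max are 2D vectors, so X and Y get independent ranges:
    # e.g. min=(-10, -20), max=(10, 20) draws X from [-10, 10]
    # and Y from [-20, 20].
    return translation(prng.uniform(min, max))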
def shear(amount):
    """ Construct a homogeneous 2D shear matrix.
    # Arguments
        amount: the shear amount
Maybe make it a bit more clear what it is. Presumably it is the angle of the shear in radians?
It is an angle, though the matrix generated isn't actually a shear matrix. However, I copied it from keras since I figured we should probably use the same definition of shear as keras does, even if it is technically incorrect.
I believe the keras definition of shear compensates for the stretching that a real shear operation would do by reducing the Y component as the X component gets larger. So it's sort of halfway between a shear and a rotation.
Another thing I wasn't really happy about is that this shear function only shears parallel to the X axis. It's not very generic.
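For reference, a sketch of the two definitions being contrasted here, assuming the angle is in radians; the first matrix follows the Keras-style definition described above, while the second is a textbook shear parallel to the X axis:

import numpy as np

def keras_style_shear(angle):
    # Keras-style 'shear': the Y row is scaled by cos(angle), which
    # counteracts the stretching a true shear would cause, making
    # this halfway between a shear and a rotation.
    return np.array([
        [1, -np.sin(angle), 0],
        [0,  np.cos(angle), 0],
        [0, 0, 1],
    ])

def true_shear_x(angle):
    # A textbook shear parallel to the X axis: Y is left untouched.
    return np.array([
        [1, np.tan(angle), 0],
        [0, 1, 0],
        [0, 0, 1],
    ])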
keras_retinanet/utils/transform.py (outdated)
def random_shear(min, max, prng = DEFAULT_PRNG):
    """ Construct a random 2D shear matrix with shear angle between -max and max.
    # Arguments
        min: the minimum shear factor.
Same here.
keras_retinanet/utils/transform.py (outdated)
    ])


def random_scaling(min, max, prng = DEFAULT_PRNG):
Same here, do we want to distinguish between x/y?
As far as I can tell we only need to decide on what to do when an annotation is outside of the image canvas after transformation. When does this happen? Shouldn't the annotation always remain in the image canvas (aside from some rounding errors perhaps)?
No, rotations can easily put annotations completely or partly out of the canvas. The same is true for shearing and translation. Only flips don't have this problem really.
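To see why, here is a sketch of what transform_aabb plausibly does, consistent with the min_corner/max_corner return value quoted from keras_retinanet/utils/transform.py above: it maps all four corners through the transform and fits a new axis-aligned box around them, so nothing constrains the result to the image canvas.

import numpy as np

def transform_aabb(transform, aabb):
    # Transform an axis-aligned box [x1, y1, x2, y2] by a homogeneous
    # 3x3 matrix. All four corners are transformed, then a new
    # axis-aligned box is fitted around them; a rotated or sheared box
    # can therefore extend past the canvas even if the original did not.
    x1, y1, x2, y2 = aabb
    points = transform.dot(np.array([
        [x1, x2, x1, x2],
        [y1, y2, y2, y1],
        [1,  1,  1,  1],
    ]))
    min_corner = points.min(axis=1)
    max_corner = points.max(axis=1)
    return [min_corner[0], min_corner[1], max_corner[0], max_corner[1]]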
Two things:
Shall we move on to merging this, and worry about that edge case in a later PR?
If by the edge case you mean when bounding boxes end up outside the image, I think it should be fixed first. I have been getting that result with the current augmentations quite a lot, and it is a problem. I suppose clipping when possible, and not crashing otherwise, would be my preferred solution.
Crashing is not supposed to happen. I propose to indeed either clip for now, or leave it untouched. Personally I still believe clipping would be fine.
Yeah, I think crashing was actually fixed by a change a while back. I am in favour of merging this soon though! I have been ogling this branch for a while now, it would be awesome to be able to use this!
I also don't think it will crash now. I believe it will generate invalid annotations which will be filtered out automatically. For my part, we can make an issue to resolve the discarding/clipping problem and merge as is.
Force-pushed from 122cc31 to ca6f5b6
I tagged a 0.1 pre-release, since we seemed to be pretty stable. Rebased this branch on master. Once CI passes I'll merge.
CI passed, merging.
Discussion on what to do with the annotations partially transformed out of the canvas can continue at #223.
Add random_transform_generator for augmenting images.
This PR adds a random_transform_generator to be used for augmenting data. It replaces the use of keras.preprocessing.image.ImageDataGenerator.

The main reason for doing this is that ImageDataGenerator creates a random transformation internally and applies it directly to the image. That doesn't give us the chance to apply the same transformation to the bounding boxes.

The current workaround is to seed the global PRNG used by ImageDataGenerator with a random value and then have it transform first the image data, and then a mask for the bounding box (with the PRNG reset to the same random seed). The new bounding box is then determined from the transformed mask.

This is a rather expensive process, but also one that is slightly fragile to changes in ImageDataGenerator. It relies on the fact that ImageDataGenerator uses the global numpy PRNG, and that the number and order of generated random numbers is the same for the original image and the mask.
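A sketch of that workaround, for illustration; it assumes the Keras 2 ImageDataGenerator.random_transform(x, seed=None) signature, and the helper name and [x1, y1, x2, y2] box layout are made up here:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

def old_mask_workaround(generator, image, aabb):
    # Rasterize the box into a single-channel mask.
    x1, y1, x2, y2 = [int(round(v)) for v in aabb]
    mask = np.zeros(image.shape[:2] + (1,))
    mask[y1:y2, x1:x2] = 1.0

    # Transform image and mask with the same seed so both see the
    # exact same sequence of draws from the global PRNG.
    seed = np.random.randint(2 ** 31)
    image = generator.random_transform(image, seed=seed)
    mask = generator.random_transform(mask, seed=seed)

    # Recover the new box from the transformed mask (assumes some of
    # the mask survives the transform; an empty mask is not handled).
    ys, xs = np.nonzero(mask[..., 0] > 0.5)
    return image, [xs.min(), ys.min(), xs.max() + 1, ys.max() + 1]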
In contrast, the new random_transform_generator doesn't apply any transformations, it just generates them. The responsibility for applying the transforms to the image and bounding box has been moved into our own Generator with this PR.

random_transform_generator supports the following transformations: rotation, translation, shear, scaling, and X/Y flipping.

Notably, the one thing from ImageDataGenerator.random_transform it does not support is channel shifting, since that is not a homogeneous linear transformation in 2D.
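A minimal sketch of the generator pattern described above, with a translation-only transform for brevity; the real implementation composes more transform types, and the exact signatures here are illustrative:

import numpy as np

def random_transform(min_translation, max_translation, prng):
    # Build one random homogeneous 3x3 transform (translation only,
    # for brevity; the real generator also composes rotation, shear,
    # scaling and flips).
    tx, ty = prng.uniform(min_translation, max_translation)
    return np.array([
        [1, 0, tx],
        [0, 1, ty],
        [0, 0, 1],
    ])

def random_transform_generator(prng=None, **kwargs):
    # The generator only creates transforms; it never touches image
    # data. The data generator applies each yielded matrix to both the
    # image and its annotations, so they always see the same transform.
    if prng is None:
        # A private PRNG avoids the global-numpy-state fragility
        # described above.
        prng = np.random.RandomState()
    while True:
        yield random_transform(prng=prng, **kwargs)

# usage:
# transforms = random_transform_generator(min_translation=(-10, -10),
#                                         max_translation=(10, 10))
# transform = next(transforms)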