This is a fork of the image_bbox_slicer package, designed to make the tiling more accurate, to avoid losing pixels on the edges of images, and to allow the user to sample some proportion of 'empty' tiles (tiles that do not include any object of interest). It also avoids creating tiles that will not be saved, to speed up the tiling operation.
CAVEATS: Currently the `slice_by_size()` function is working (plus all of the functions that it depends on), but I haven't tested everything to make sure that I didn't break something else.
The main differences are:
- The original package discarded any pixels that fall outside an even multiple of the tile size. That wastes a lot of data if the tiles are large. Each image is now padded with zeros out to an even multiple of tile size before tiling it so no data is lost, and the padding works correctly if the images are of different sizes.
- The tile overlap math, tile size calculations, and row and column indexes were fixed to make them precisely correct (instead of various rough approximations, truncations, etc.).
- Fixed a problem with float values in annotations that caused a display error.
- Built in the capability to sample a variable proportion of empty tiles.
- Revamped tile naming so that tiles are named with row and column indexes to make future reassembly easier.
- Modified the code so tiles that will not be saved are not created in the first place, to save memory and CPU cycles.
- The tiled images display in the correct row and column relative to the original image, and show padding (the placement of tiles in the original package was approximate, relative to the source image).
- Images (but not yet annotations) can now be found by recursive search; i.e., the im_src directory can be pointed at a parent directory that contains subdirectories with images in them.
- Bounding box fragments that are smaller than the tile overlap in the appropriate dimension can be excluded by setting `exclude_fragments=True` (see below).
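The padding arithmetic described in the first bullet can be sketched as follows. This is an illustrative calculation, not the package's own code: `padded_length` and `tile_origins` are hypothetical helpers, and the overlap is assumed here to be given in pixels.

```python
import math

def padded_length(length, tile, overlap=0):
    """Return (padded_length, n_tiles) such that tiles of size `tile`,
    each overlapping its neighbour by `overlap` pixels, cover the image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap  # distance between consecutive tile origins
    n_tiles = max(1, math.ceil((length - overlap) / stride))
    return n_tiles * stride + overlap, n_tiles

def tile_origins(length, tile, overlap=0):
    """Pixel offsets at which each tile starts along one dimension."""
    _, n_tiles = padded_length(length, tile, overlap)
    stride = tile - overlap
    return [i * stride for i in range(n_tiles)]

# A 1000 x 500 image cut into 418 x 279 tiles with no overlap.
padded_w, cols = padded_length(1000, 418)
padded_h, rows = padded_length(500, 279)
print(padded_w, padded_h, rows, cols)
```

Without padding, the same image would yield only two whole tiles (2 columns by 1 row); padding out to 1254 x 558 yields all six.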
The problem: When tiling images to generate training data, the associated bounding boxes must also be tiled. Splitting bounding boxes across tiles creates box fragments on one side of a tile boundary. When creating training data for a deep learning model, it is desirable to exclude small box fragments because they are very unlikely to contain a recognizable object. Overlapping the tiles does not eliminate the problem.
The method implemented here removes all box fragments smaller than the tile overlap, while ensuring that no feature smaller than the tile overlap is excluded entirely (such a feature falls completely within at least one of the overlapping tiles).
In the diagram below, the large box is a tile and the overlap with adjacent tiles is shown by dotted lines. The `exclude_fragments=True` option excludes the bounding boxes shown below in yellow (A, B, and C) from the tile. For simplicity, bounding boxes are only shown on the bottom and right of the tile in the image, but the behavior is identical for the top and left edges of the tile, respectively.
Decision rule: Given that a rectangular bounding box can have 1, 2, or 4 (but not 3) corners inside a rectangular tile if they are aligned:
- If 1 corner of the box is in the tile:
  - if (box_w < tile_overlap_w) or (box_h < tile_overlap_h): discard the box
- If 2 corners of the box are in the tile:
  - if the box is on the left or right side of the tile and (box_w < tile_overlap_w): discard
  - if the box is on the top or bottom of the tile and (box_h < tile_overlap_h): discard
- Include the bounding box in all other cases.
In the figure, the green bounding boxes would all be included in the annotations for this tile. Box D will be recorded in this tile only; box E will be sliced in half and will be recorded (with overlap) in this tile and the tile below it. Box F will be recorded in both this tile and the tile below. Box G will appear on this tile and in the tile to the right (minus the portion to the left of the dotted line). Boxes A, B and C will not be recorded in this tile, but will be recorded on adjacent tiles.
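The decision rule can be sketched in a few lines of Python. This is a minimal illustration rather than the package's implementation: `keep_box` is a hypothetical function, boxes and tiles are assumed to be `(xmin, ymin, xmax, ymax)` tuples in image pixel coordinates, and `box_w` / `box_h` are interpreted as the dimensions of the fragment that falls inside the tile.

```python
def keep_box(box, tile, overlap_w, overlap_h):
    """Return True if the box (or its fragment) should be recorded in the tile."""
    bx0, by0, bx1, by1 = box
    tx0, ty0, tx1, ty1 = tile

    # Count how many corners of the box lie inside the tile.
    corners = [(bx0, by0), (bx1, by0), (bx0, by1), (bx1, by1)]
    n_inside = sum(tx0 <= x <= tx1 and ty0 <= y <= ty1 for x, y in corners)

    # Size of the part of the box that falls inside the tile (assumption:
    # this clipped fragment is what the rule calls box_w and box_h).
    frag_w = min(bx1, tx1) - max(bx0, tx0)
    frag_h = min(by1, ty1) - max(by0, ty0)

    if n_inside == 1:
        # The box straddles a corner of the tile: both dimensions are clipped.
        return not (frag_w < overlap_w or frag_h < overlap_h)
    if n_inside == 2:
        # The box straddles exactly one edge of the tile.
        crosses_left_or_right = bx0 < tx0 or bx1 > tx1
        if crosses_left_or_right:
            return frag_w >= overlap_w
        return frag_h >= overlap_h
    # All other cases (for example, all 4 corners inside): keep the box.
    return True

tile = (0, 0, 100, 100)
print(keep_box((90, 40, 130, 60), tile, overlap_w=20, overlap_h=20))
print(keep_box((70, 40, 130, 60), tile, overlap_w=20, overlap_h=20))
```

In the first call only a 10-pixel-wide sliver lies inside the tile, which is narrower than the 20-pixel overlap, so the fragment is discarded; in the second call the fragment is 30 pixels wide and is kept.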
This easy-to-use library is a data transformer sometimes useful in Object Detection tasks. It splits images and their bounding box annotations into tiles, both into specific sizes and into any arbitrary number of equal parts. It can also resize them, both by specific sizes and by a resizing/scaling factor. Read the docs here.
Currently, this library only supports bounding box annotations in PASCAL VOC format. And as of now, there is no command line execution support. Please raise an issue if needed.
UPDATE: This tool was only tested on Linux/Ubuntu. Please find a potential fix to make it work on Windows here.
#Install this fork:
$ pip install git+https://github.com/jcpayne/image_bbox_tiler@master
#The original package is on PyPI. To install it instead of this fork:
$ pip install image_bbox_slicer
Works with Python 3.4 and higher versions and requires:
Pillow==5.4.1
numpy==1.16.2
pascal-voc-writer==0.1.4
matplotlib==3.0.3
Note: This usage demo can be found in `demo.ipynb` in the repo.
import image_bbox_tiler as ibs
You must configure paths to source and destination directories like the following.
im_src = './src/images'
an_src = './src/annotations'
im_dst = './dst/images'
an_dst = './dst/annotations'
slicer = ibs.Slicer()
slicer.config_dirs(img_src=im_src, ann_src=an_src,
                   img_dst=im_dst, ann_dst=an_dst)
The above images show the difference in slicing with and without partial labels. In the image on the left, all the box annotations masked in green are called Partial Labels.
Configure your slicer to either keep or ignore them by setting the `Slicer` object's `keep_partial_labels` instance variable to `True` or `False`, respectively. By default it is set to `False`.
slicer.keep_partial_labels = True
An empty tile is a tile with no "labels" in it. The definition of "labels" here is tightly coupled with the user's preference for partial labels. If you choose to keep the partial labels (i.e. `keep_partial_labels = True`), a tile with a partial label is not treated as empty. If you choose not to keep the partial labels (i.e. `keep_partial_labels = False`), a tile that contains only partial labels is considered empty.
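The relationship between the two settings can be summarized with a small sketch. `tile_is_empty` is a hypothetical helper written for illustration only; it is not part of the package's API.

```python
def tile_is_empty(n_full_labels, n_partial_labels, keep_partial_labels=False):
    """Decide whether a tile counts as empty.

    n_full_labels:    boxes that lie entirely inside the tile
    n_partial_labels: boxes that are cut by the tile boundary
    """
    if keep_partial_labels:
        # Partial labels count, so any label at all makes the tile non-empty.
        return n_full_labels + n_partial_labels == 0
    # Partial labels are dropped, so only full labels count.
    return n_full_labels == 0

# A tile holding two partial labels and no full labels:
print(tile_is_empty(0, 2, keep_partial_labels=True))
print(tile_is_empty(0, 2, keep_partial_labels=False))
```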
Configure your slicer to either ignore or consider empty tiles by setting the `Slicer` object's `ignore_empty_tiles` instance variable to `True` or `False`, respectively. By default it is set to `True`.
New in image_bbox_tiler: you can sample a proportion of the empty tiles by setting `empty_sample` to a float in the range [0, 1].
slicer.ignore_empty_tiles = False
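To illustrate what sampling a proportion of empty tiles means, here is a minimal sketch. It assumes the empty tiles are identified by `(row, col)` indexes and that the selection is made before any tile is created; `sample_empty_tiles` and its `seed` argument are hypothetical and do not describe how the package handles `empty_sample` internally.

```python
import random

def sample_empty_tiles(empty_tiles, empty_sample, seed=None):
    """Pick a proportion `empty_sample` of the empty tiles to keep."""
    if not 0.0 <= empty_sample <= 1.0:
        raise ValueError("empty_sample must be in the range [0, 1]")
    rng = random.Random(seed)
    n_keep = round(len(empty_tiles) * empty_sample)
    # Sample without replacement, then sort so output order is stable.
    return sorted(rng.sample(empty_tiles, n_keep))

empty = [(row, col) for row in range(2) for col in range(5)]  # 10 empty tiles
kept = sample_empty_tiles(empty, empty_sample=0.3, seed=0)
print(len(kept), "of", len(empty), "empty tiles kept")
```

Choosing the tiles by index first means that only the selected tiles need to be cropped and written to disk.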
You can choose to store the mapping between file names of the images before and after slicing by setting the `Slicer` object's `save_before_after_map` instance variable to `True`. By default it is set to `False`.
Typically, `mapper.csv` looks like the following:
| old_name | new_names |
|------------|---------------------------------|
| 2102 | 000001, 000002, 000003, 000004 |
| 3931 | 000005, 000006, 000007, 000008 |
| test_image | 000009, 000010, 000011, 000012 |
| ... | ... |
slicer.save_before_after_map = True
slicer.slice_by_number(number_tiles=4)
slicer.visualize_sliced_random()
slicer.slice_by_size(tile_size=(418,279), tile_overlap=0)
slicer.visualize_sliced_random()
Note: `visualize_sliced_random()` randomly picks a recently sliced image from the directory for plotting.
slicer.slice_images_by_number(number_tiles=4)
slicer.slice_images_by_size(tile_size=(418,279), tile_overlap=0)
slicer.slice_bboxes_by_number(number_tiles=4)
slicer.slice_bboxes_by_size(tile_size=(418,279), tile_overlap=0)
slicer.resize_by_size(new_size=(500,200))
slicer.visualize_resized_random()
slicer.resize_by_factor(resize_factor=0.05)
slicer.visualize_resized_random()
Note: `visualize_resized_random()` randomly picks a recently resized image from the destination directory for plotting.
slicer.resize_images_by_size(new_size=(500,200))
slicer.resize_images_by_factor(resize_factor=0.05)
slicer.resize_bboxes_by_size(new_size=(500,200))
slicer.resize_bboxes_by_factor(resize_factor=0.05)
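For reference, the effect of resizing on a bounding box can be sketched as below. These helpers are hypothetical and only show the underlying arithmetic; they assume boxes are `(xmin, ymin, xmax, ymax)` tuples and that sizes are given as `(width, height)`.

```python
def resize_bbox_by_size(box, old_size, new_size):
    """Scale a box when the image is resized from old_size to new_size."""
    xmin, ymin, xmax, ymax = box
    old_w, old_h = old_size
    new_w, new_h = new_size
    scale_x, scale_y = new_w / old_w, new_h / old_h
    return (round(xmin * scale_x), round(ymin * scale_y),
            round(xmax * scale_x), round(ymax * scale_y))

def resize_bbox_by_factor(box, resize_factor):
    """Scale a box when both image dimensions are multiplied by one factor."""
    return tuple(round(value * resize_factor) for value in box)

# A box in a 1000 x 400 image that is resized to 500 x 200.
print(resize_bbox_by_size((100, 50, 300, 150), (1000, 400), (500, 200)))
print(resize_bbox_by_factor((100, 50, 300, 150), 0.5))
```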