SCD: A Stacked Carton Dataset for Detection and Segmentation

Jinrong Yang¹ Shengkai Wu¹ Lijun Gou¹ Hangcheng Yu¹ Chenxi Lin¹ Jiazhuo Wang¹ Pan Wang¹ Minxuan Li² Xiaoping Li¹

¹State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, China.
²Faculty of Arts and Science, Queen’s University, Canada

Abstract | Paper | SCD | Network | Leaderboard | Attention

1. Abstract

Carton detection is an important technique in automated logistics systems, with applications such as stacking and unstacking cartons and unloading cartons from containers. However, no large-scale public carton dataset has so far been available for the research community to train and evaluate carton detection models, which hinders progress in carton detection. In this paper, we present a large-scale carton dataset named the Stacked Carton Dataset (SCD), with the goal of advancing the state of the art in carton detection. Images are collected from the internet and several warehouses, and objects are labeled with per-instance segmentation for precise localization. In total, there are 250,000 instance masks from 16,136 images. In addition, we design a carton detector based on RetinaNet by embedding a Boundary Guided Supervision module (BGS) and an Offset Prediction between Classification and Localization module (OPCL). OPCL alleviates the imbalance between classification and localization quality, which boosts AP by 3.1% ~ 4.7% on SCD, while BGS guides the detector to pay more attention to the boundary information of cartons and to decouple repeated carton textures. To demonstrate that OPCL generalizes to other datasets, we conduct extensive experiments on MS COCO and PASCAL VOC. The AP improvements on MS COCO and PASCAL VOC are 1.8% ~ 2.2% and 3.4% ~ 4.3%, respectively.

2. Paper

Paper on arXiv => "SCD: A Stacked Carton Dataset for Detection and Segmentation"

3. SCD

3.1 Dataset license

CC BY-NC-SA 4.0

3.2 Image examples

3.3 Annotations

Example of instance annotation in SCD. The first row shows the four-label style used in LSCD, while the second row shows the single-label style used in OSCD. In the first row, blue, green, red and yellow represent Carton-inner-all, Carton-inner-occlusion, Carton-outer-all and Carton-outer-occlusion, respectively.

3.4 Overview information of SCD

Dataset	Images	Split(training/test set)	Labels	All/Occlusion	Inner/Outer	Total Instances	Average Instances
LSCD	7,735	6,735/1,000	4&1	√	√	81,870	10.58
OSCD	8,401	7,401/1,000	1	×	×	168,748	20.09
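As a quick sanity check on the table above, the following sketch recomputes the per-image averages and the combined totals from the listed figures. The dictionary layout and the function name `average_instances` are illustrative choices, not part of the dataset tooling.

```python
# Figures copied from the overview table above.
overview = {
    "LSCD": {"images": 7_735, "train": 6_735, "test": 1_000, "instances": 81_870},
    "OSCD": {"images": 8_401, "train": 7_401, "test": 1_000, "instances": 168_748},
}


def average_instances(row):
    """Mean number of annotated instances per image, rounded to two decimals."""
    return round(row["instances"] / row["images"], 2)


for name, row in overview.items():
    # The training and test splits should add up to the image count.
    assert row["train"] + row["test"] == row["images"]
    print(name, average_instances(row))

# Combined size of SCD (LSCD + OSCD).
total_images = sum(row["images"] for row in overview.values())
total_instances = sum(row["instances"] for row in overview.values())
print(total_images, total_instances)
```

The combined image count matches the 16,136 images quoted in the abstract, and the combined instance count is consistent with the roughly 250,000 masks stated there.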

3.5 Data classification and download link

OSCD:

(1) OSCD => "Images and COCO-style labels" (password: XXXX)

LSCD:

(1) LSCD => "Images and LabelMe-style labels" (password: XXXX)

(2) LSCD => "Images and COCO-style labels(containing Carton-inner-all, Carton-inner-occlusion, Carton-outer-all and Carton-outer-occlusion)" (password: XXXX)

(3) LSCD => "Images and COCO-style labels(only containing carton)" (password: XXXX)

*Notice: The dataset is hosted on Baidu Drive. To request access, email us and state your intended use; we will send you the password within 3 days ([email protected], [email protected]).
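Once a COCO-style label file has been downloaded, a first inspection might look like the sketch below. It relies only on the standard COCO keys (`images`, `annotations`, `categories`); the helper names `load_coco` and `count_instances` and the toy dictionary are hypothetical and stand in for a real SCD annotation file, whose exact name is not given here.

```python
import json
from collections import Counter


def load_coco(path):
    """Read a COCO-style annotation file from disk."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_instances(coco):
    """Return {category name: number of annotated instances}."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(names[ann["category_id"]] for ann in coco["annotations"])
    return dict(counts)


# Tiny hand-made stand-in for an annotation file; this is NOT real SCD data.
toy = {
    "images": [{"id": 1, "width": 1000, "height": 500, "file_name": "example.jpg"}],
    "categories": [
        {"id": 1, "name": "Carton-inner-all"},
        {"id": 2, "name": "Carton-inner-occlusion"},
        {"id": 3, "name": "Carton-outer-all"},
        {"id": 4, "name": "Carton-outer-occlusion"},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 10, 200, 100]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [220, 10, 200, 100]},
        {"id": 3, "image_id": 1, "category_id": 4, "bbox": [10, 120, 150, 90]},
    ],
}

print(count_instances(toy))
```

For a downloaded file, `count_instances(load_coco(path))` would give the per-category counts, with `path` pointing at whichever annotation file the archive contains.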

3.6 Dataset statistics

The first row shows the statistical distributions for LSCD, while the second row shows those for OSCD. From left to right, the charts show the width, height, aspect ratio and pixel area of the instances, and the number of objects in each image. Note that the width, height and area of each instance are normalized by the width and height of the corresponding image. A log function is applied to normalize the aspect ratio.
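The normalization described above can be sketched as follows. This is a minimal illustration under two assumptions that the text does not spell out: the aspect ratio is taken as width divided by height, and the logarithm is the natural log. The function name `instance_stats` is hypothetical.

```python
import math


def instance_stats(box_w, box_h, area, img_w, img_h):
    """Normalized size statistics for one instance.

    box_w, box_h: instance width and height in pixels
    area:         instance area in pixels
    img_w, img_h: size of the image that contains the instance
    """
    return {
        "width": box_w / img_w,            # fraction of the image width
        "height": box_h / img_h,           # fraction of the image height
        "area": area / (img_w * img_h),    # fraction of the image area
        # Assumption: aspect ratio = width / height, natural logarithm.
        "log_aspect_ratio": math.log(box_w / box_h),
    }


stats = instance_stats(box_w=200, box_h=100, area=15_000, img_w=1000, img_h=500)
print(stats)
```

With a log-scaled aspect ratio, a carton twice as wide as it is tall and one twice as tall as it is wide land at equal distances on either side of zero, which keeps the histogram symmetric.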

4. Proposed baseline method on SCD

4.1 RetinaNet with OPCL and BGS

4.2 Baseline

Dataset	Labels	Model(training/test set)	mAP	AP50	AP75
OSCD	1	RetinaNet	72.1	90.8	80.5
OSCD	1	RetinaNet+	76.6	91.8	83.6
OSCD	1	FCOS	72.8	91.1	80.6
OSCD	1	Faster R-CNN	69.0	90.1	77.8
LSCD	1	RetinaNet	79.8	95.2	87.9
LSCD	1	RetinaNet+	84.7	95.8	89.8
LSCD	1	FCOS	76.5	93.7	84.3
LSCD	1	Faster R-CNN	77.5	94.5	86.3
LSCD	4	RetinaNet	65.7	80.4	73.0
LSCD	4	RetinaNet+	69.9	80.0	74.9
LSCD	4	FCOS	68.1	81.2	74.8
LSCD	4	Faster R-CNN	61.2	79.5	70.1
LSCD+OSCD	1	RetinaNet	82.0	95.9	89.8
LSCD+OSCD	1	RetinaNet+	86.1	96.3	91.2
LSCD+OSCD	1	FCOS	83.8	96.2	90.4
LSCD+OSCD	1	Faster R-CNN	80.6	95.7	89.2
LSCD+OSCD	4	RetinaNet	67.4	80.8	74.1
LSCD+OSCD	4	RetinaNet+	71.5	80.9	76.4
LSCD+OSCD	4	FCOS	71.1	82.0	76.8
LSCD+OSCD	4	Faster R-CNN	64.7	81.2	73.7

Comparison of detection performance between three state-of-the-art methods on SCD. For the evaluation on LSCD, both the 1-label and 4-label settings are evaluated. LSCD+OSCD means the detectors are first pre-trained on OSCD and then fine-tuned on LSCD. RetinaNet+ denotes RetinaNet trained with GIoU loss.
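For readers unfamiliar with the GIoU loss used by RetinaNet+, the sketch below computes it for a single pair of axis-aligned boxes using the standard definition, GIoU = IoU - (|C| - |U|) / |C|, where U is the union of the two boxes and C is the smallest box enclosing both. It is a plain-Python illustration, not the training code behind the numbers above, and the function names are illustrative.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection is clamped to zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss in [0, 2]: 0 for identical boxes, larger as the boxes move apart."""
    return 1.0 - giou(box_a, box_b)


print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # disjoint boxes
```

Unlike a plain IoU loss, the value keeps changing for disjoint boxes, so the regression branch still receives a useful signal when a prediction does not overlap its target.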

4.3 Main results

Main results of RetinaNet with all our proposed modules. "pretrain" means pre-training the same model on OSCD and then fine-tuning it on LSCD with an image scale of [600,1000] ([800,1333]†). "1x" means the model is trained for a total of 12 epochs.
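The image scale notation can be read in more than one way. The sketch below assumes the common detection convention in which [600,1000] means the shorter side is resized towards 600 pixels while the longer side is capped at 1000 pixels, with the aspect ratio preserved. This reading is an assumption, and `keep_ratio_size` is a hypothetical helper rather than a function from the authors' code.

```python
def keep_ratio_size(width, height, short_side=600, long_side=1000):
    """Target size under the assumed 'short side / capped long side' rule."""
    # Use the smaller of the two candidate scales so neither limit is exceeded.
    scale = min(short_side / min(width, height), long_side / max(width, height))
    return round(width * scale), round(height * scale)


# A 4:3 image is limited by its short side, a very wide one by its long side.
print(keep_ratio_size(4000, 3000))
print(keep_ratio_size(2000, 500))

# The same rule with the larger scale marked with a dagger above.
print(keep_ratio_size(4000, 3000, short_side=800, long_side=1333))
```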

5. Leaderboard

SCD-Leaderboard

If you have built a model on the training set and it performs well on the validation set, we encourage you to run it on the test set. You can submit your results to the SCD leaderboard by creating a new issue. Your results will be ranked on the leaderboard so that you can benchmark your approach against those of other researchers. We look forward to your submission. Please click here to submit.

6. ATTN

The dataset is free for academic use; please do not use it for commercial purposes. Use it at your own risk. For other purposes, please contact the corresponding author Pan Wang or Jinrong Yang ([email protected], [email protected]).
