Refactor for publishing
fcdl94 authored and fcdl94 committed Nov 22, 2021
1 parent d3c0d1c commit e89439a
Showing 20 changed files with 196 additions and 2,965 deletions.
119 changes: 109 additions & 10 deletions README.md
@@ -1,15 +1,114 @@
# FSS
## Few Shot Learning in Semantic Segmentation
# Prototype-based Incremental Few-Shot Semantic Segmentation
### Fabio Cermelli, Massimiliano Mancini, Yongqin Xian, Zeynep Akata, Barbara Caputo -- BMVC 2021 (Poster) [Link](https://arxiv.org/abs/2012.01415)
#### Official PyTorch Implementation

# How to download data
![teaser](https://raw.githubusercontent.com/fcdl94/FSS/master/images/teaser.pdf)

> cd <target folder>
> ../data/download_voc.sh
# How to run the training
Semantic segmentation models have two fundamental weaknesses: i) they require large training sets with costly pixel-level annotations, and ii) they have a static output space, constrained to the classes of the training set. Toward addressing both problems, we introduce a new task, Incremental Few-Shot Segmentation (iFSS). The goal of iFSS is to extend a pretrained segmentation model with new classes from few annotated images and without access to old training data. To overcome the limitations of existing models in iFSS, we propose Prototype-based Incremental Few-Shot Segmentation (PIFS), which couples prototype learning and knowledge distillation. PIFS exploits prototypes to initialize the classifiers of new classes, fine-tuning the network to refine its feature representation. We design a prototype-based distillation loss on the scores of both old and new class prototypes to avoid overfitting and forgetting, and use batch renormalization to cope with non-i.i.d. few-shot data. We create an extensive benchmark for iFSS showing that PIFS outperforms several few-shot and incremental learning methods in all scenarios.

> python -m torch.distributed.launch --nproc_per_node="total GPUs" train.py --data_root "folder where you downloaded the data" --name "name of exp" --batch_size=4 --num_workers=1 --other_args
![method](https://raw.githubusercontent.com/fcdl94/FSS/master/images/method.pdf)

The default folder for the logs is logs/"name of exp". The log is in the format of tensorboard.
## How to run
### Requirements
We have simple requirements:
The main requirements are:
```
python > 3.1
pytorch > 1.6
```
If you want to install a custom environment for this code, you can run the following commands using [conda](https://docs.conda.io/projects/conda/en/latest/commands/install.html):
```
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
conda install tensorboard
conda install jupyter
conda install matplotlib
conda install tqdm
conda install imageio
The default is to use a pretrained backbone, which is searched for in the pretrained folder of the project. If you don't want to use a pretrained backbone, please use --no-pretrained.
pip install inplace-abn
conda install -c conda-forge pickle5
```

### Datasets
The benchmark includes two datasets: Pascal-VOC 2012 and COCO (objects only).
For the COCO dataset, we followed the COCO-Stuff splits and annotations, which you can find [here](https://github.com/nightrome/cocostuff/).

To download the datasets, use the scripts `data/download_voc.sh` and `data/download_coco.sh`.

To use the COCO-Stuff annotations in our setting, you should preprocess them by running the provided script. \
Please remember to change the path in the script before launching it!
`python data/coco/make_annotation.py`

Finally, if your datasets are in a different folder, make a soft-link from the target dataset to the data folder (a minimal example is sketched after the tree below).
We expect the following tree:
```
/data/voc/dataset
/annotations
<Image-ID>.png
/images
<Image-ID>.png
/data/coco/dataset
/annotations
/train2017
<Image-ID>.png
/val2017
<Image-ID>.png
/images
/train2017
<Image-ID>.png
/val2017
<Image-ID>.png
```
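
For instance, if the datasets already live elsewhere on disk, the links could be created as in the sketch below (the source paths are placeholders for your local folders; the destination should match the `--data_root` you pass to the training script):
```
# Placeholder source paths: adapt them to where you extracted the datasets.
mkdir -p data/voc data/coco
ln -s /path/to/PascalVOC12/dataset data/voc/dataset
ln -s /path/to/COCO/dataset data/coco/dataset
```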

### ImageNet Pretrained Models
After setting up the dataset, download the models pretrained on ImageNet using [InPlaceABN](https://github.com/mapillary/inplace_abn).
[Download](https://drive.google.com/file/d/1rQd-NoZuCsGZ7_l_X9GO1GGiXeXHE8CT/view) the ResNet-101 model (it is the only one we need, but you can also [download other networks](https://github.com/mapillary/inplace_abn) if you want to change the backbone).
Then, put the pretrained model in the `pretrained` folder.
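For example (the checkpoint name below is only a placeholder for whatever file name the download gives you):
```
mkdir -p pretrained
mv ~/Downloads/<resnet101_checkpoint>.pth.tar pretrained/
```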


### Run!
We provide different scripts to run the experiments (see the `run` folder).
In the following, we describe their basic structure.

First, you should run the base step (or step 0).
```
exp --method FT --name FT --epochs 30 --lr 0.01 --batch_size 24
```
In this example, we run the fine-tuning method (FT). For other methods (COS, SPN, DWI, RT), simply change the method name.
WI and PIFS rely on the COS base step (step 0), while FT, AMP, LWF, ILT, and MIB rely on the FT one.
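
For reference, the `exp` alias defined in the run scripts (e.g., `coco.sh`) expands to a `torch.distributed.launch` call on `run.py`, so the base step can also be launched directly. The sketch below assumes a single GPU on VOC; the `--task` flag name mirrors the run scripts and is an assumption, so adjust it and the other flags to your setup:
```
port=$(python get_free_port.py)
python -m torch.distributed.launch --master_port ${port} --nproc_per_node=1 run.py \
    --dataset voc --task 5-0 --step 0 \
    --method FT --name FT --epochs 30 --lr 0.01 --batch_size 24
```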

After this, you can run the incremental steps.
There are a few options: (i) the task, (ii) the number of images (n_shot), and (iii) the sampling split (i_shot); a combined example is sketched after the list below.

i) The list of tasks is:
```
voc:
5-0, 5-1, 5-2, 5-3
coco:
20-0, 20-1, 20-2, 20-3
```
For multi-step, you can append an `m` after the task (e.g., `5-0m`).

ii) We tested 1, 2, and 5 shots. You can specify the number with the `nshot` option.

iii) We used three random samplings. You can specify which one with the `ishot` option.
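
Putting the three options together, an incremental step might look like the sketch below (the `--task`, `--nshot`, and `--ishot` flag names follow the option names above and the run scripts; treat them as assumptions if your local argparser differs):
```
exp --method PIFS --name PIFS --task 5-0 --step 1 --nshot 1 --ishot 0
```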

The training produces output on the terminal and logs to tensorboard in the `logs/<Exp_Name>` folder.
After training, it appends a row to the csv file `logs/results/<dataset>/<task>.csv`.
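To inspect the tensorboard logs you can run, for example:
```
tensorboard --logdir logs/<Exp_Name>
```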

## Qualitative Results
![qual-voc](https://raw.githubusercontent.com/fcdl94/FSS/master/images/qual_voc2.pdf)
![qual-coco](https://raw.githubusercontent.com/fcdl94/FSS/master/images/qual_coco2.pdf)

## Cite us!
Please cite the following article when referring to this code/method.
```
@InProceedings{cermelli2020prototype,
title={Prototype-based Incremental Few-Shot Semantic Segmentation},
author={Cermelli, Fabio and Mancini, Massimiliano and Xian, Yongqin and Akata, Zeynep and Caputo, Barbara},
booktitle={Proceedings of the 32nd British Machine Vision Conference},
month={November},
year={2021}
}
```
63 changes: 11 additions & 52 deletions argparser.py
@@ -16,10 +16,9 @@ def modify_command_options(opts):
if opts.backbone is None:
opts.backbone = 'resnet101'

if opts.method == "GIFS":
if opts.method == "PIFS":
opts.method = "WI"
opts.norm_act = "iabr"
# opts.l2_loss = 0.1 if opts.l2_loss == 0 else opts.l2_loss
opts.loss_kd = 10
opts.dist_warm_start = True
elif opts.method == 'LWF':
@@ -38,7 +37,7 @@ def modify_command_options(opts):
opts.train_only_novel = True
opts.train_only_classifier = True
opts.method = "FT"
opts.lr_cls = 1 # need to check!
opts.lr_cls = 1
elif opts.method == 'AFHN' and opts.step > 0:
opts.train_only_novel = True
opts.train_only_classifier = True
@@ -66,15 +65,14 @@ def get_argparser():
help="random seed (default: 42)")
parser.add_argument("--num_workers", type=int, default=2,
help='number of workers (default: 2)')
parser.add_argument('--opt_level', type=str, choices=['O0', 'O1', 'O2', 'O3'], default='O0')
parser.add_argument("--device", type=int, default=None,
help='Specify the device you want to use.')

# Dataset Options
parser.add_argument("--data_root", type=str, default="data",
help="path to Dataset")
parser.add_argument("--dataset", type=str, default='voc',
choices=['voc', 'cts', 'coco', 'coco-stuff'], help='Name of dataset')
choices=['voc', 'coco', 'coco-stuff'], help='Name of dataset')

# Task Options
parser.add_argument("--step", type=int, default=0,
@@ -162,10 +160,6 @@ def get_argparser():
help='Use this to enable last BN+ReLU on Deeplab-v3 (def. False)')
parser.add_argument("--no_pooling", default=False, action='store_true',
help='Use this to DIS-enable Pooling in Deeplab-v3 (def. False)')
parser.add_argument("--hnm", default=False, action='store_true',
help='Use this to enable Hard Negative Mining (def. False)')
parser.add_argument("--focal", default=False, action='store_true',
help='Use this to enable Focal Loss (def. False)')

# Test and Checkpoint options
parser.add_argument("--test", action='store_true', default=False,
@@ -179,6 +173,7 @@
parser.add_argument("--cross_val", action='store_true', default=False,
help="If validate on training or on validation (default: Val)")

# Checkpoint to start in IL steps
parser.add_argument("--step_ckpt", default=None, type=str,
help="path to trained model at previous step. Leave it None if you want to use def path")

@@ -188,6 +183,7 @@
parser.add_argument("--embedding", type=str, default="fastnvec", choices=['word2vec', 'fasttext', 'fastnvec'])
parser.add_argument("--amp_alpha", type=float, default=0.25,
help='Alpha value for the proxy adaptation.')
# parameters for IL methods
parser.add_argument("--mib_ce", default=False, action='store_true',
help='Use the MiB classification loss (Def No)')
parser.add_argument("--init_mib", default=False, action='store_true',
@@ -196,22 +192,17 @@
help='The MiB distillation loss strength (Def 0.)')
parser.add_argument("--loss_kd", default=0, type=float,
help='The distillation loss strength (Def 0.)')
parser.add_argument("--ort_proto", default=0, type=float,
help='The ORT*PROTO loss strength (Def 0.)')
parser.add_argument("--kd_alpha", default=1, type=float,
help='The temperature vale (Def 1.)')
help='The temperature value of KD loss (Def 1.)')
# other distillation choices on features
parser.add_argument("--l2_loss", default=0, type=float,
help='The MSE feature loss strength (Def 0.)')
help='The MSE feature (Deeplab-output) loss strength (Def 0.)')
parser.add_argument("--loss_de", default=0, type=float,
help='The MSE-body feature loss strength (Def 0.)')
help='The MSE on body (resnet-output) feature loss strength (Def 0.)')
parser.add_argument("--l1_loss", default=0, type=float,
help='The L1 feature loss strength (Def 0.)')
parser.add_argument("--cos_loss", default=0, type=float,
help='The feature loss strength (Def 0.)')
parser.add_argument("--bkg_dist", default=0, type=float,
help='The feature loss strength (Def 0.)')
parser.add_argument("--kl_div", default=False, action='store_true',
help='Use true KL loss and not the CE loss.')
help='The Cosine distillation on feature loss strength (Def 0.)')
parser.add_argument("--ckd", default=False, action='store_true',
help='Use cosine KD loss and not the CE loss.')
parser.add_argument("--dist_warm_start", default=False, action='store_true',
@@ -225,42 +216,10 @@
help="Train only the classifier of current step (default: False)")
parser.add_argument("--bn_momentum", default=None, type=float,
help="The BN momentum (Set to 0.1 to update of running stats of ABR.)")

# Parameters for DWI
parser.add_argument("--dyn_lr", default=1., type=float,
help='LR for DynWI (Def 1)')
parser.add_argument("--dyn_iter", default=1000, type=int,
help='Iterations for DynWI (Def 1000)')

parser.add_argument("--gen_acloss", action='store_true', default=False,
help='Use BKG loss for generation (Def False)')
parser.add_argument("--gen_lr", default=0.00001, type=float,
help='LR for Generation (Def 1e-5)')
parser.add_argument("--gen_alpha", default=1., type=float,
help='CrossEntropy Weight (Def 1)')
parser.add_argument("--gen_iter", default=10000, type=int,
help='Iterations for Generation (Def 1e5)')
parser.add_argument("--gen_ncritic", default=5, type=int,
help='Number of critic iterations (Def 5)')
parser.add_argument("--gen_pixtopix", action='store_true', default=False,
help='Use PixToPix Generator')
parser.add_argument("--gen_fgpp", action='store_true', default=False,
help='Use Feature Generator ++')
parser.add_argument("--gen_cond_gan", action='store_true', default=False,
help='Use Conditional GAN Discriminator')
parser.add_argument("--gen_mib", action='store_true', default=False,
help='Use MiB CrossEntropy on Generator.')
parser.add_argument("--gen_nlayer", default=0, type=int,
help='Number of Res layers to use in generator.')
parser.add_argument("--ngf", default=64, type=int,
help='Feature Generator Size (def 64)')
parser.add_argument("--type", default=3, type=int,
help='Type of generator input.')
# to remove
parser.add_argument("--pixel_imprinting", action='store_true', default=False,
help="Use only a pixel for imprinting when with WI (default: False)")
parser.add_argument("--weight_mix", action='store_true', default=False,
help="When doing WI, sum to proto the mix of old weights (default: False)")



return parser
4 changes: 2 additions & 2 deletions coco.sh
@@ -1,8 +1,8 @@
#!/bin/bash

export CUDA_VISIBLE_DEVICES=$1
port=$2
alias exp="python -m torch.distributed.launch --master_port ${port} --nproc_per_node=1 run.py --opt_level O0"
port=$(python get_free_port.py)
alias exp="python -m torch.distributed.launch --master_port ${port} --nproc_per_node=1 run.py"
shopt -s expand_aliases

task=$3 # 20-0, 20-1, 20-2, 20-3