Refactor for publishing
fcdl94 authored and fcdl94 committed Nov 22, 2021
1 parent d3c0d1c commit e89439a
Showing 20 changed files with 196 additions and 2,965 deletions.
119 changes: 109 additions & 10 deletions README.md
@@ -1,15 +1,114 @@
# FSS
## Few Shot Learning in Semantic Segmentation
# Prototype-based Incremental Few-Shot Semantic Segmentation
### Fabio Cermelli, Massimiliano Mancini, Yongqin Xian, Zeynep Akata, Barbara Caputo -- BMVC 2021 (Poster) [Link](https://arxiv.org/abs/2012.01415)
#### Official PyTorch Implementation

# How to download data
![teaser](https://raw.githubusercontent.com/fcdl94/FSS/master/images/teaser.pdf)

> cd <target folder>
> ../data/download_voc.sh
# How to run the training
Semantic segmentation models have two fundamental weaknesses: i) they require large training sets with costly pixel-level annotations, and ii) they have a static output space, constrained to the classes of the training set. Toward addressing both problems, we introduce a new task, Incremental Few-Shot Segmentation (iFSS). The goal of iFSS is to extend a pretrained segmentation model with new classes from few annotated images and without access to old training data. To overcome the limitations of existing models in iFSS, we propose Prototype-based Incremental Few-Shot Segmentation (PIFS), which couples prototype learning and knowledge distillation. PIFS exploits prototypes to initialize the classifiers of new classes, fine-tuning the network to refine its feature representation. We design a prototype-based distillation loss on the scores of both old and new class prototypes to avoid overfitting and forgetting, and use batch renormalization to cope with non-i.i.d. few-shot data. We create an extensive benchmark for iFSS showing that PIFS outperforms several few-shot and incremental learning methods in all scenarios.

> python -m torch.distributed.launch --nproc_per_node="total GPUs" train.py --data_root "folder where you downloaded the data" --name "name of exp" --batch_size=4 --num_workers=1 --other_args
![method](https://raw.githubusercontent.com/fcdl94/FSS/master/images/method.pdf)

The default folder for the logs is logs/"name of exp". The log is in the format of tensorboard.
## How to run
### Requirements
We have simple requirements:
The main requirements are:
```
python > 3.1
pytorch > 1.6
```
If you want to install a custom environment for this code, you can run the following commands using [conda](https://docs.conda.io/projects/conda/en/latest/commands/install.html):
```
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
conda install tensorboard
conda install jupyter
conda install matplotlib
conda install tqdm
conda install imageio
The default is to use a pretrained backbone, which is searched for in the pretrained folder of the project. If you don't want to use a pretrained backbone, please use --no-pretrained.
pip install inplace-abn
conda install -c conda-forge pickle5
```

### Datasets
The benchmark includes two datasets: Pascal-VOC 2012 and COCO (objects only).
For the COCO dataset, we followed the COCO-Stuff splits and annotations, which you can find [here](https://github.com/nightrome/cocostuff/).

To download the datasets, use the scripts `data/download_voc.sh` and `data/download_coco.sh`.

To use the COCO-Stuff annotations in our setting, you should preprocess them by running the provided script. \
Please remember to change the path in the script before launching it!
`python data/coco/make_annotation.py`

Finally, if your datasets are in a different folder, make a soft-link from the target dataset to the data folder (a minimal example is sketched after the tree below).
We expect the following tree:
```
/data/voc/dataset
/annotations
<Image-ID>.png
/images
<Image-ID>.png
/data/coco/dataset
/annotations
/train2017
<Image-ID>.png
/val2017
<Image-ID>.png
/images
/train2017
<Image-ID>.png
/val2017
<Image-ID>.png
```
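
For instance, if the datasets already live elsewhere on disk, the links could be created as in the sketch below (the source paths are placeholders for your local folders; the destination should match the `--data_root` you pass to the training script):
```
# Placeholder source paths: adapt them to where you extracted the datasets.
mkdir -p data/voc data/coco
ln -s /path/to/PascalVOC12/dataset data/voc/dataset
ln -s /path/to/COCO/dataset data/coco/dataset
```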

### ImageNet Pretrained Models
After setting up the dataset, download the models pretrained on ImageNet using [InPlaceABN](https://github.com/mapillary/inplace_abn).
[Download](https://drive.google.com/file/d/1rQd-NoZuCsGZ7_l_X9GO1GGiXeXHE8CT/view) the ResNet-101 model (it is the only one we need, but you can also [download other networks](https://github.com/mapillary/inplace_abn) if you want to change the backbone).
Then, put the pretrained model in the `pretrained` folder.
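For example (the checkpoint name below is only a placeholder for whatever file name the download gives you):
```
mkdir -p pretrained
mv ~/Downloads/<resnet101_checkpoint>.pth.tar pretrained/
```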


### Run!
We provide different scripts to run the experiments (see the `run` folder).
In the following, we describe their basic structure.

First, you should run the base step (or step 0).
```
exp --method FT --name FT --epochs 30 --lr 0.01 --batch_size 24
```
In this example, we run the fine-tuning method (FT). For other methods (COS, SPN, DWI, RT), simply change the method name.
WI and PIFS rely on the COS base step (step 0), while FT, AMP, LWF, ILT, and MIB rely on the FT one.
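
For reference, the `exp` alias defined in the run scripts (e.g., `coco.sh`) expands to a `torch.distributed.launch` call on `run.py`, so the base step can also be launched directly. The sketch below assumes a single GPU on VOC; the `--task` flag name mirrors the run scripts and is an assumption, so adjust it and the other flags to your setup:
```
port=$(python get_free_port.py)
python -m torch.distributed.launch --master_port ${port} --nproc_per_node=1 run.py \
    --dataset voc --task 5-0 --step 0 \
    --method FT --name FT --epochs 30 --lr 0.01 --batch_size 24
```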

After this, you can run the incremental steps.
There are a few options: (i) the task, (ii) the number of images (n_shot), and (iii) the sampling split (i_shot); a combined example is sketched after the list below.

i) The list of tasks is:
```
voc:
5-0, 5-1, 5-2, 5-3
coco:
20-0, 20-1, 20-2, 20-3
```
For multi-step, you can append an `m` after the task (e.g., `5-0m`).

ii) We tested 1, 2, and 5 shots. You can specify the number with the `nshot` option.

iii) We used three random samplings. You can specify which one with the `ishot` option.
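
Putting the three options together, an incremental step might look like the sketch below (the `--task`, `--nshot`, and `--ishot` flag names follow the option names above and the run scripts; treat them as assumptions if your local argparser differs):
```
exp --method PIFS --name PIFS --task 5-0 --step 1 --nshot 1 --ishot 0
```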

The training produces output on the terminal and logs to tensorboard in the `logs/<Exp_Name>` folder.
After training, it appends a row to the csv file `logs/results/<dataset>/<task>.csv`.
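To inspect the tensorboard logs you can run, for example:
```
tensorboard --logdir logs/<Exp_Name>
```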

## Qualitative Results
![qual-voc](https://raw.githubusercontent.com/fcdl94/FSS/master/images/qual_voc2.pdf)
![qual-coco](https://raw.githubusercontent.com/fcdl94/FSS/master/images/qual_coco2.pdf)

## Cite us!
Please cite the following article when referring to this code/method.
```
@InProceedings{cermelli2020prototype,
title={Prototype-based Incremental Few-Shot Semantic Segmentation},
author={Cermelli, Fabio and Mancini, Massimiliano and Xian, Yongqin and Akata, Zeynep and Caputo, Barbara},
booktitle={Proceedings of the 32nd British Machine Vision Conference},
month={November},
year={2021}
}
```
63 changes: 11 additions & 52 deletions argparser.py
@@ -16,10 +16,9 @@ def modify_command_options(opts):
if opts.backbone is None:
opts.backbone = 'resnet101'

if opts.method == "GIFS":
if opts.method == "PIFS":
opts.method = "WI"
opts.norm_act = "iabr"
# opts.l2_loss = 0.1 if opts.l2_loss == 0 else opts.l2_loss
opts.loss_kd = 10
opts.dist_warm_start = True
elif opts.method == 'LWF':
@@ -38,7 +37,7 @@ def modify_command_options(opts):
opts.train_only_novel = True
opts.train_only_classifier = True
opts.method = "FT"
opts.lr_cls = 1 # need to check!
opts.lr_cls = 1
elif opts.method == 'AFHN' and opts.step > 0:
opts.train_only_novel = True
opts.train_only_classifier = True
@@ -66,15 +65,14 @@ def get_argparser():
help="random seed (default: 42)")
parser.add_argument("--num_workers", type=int, default=2,
help='number of workers (default: 2)')
parser.add_argument('--opt_level', type=str, choices=['O0', 'O1', 'O2', 'O3'], default='O0')
parser.add_argument("--device", type=int, default=None,
help='Specify the device you want to use.')

# Dataset Options
parser.add_argument("--data_root", type=str, default="data",
help="path to Dataset")
parser.add_argument("--dataset", type=str, default='voc',
choices=['voc', 'cts', 'coco', 'coco-stuff'], help='Name of dataset')
choices=['voc', 'coco', 'coco-stuff'], help='Name of dataset')

# Task Options
parser.add_argument("--step", type=int, default=0,
@@ -162,10 +160,6 @@ def get_argparser():
help='Use this to enable last BN+ReLU on Deeplab-v3 (def. False)')
parser.add_argument("--no_pooling", default=False, action='store_true',
help='Use this to DIS-enable Pooling in Deeplab-v3 (def. False)')
parser.add_argument("--hnm", default=False, action='store_true',
help='Use this to enable Hard Negative Mining (def. False)')
parser.add_argument("--focal", default=False, action='store_true',
help='Use this to enable Focal Loss (def. False)')

# Test and Checkpoint options
parser.add_argument("--test", action='store_true', default=False,
@@ -179,6 +173,7 @@
parser.add_argument("--cross_val", action='store_true', default=False,
help="If validate on training or on validation (default: Val)")

# Checkpoint to start in IL steps
parser.add_argument("--step_ckpt", default=None, type=str,
help="path to trained model at previous step. Leave it None if you want to use def path")

@@ -188,6 +183,7 @@
parser.add_argument("--embedding", type=str, default="fastnvec", choices=['word2vec', 'fasttext', 'fastnvec'])
parser.add_argument("--amp_alpha", type=float, default=0.25,
help='Alpha value for the proxy adaptation.')
# parameters for IL methods
parser.add_argument("--mib_ce", default=False, action='store_true',
help='Use the MiB classification loss (Def No)')
parser.add_argument("--init_mib", default=False, action='store_true',
@@ -196,22 +192,17 @@
help='The MiB distillation loss strength (Def 0.)')
parser.add_argument("--loss_kd", default=0, type=float,
help='The distillation loss strength (Def 0.)')
parser.add_argument("--ort_proto", default=0, type=float,
help='The ORT*PROTO loss strength (Def 0.)')
parser.add_argument("--kd_alpha", default=1, type=float,
help='The temperature vale (Def 1.)')
help='The temperature value of KD loss (Def 1.)')
# other distillation choices on features
parser.add_argument("--l2_loss", default=0, type=float,
help='The MSE feature loss strength (Def 0.)')
help='The MSE feature (Deeplab-output) loss strength (Def 0.)')
parser.add_argument("--loss_de", default=0, type=float,
help='The MSE-body feature loss strength (Def 0.)')
help='The MSE on body (resnet-output) feature loss strength (Def 0.)')
parser.add_argument("--l1_loss", default=0, type=float,
help='The L1 feature loss strength (Def 0.)')
parser.add_argument("--cos_loss", default=0, type=float,
help='The feature loss strength (Def 0.)')
parser.add_argument("--bkg_dist", default=0, type=float,
help='The feature loss strength (Def 0.)')
parser.add_argument("--kl_div", default=False, action='store_true',
help='Use true KL loss and not the CE loss.')
help='The Cosine distillation on feature loss strength (Def 0.)')
parser.add_argument("--ckd", default=False, action='store_true',
help='Use cosine KD loss and not the CE loss.')
parser.add_argument("--dist_warm_start", default=False, action='store_true',
@@ -225,42 +216,10 @@
help="Train only the classifier of current step (default: False)")
parser.add_argument("--bn_momentum", default=None, type=float,
help="The BN momentum (Set to 0.1 to update of running stats of ABR.)")

# Parameters for DWI
parser.add_argument("--dyn_lr", default=1., type=float,
help='LR for DynWI (Def 1)')
parser.add_argument("--dyn_iter", default=1000, type=int,
help='Iterations for DynWI (Def 1000)')

parser.add_argument("--gen_acloss", action='store_true', default=False,
help='Use BKG loss for generation (Def False)')
parser.add_argument("--gen_lr", default=0.00001, type=float,
help='LR for Generation (Def 1e-5)')
parser.add_argument("--gen_alpha", default=1., type=float,
help='CrossEntropy Weight (Def 1)')
parser.add_argument("--gen_iter", default=10000, type=int,
help='Iterations for Generation (Def 1e5)')
parser.add_argument("--gen_ncritic", default=5, type=int,
help='Number of critic iterations (Def 5)')
parser.add_argument("--gen_pixtopix", action='store_true', default=False,
help='Use PixToPix Generator')
parser.add_argument("--gen_fgpp", action='store_true', default=False,
help='Use Feature Generator ++')
parser.add_argument("--gen_cond_gan", action='store_true', default=False,
help='Use Conditional GAN Discriminator')
parser.add_argument("--gen_mib", action='store_true', default=False,
help='Use MiB CrossEntropy on Generator.')
parser.add_argument("--gen_nlayer", default=0, type=int,
help='Number of Res layers to use in generator.')
parser.add_argument("--ngf", default=64, type=int,
help='Feature Generator Size (def 64)')
parser.add_argument("--type", default=3, type=int,
help='Type of generator input.')
# to remove
parser.add_argument("--pixel_imprinting", action='store_true', default=False,
help="Use only a pixel for imprinting when with WI (default: False)")
parser.add_argument("--weight_mix", action='store_true', default=False,
help="When doing WI, sum to proto the mix of old weights (default: False)")



return parser
4 changes: 2 additions & 2 deletions coco.sh
@@ -1,8 +1,8 @@
#!/bin/bash

export CUDA_VISIBLE_DEVICES=$1
port=$2
alias exp="python -m torch.distributed.launch --master_port ${port} --nproc_per_node=1 run.py --opt_level O0"
port=$(python get_free_port.py)
alias exp="python -m torch.distributed.launch --master_port ${port} --nproc_per_node=1 run.py"
shopt -s expand_aliases

task=$3 # 20-0, 20-1, 20-2, 20-3