The main goal of this repository is to rewrite the object detection pipeline with a cleaner code structure, so that it is more portable and easier to adapt to new experimental methods. The object detection pipeline is based on Ultralytics YOLOv5.
- YOLOv5 based portable model (model built with kindle)
- Model conversion (TorchScript, ONNX, TensorRT) support
- Tensor decomposition model with pruning optimization
- Stochastic Weight Averaging (SWA) support
- Auto search for NMS parameter optimization
- W&B support with model save and load functionality
- Representation learning (Experimental)
- Distillation via soft teacher method (Experimental)
- C++ inference (WIP)
- AutoML: search for an efficient architecture for the given dataset (incoming!)
Install
- A Conda virtual environment or Docker is required to set up the environment
git clone https://github.com/j-marple-dev/AYolov2.git
cd AYolov2
./run_check.sh init
# Equivalent to
# conda env create -f environment.yml
# pre-commit install --hook-type pre-commit --hook-type pre-push
./run_docker.sh build
# You can add build options
# ./run_docker.sh build --no-cache
This will mount the current repository directory from the local disk into the Docker container.
./run_docker.sh run
# You can add running options
# ./run_docker.sh run -v $DATASET_PATH:/home/user/dataset
./run_docker.sh exec
Train a model
- Example
python3 train.py --model $MODEL_CONFIG_PATH --data $DATA_CONFIG_PATH --cfg $TRAIN_CONFIG_PATH
# e.g.
# python3 train.py --model res/configs/model/yolov5s.yaml --data res/configs/data/coco.yaml --cfg res/configs/cfg/train_config.yaml
# Log and upload trained weights to W&B
# python3 train.py --model res/configs/model/yolov5s.yaml --wlog --wlog_name yolov5s
Prepare dataset
- Dataset config file
train_path: "DATASET_ROOT/images/train"
val_path: "DATASET_ROOT/images/val"

# Classes
nc: 10  # number of classes
dataset: "DATASET_NAME"
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light']  # class names
- Dataset directory structure
  - One of the labels or segments directories must exist.
  - The training label type (labels or segments) is specified in the training config.
  - Each image must have a labels or segments file with the same filename and a .txt extension.
DATASET_ROOT
│
├── images
│   ├── train
│   └── val
├── labels
│   ├── train
│   └── val
├── segments
│   ├── train
│   └── val
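The filename-matching rule above can be checked before training. The following is a minimal sketch, not part of the repository: the function name `find_unmatched_images` is hypothetical, and it assumes that every file under `images/<split>` is an image and that a missing `.txt` file is worth reporting.

```python
from pathlib import Path
from typing import List


def find_unmatched_images(
    dataset_root: str, split: str = "train", label_type: str = "labels"
) -> List[str]:
    """Return image file names under images/<split> that have no matching
    <label_type>/<split>/<stem>.txt file (hypothetical helper)."""
    root = Path(dataset_root)
    image_dir = root / "images" / split
    label_dir = root / label_type / split

    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if not image_path.is_file():
            continue  # skip nested directories
        label_path = label_dir / f"{image_path.stem}.txt"
        if not label_path.is_file():
            missing.append(image_path.name)
    return missing
```

Calling `find_unmatched_images("DATASET_ROOT", "val", "segments")` would list validation images that lack a segments file.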
Training config
- Default training configurations are defined in train_config.yaml.
- You may want to change batch_size, epochs, device, workers, and label_type to suit your model, dataset, and training hardware.
- Be cautious when changing other parameters; doing so may affect training results.
Model config
- The model is defined by a YAML file using kindle
- Please refer to https://github.com/JeiKeiLim/kindle
Multi-GPU training
- Please use the torch.distributed.run module for multi-GPU training
python3 -m torch.distributed.run --nproc_per_node $N_GPU train.py --model $MODEL_CONFIG_PATH --data $DATA_CONFIG_PATH --cfg $TRAIN_CONFIG_PATH
- N_GPU: Number of GPUs to use
Run a model validation
- Validate from local weights
python3 val.py --weights $WEIGHT_PATH --data-cfg $DATA_CONFIG_PATH
- You can pass a W&B path to the weights argument.
python3 val.py --weights j-marple/AYolov2/179awdd1 --data-cfg $DATA_CONFIG_PATH
- TTA (Test Time Augmentation)
  - Simply pass the --tta argument with the --tta-cfg path.
  - Default TTA configs are located in res/configs/cfg/tta.yaml
python3 val.py --weights $WEIGHT_PATH --data-cfg $DATA_CONFIG_PATH --tta --tta-cfg $TTA_CFG_PATH
- Validate with pycocotools (Only for COCO val2017 images)
Future work: val.py and val2.py should be merged together.
python3 val2.py --weights $WEIGHT_PATH --data $VAL_IMAGE_PATH --json-path $JSON_FILE_PATH
Name | W&B URL | img_size | mAP (val) 0.5:0.95 | mAP (val) 0.5 | params |
---|---|---|---|---|---|
YOLOv5s | j-marple/AYolov2/33cxs5tn | 640 | 38.2 | 57.5 | 7,235,389 |
YOLOv5m | j-marple/AYolov2/2ktlek75 | 640 | 45.0 | 63.9 | 21,190,557 |
YOLOv5l decomposed | j-marple/AYolov2/30t7wh1x | 640 | 46.9 | 65.6 | 26,855,105 |
YOLOv5l | j-marple/AYolov2/1beuv3fd | 640 | 48.0 | 66.6 | 46,563,709 |
YOLOv5x decomposed | j-marple/AYolov2/1gxaqgk4 | 640 | 49.2 | 67.6 | 51,512,570 |
YOLOv5x | j-marple/AYolov2/1gxaqgk4 | 640 | 49.6 | 68.1 | 86,749,405 |
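To put the parameter counts in the table into perspective, the short sketch below computes how many parameters the decomposed models save relative to their originals. The numbers are copied from the table; the helper name `param_reduction_percent` is only illustrative.

```python
# Parameter counts copied from the results table above.
PARAMS = {
    "YOLOv5l": 46_563_709,
    "YOLOv5l decomposed": 26_855_105,
    "YOLOv5x": 86_749_405,
    "YOLOv5x decomposed": 51_512_570,
}


def param_reduction_percent(original: int, decomposed: int) -> float:
    """Percentage of parameters removed by decomposition."""
    return 100.0 * (original - decomposed) / original


reduction_l = param_reduction_percent(PARAMS["YOLOv5l"], PARAMS["YOLOv5l decomposed"])
reduction_x = param_reduction_percent(PARAMS["YOLOv5x"], PARAMS["YOLOv5x decomposed"])
print(f"YOLOv5l: {reduction_l:.1f}% fewer parameters")
print(f"YOLOv5x: {reduction_x:.1f}% fewer parameters")
```

According to the table, the decomposed models have roughly 42% (YOLOv5l) and 41% (YOLOv5x) fewer parameters, at a cost of 1.1 and 0.4 points of mAP 0.5:0.95 respectively.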
Export model to TorchScript, ONNX, TensorRT
- You can export a trained model to TorchScript, ONNX, or TensorRT.
- INT8 quantization is currently not supported (coming soon).
- Usage
python3 export.py --weights $WEIGHT_PATH --type [torchscript, ts, onnx, tensorrt, trt] --dtype [fp32, fp16, int8]
- The above command generates both a model file and a model config file.
  - Example) FP16, batch size 8, image size 640x640, TensorRT
    - model_fp16_8_640_640.trt
    - model_fp16_8_640_640_trt.yaml
batch_size: 8
conf_t: 0.001  # NMS confidence threshold
dst: exp/  # Model location
dtype: fp16  # Data type
gpu_mem: 6  # GPU memory restriction
img_height: 640
img_width: 640
iou_t: 0.65  # NMS IoU threshold
keep_top_k: 100  # NMS top k parameter
model_cfg: res/configs/model/yolov5x.yaml  # Base model config location
opset: 11  # ONNX opset version
rect: false  # Rectangular inference mode
stride_size: 32  # Model stride size
top_k: 512  # Pre-NMS top k parameter
type: trt  # Model type
verbose: 1  # Verbosity level
weights: ./exp/yolov5x.pt  # Base model weight file location
- Once the model has been exported, you can run val.py with the exported model. ONNX inference is currently not supported.
python3 val.py --weights model_fp16_8_640_640.trt --data-cfg $DATA_CONFIG_PATH
Applying tensor decomposition
- A trained model can be compressed via tensor decomposition.
- A decomposed conv replaces 1 large convolution with 3 smaller convolutions.
  - Example)
    - Original conv: 64x128x3x3
    - Decomposed conv: 64x32x1x1 -> 32x16x3x3 -> 16x128x1x1
- Usage
python3 decompose_model.py --weights $WEIGHT_PATH --loss-thr $DECOMPOSE_LOSS_THRESHOLD --prune-step $PRUNING_STEP --data-cfg $DATA_CONFIG_PATH
...
[  Original] # param: 86,749,405, mAP0.5: 0.678784398716757, Speed(pre-process, inference, NMS): 0.030, 21.180, 4.223
[Decomposed] # param: 49,508,630, mAP0.5: 0.6707606125947304, Speed(pre-process, inference, NMS): 0.030, 20.274, 4.345
Decomposition config saved to exp/decompose/val/2021_0000_runs/args.yaml
Decomposed model saved to exp/decompose/val/2021_0000_runs/yolov5x_decomposed.pt
- Setting prune-step to 0 skips pruning optimization.

The decomposition proceeds as follows:
1. Pass random tensor x to original conv (ŷ) and decomposed conv (ỹ)
2. Compute E = Error(ŷ, ỹ)
3. If E < loss-thr, use decomposed conv
4. Apply pruning ratio with binary search
5. Jump to 1 until the change in pruning ratio is less than prune-step
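One possible reading of the pruning-ratio search described in the last two steps above is a plain binary search that stops once the search interval is narrower than prune-step. The sketch below is only an illustration under that assumption: `search_prune_ratio` and the `error_at` callback are hypothetical names, the error is assumed to grow with the pruning ratio, and the real logic lives in decompose_model.py.

```python
from typing import Callable


def search_prune_ratio(
    error_at: Callable[[float], float], loss_thr: float, prune_step: float
) -> float:
    """Binary-search the largest pruning ratio whose error stays below loss_thr.

    error_at(ratio) is a hypothetical callback that returns the error between
    the original conv output and the pruned, decomposed conv output.
    """
    if prune_step <= 0:
        return 0.0  # prune-step of 0 skips pruning optimization

    low, high = 0.0, 1.0  # low: known-acceptable ratio, high: assumed too aggressive
    while high - low >= prune_step:
        mid = (low + high) / 2
        if error_at(mid) < loss_thr:
            low = mid  # error acceptable: try pruning more
        else:
            high = mid  # error too large: prune less
    return low
```

For example, `search_prune_ratio(lambda ratio: ratio, loss_thr=0.3, prune_step=0.01)` uses a toy error function and returns a ratio just below 0.3.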
:: Note :: Decomposition process uses CPU only.
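To see why the decomposed convolution in the example above is smaller, the weights can be counted directly. This is a small arithmetic sketch that assumes the shape notation is input channels x output channels x kernel height x kernel width and ignores bias terms; `conv_weight_count` is an illustrative helper.

```python
def conv_weight_count(c_in: int, c_out: int, k_h: int, k_w: int) -> int:
    """Number of weights in a convolution, ignoring bias terms."""
    return c_in * c_out * k_h * k_w


# Original conv: 64x128x3x3
original = conv_weight_count(64, 128, 3, 3)

# Decomposed conv: 64x32x1x1 -> 32x16x3x3 -> 16x128x1x1
decomposed = (
    conv_weight_count(64, 32, 1, 1)
    + conv_weight_count(32, 16, 3, 3)
    + conv_weight_count(16, 128, 1, 1)
)

print(original, decomposed)  # 73728 8704
```

In this example the three small convolutions hold about 12% of the weights of the original one.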
Knowledge distillation
- An ad-hoc implementation of knowledge distillation motivated by the method in "End-to-End Semi-Supervised Object Detection with Soft Teacher".
- Create pseudo-labels for "unlabeled dataset" using the inference of the "teacher" model.
- :: Note ::
  - Implemented to use the same dataset for the "training dataset" and the "unlabeled dataset". To use different datasets, the creation of the dataloader instance unlabeled_loader in distillation.py should be modified.
  - Teacher model weights are fixed while training the student model. (In the original paper, the teacher model is updated as an exponential moving average of the student model.)
- Usage
python distillation.py --model res/configs/model/yolov5s.yaml \
    --cfg res/configs/cfg/distillation.yaml \
    --data res/configs/data/coco.yaml \
    --teacher {wandb_runpath_of_pretrained_model}
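The note above mentions that the original paper updates the teacher as an exponential moving average (EMA) of the student, whereas this implementation keeps the teacher fixed. For reference, here is a minimal sketch of such an EMA update, using plain dicts of floats in place of real model tensors. The function name `ema_update` and the default decay value are illustrative assumptions and are not taken from distillation.py or the paper.

```python
from typing import Dict


def ema_update(
    teacher: Dict[str, float], student: Dict[str, float], decay: float = 0.999
) -> Dict[str, float]:
    """Return teacher weights moved slightly toward the student weights.

    new_teacher = decay * teacher + (1 - decay) * student
    The decay default is a hypothetical value chosen for illustration.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }
```

With a decay close to 1 the teacher changes slowly, which smooths out step-to-step noise in the student weights.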
Representation learning
- Representations of a model can be automatically discovered from raw data by representation learning.
- You can apply SimpleRL or SimCLR to find better representations of the model with the --rl-type option.
  - SimpleRL is a method that minimizes the difference between the last two representations of a model with L1 loss.
  - SimCLR is a simple framework for contrastive self-supervised learning of visual representations without requiring specialized architectures.
- Usage (default: base)
  - SimpleRL
python train_repr.py --model res/configs/model/yolov5s_repr.yaml \
    --data res/configs/data/coco_repr.yaml \
    --cfg res/configs/cfg/train_config_repr.yaml \
    --rl-type base
  - SimCLR
python train_repr.py --model res/configs/model/simclr.yaml \
    --data res/configs/data/coco_repr.yaml \
    --cfg res/configs/cfg/train_config_simclr.yaml \
    --rl-type simclr
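The L1 loss that SimpleRL minimizes can be illustrated on two flat feature vectors. This sketch assumes a mean reduction over elements and uses plain Python lists in place of tensors; how train_repr.py selects and shapes the two representations is not shown here.

```python
from typing import Sequence


def l1_loss(repr_a: Sequence[float], repr_b: Sequence[float]) -> float:
    """Mean absolute difference between two representations of equal length."""
    if len(repr_a) != len(repr_b):
        raise ValueError("representations must have the same length")
    if not repr_a:
        raise ValueError("representations must not be empty")
    return sum(abs(a - b) for a, b in zip(repr_a, repr_b)) / len(repr_a)
```

The loss is zero only when the two representations are identical, so minimizing it pulls them together.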
Auto search for NMS parameters
If you want to optimize the NMS parameters (IoU threshold and confidence threshold), there are two ways to do so.
- There is an issue with YOLOv5 validation.
  - Training and validation still work, but the validation results are slightly different.
1. Optimize parameters with YOLOv5 validation.
python3 val_optimizer.py --weights ${WEIGHT_PATH | WANDB_PATH} --data-cfg $DATA_CONFIG_PATH
2. Optimize parameters with COCO validation (pycocotools).
python3 val_optimizer.py --weights ${WEIGHT_PATH | WANDB_PATH} --data-cfg $DATA_CONFIG_PATH --run-json --json-path $JSON_FILE_PATH
The --json-path argument is optional.
- If you have a baseline network, pass the --base-map50 and --base-time arguments, which are used in the objective function.
- To keep the optimized parameters from overfitting, use the --n-skip option to skip some images.
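To make the two tuned parameters concrete, here is a minimal greedy NMS sketch in plain Python. It is a textbook version for illustration only, not the NMS used by val_optimizer.py; the names `iou` and `greedy_nms` and the `(x1, y1, x2, y2)` box format are assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box], scores: Sequence[float], conf_t: float, iou_t: float
) -> List[int]:
    """Return indices of kept boxes, highest score first.

    conf_t drops low-confidence boxes before suppression; iou_t controls how
    much overlap with an already kept box is tolerated.
    """
    candidates = sorted(
        (i for i, score in enumerate(scores) if score >= conf_t),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep: List[int] = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_t for j in keep):
            keep.append(i)
    return keep
```

Raising the confidence threshold removes more boxes up front, while raising the IoU threshold lets more overlapping boxes survive, which is why the two values are tuned together.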
Applying SWA (Stochastic Weight Averaging)
There are three steps to apply SWA (Stochastic Weight Averaging):
1. Fine-tune pre-trained model
$ python train.py --model yolov5l_kindle.pt \
    --data res/configs/data/coco.yaml \
    --cfg res/configs/cfg/finetune.yaml \
    --wlog --wlog_name yolov5l_swa \
    --use_swa
2. Create SWA model
$ python create_swa_model.py --model_dir exp/train/2021_1104_runs/weights \
    --swa_model_name swa_best5.pt \
    --best_num 5
$ python create_swa_model.py --help
usage: create_swa_model.py [-h] --model_dir MODEL_DIR
                           [--swa_model_name SWA_MODEL_NAME]
                           [--best_num BEST_NUM]

optional arguments:
  -h, --help            show this help message and exit
  --model_dir MODEL_DIR
                        directory of trained models to apply SWA (default: )
  --swa_model_name SWA_MODEL_NAME
                        file name of SWA model (default: swa.pt)
  --best_num BEST_NUM   the number of trained models to apply SWA (default: 5)
3. Test SWA model
$ python val.py --weights exp/train/2021_1104_runs/weights/swa_best5.pt \
    --model-cfg '' \
    --data-cfg res/configs/data/coco.yaml \
    --conf-t 0.1 --iou-t 0.2
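The core idea behind the SWA model is an element-wise average of the weights of several checkpoints. The sketch below shows that idea with plain dicts of float lists standing in for real checkpoints. It assumes an equal-weight average, and `average_weights` is a hypothetical name; how create_swa_model.py selects and loads the best checkpoints is not shown.

```python
from typing import Dict, List, Sequence

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def average_weights(checkpoints: Sequence[Weights]) -> Weights:
    """Element-wise mean of the given checkpoints (equal weighting assumed)."""
    if not checkpoints:
        raise ValueError("at least one checkpoint is required")

    count = len(checkpoints)
    averaged: Weights = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(checkpoint[name] for checkpoint in checkpoints))
        averaged[name] = [sum(values) / count for values in columns]
    return averaged
```

For example, averaging `{"w": [1.0, 2.0]}` and `{"w": [3.0, 4.0]}` gives `{"w": [2.0, 3.0]}`.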
[1] Ultralytics YOLOv5 - https://github.com/ultralytics/yolov5
[2] YOLOR implementation - https://github.com/WongKinYiu/yolor.git
[3] MobileViT implementation - https://github.com/chinhsuanwu/mobilevit-pytorch
[4] Kindle - Making a PyTorch model easier than ever! - https://github.com/JeiKeiLim/kindle
[5] Wang, Chien-Yao, I-Hau Yeh, and Hong-Yuan Mark Liao. "You Only Learn One Representation: Unified Network for Multiple Tasks." arXiv preprint arXiv:2105.04206 (2021).
[6] Mehta, Sachin, and Mohammad Rastegari. "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer." arXiv preprint arXiv:2110.02178 (2021).
[7] Ghiasi, Golnaz, et al. "Simple copy-paste is a strong data augmentation method for instance segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
[8] SWA Object Detection implementation - https://github.com/hyz-xmaster/swa_object_detection
[9] Izmailov, Pavel, et al. "Averaging weights leads to wider optima and better generalization." arXiv preprint arXiv:1803.05407 (2018).
[10] Zhang, Haoyang, et al. "Swa object detection." arXiv preprint arXiv:2012.12645 (2020).
[11] Xu, Mengde, et al. "End-to-End Semi-Supervised Object Detection with Soft Teacher." arXiv preprint arXiv:2106.09018 (2021).
[12] He, Kaiming, et al. "Momentum contrast for unsupervised visual representation learning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
[13] Chen, Ting, et al. "A simple framework for contrastive learning of visual representations." International conference on machine learning. PMLR, 2020.
[14] Grill, Jean-Bastien, et al. "Bootstrap your own latent: A new approach to self-supervised learning." arXiv preprint arXiv:2006.07733 (2020).
[15] Roh, Byungseok, et al. "Spatially consistent representation learning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
[16] PyTorch tensor decompositions - https://github.com/jacobgil/pytorch-tensor-decompositions
[17] PyTorch pruning tutorial - https://pytorch.org/tutorials/intermediate/pruning_tutorial.html
[18] Bengio, Yoshua et al. "Representation Learning: A Review and New Perspectives." IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013.
[19] Chen, Ting et al. "A Simple Framework for Contrastive Learning of Visual Representations." Proceedings of the 37th International Conference on Machine Learning. 2020
[20] Batched NMS - https://github.com/ultralytics/yolov3/blob/f915bf175c02911a1f40fbd2de8494963d4e7914/utils/utils.py#L562-L563
[21] Fast NMS - https://github.com/ultralytics/yolov3/blob/77e6bdd3c1ea410b25c407fef1df1dab98f9c27b/utils/utils.py#L557-L559
[22] Matrix NMS - ultralytics/yolov3#679 (comment)
[23] Merge NMS - https://github.com/ultralytics/yolov5/blob/master/utils/general.py#L710-L722
[24] Cluster NMS - https://github.com/Zzh-tju/yolov5/blob/master/utils/general.py#L689-L774
Thanks goes to these wonderful people (emoji key):
Jongkuk Lim 💻 |
Haneol Kim 💻 |
Hyungseok Shin 💻 |
Hyunwook Kim 💻 |
This project follows the all-contributors specification. Contributions of any kind are welcome!