简体中文 | English

MS-TCN : Video Action Segmentation Model


Introduction

MS-TCN is a classic video action segmentation model, published at CVPR 2019. We optimized the official PyTorch implementation and reproduced it in PaddleVideo, with results comparable to or better than the paper on most metrics (see the tables in the Test section).


MS-TCN Overview

Data

MS-TCN can be trained on the 50salads, breakfast, or gtea dataset. For dataset download and preparation, please refer to the Video Action Segmentation dataset doc.

Train

After preparing the dataset, run the following script.

# gtea dataset
export CUDA_VISIBLE_DEVICES=3
python3.7 main.py  --validate -c configs/segmentation/ms_tcn/ms_tcn_gtea.yaml --seed 1538574472
  • Start training with the command above. No pretrained model is needed. Video action segmentation models are usually fully convolutional networks. Because videos differ in length, DATASET.batch_size is usually set to 1, i.e. samples are not batched. At present, only single-sample training is supported.
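
The batch-size constraint above can be illustrated with a minimal sketch. The array shapes and the `make_batch` helper are hypothetical and are not taken from PaddleVideo; the point is only that feature sequences of different lengths cannot be stacked into one tensor without padding or cropping.

```python
import numpy as np

# Hypothetical per-video feature arrays shaped (feature_dim, num_frames).
# The feature dimension is shared, but each video has its own frame count.
video_a = np.zeros((64, 100), dtype=np.float32)
video_b = np.zeros((64, 150), dtype=np.float32)


def make_batch(videos):
    """Stack videos along a new batch axis; requires identical shapes."""
    return np.stack(videos, axis=0)


# A single video always forms a valid batch of size 1.
single = make_batch([video_a])
print(single.shape)

# Videos of different lengths cannot be stacked as they are.
try:
    make_batch([video_a, video_b])
    stacked = True
except ValueError as err:
    stacked = False
    print("cannot batch videos of different lengths:", err)
```

Padding with a mask would be one way to support larger batches, but that is outside what this document describes.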

Test

Test MS-TCN with the following script:

python main.py  --test -c configs/segmentation/ms_tcn/ms_tcn_gtea.yaml --weights=./output/MSTCN/MSTCN_split_1.pdparams
  • The metrics (Acc, Edit, and F1 scores) are computed following the evaluation script eval.py provided by the authors of MS-TCN.

  • The datasets are evaluated with the cross-validation protocol of the MS-TCN paper, using the same fold splits as the paper.
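
To make the three metrics concrete, here is a minimal single-video sketch of frame-wise accuracy, the segmental edit score, and the segmental F1 score at an IoU threshold. It follows the commonly used definitions of these metrics and is not the authors' eval.py nor the PaddleVideo implementation. All function names are illustrative, and non-empty label sequences are assumed.

```python
import numpy as np


def get_segments(labels):
    """Collapse frame-wise labels into (label, start, end) segments, end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def frame_accuracy(pred, gt):
    """Percentage of frames whose predicted label equals the ground truth."""
    return 100.0 * float(np.mean(np.asarray(pred) == np.asarray(gt)))


def edit_score(pred, gt):
    """100 * (1 - normalized Levenshtein distance) over segment label sequences."""
    p = [s[0] for s in get_segments(pred)]
    g = [s[0] for s in get_segments(gt)]
    d = np.zeros((len(p) + 1, len(g) + 1))
    d[:, 0] = np.arange(len(p) + 1)
    d[0, :] = np.arange(len(g) + 1)
    for i in range(1, len(p) + 1):
        for j in range(1, len(g) + 1):
            cost = 0 if p[i - 1] == g[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return 100.0 * (1.0 - d[-1, -1] / max(len(p), len(g)))


def f1_at_iou(pred, gt, threshold):
    """Segmental F1: a predicted segment is a true positive if its best
    same-label ground-truth segment has IoU >= threshold and is still unmatched."""
    p_segs, g_segs = get_segments(pred), get_segments(gt)
    matched = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_j = 0.0, -1
        for j, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= threshold and not matched[best_j]:
            tp += 1
            matched[best_j] = True
        else:
            fp += 1
    fn = len(g_segs) - sum(matched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: the prediction has the right segment order but shifted boundaries.
gt = ["a"] * 4 + ["b"] * 4 + ["c"] * 2
pred = ["a"] * 3 + ["b"] * 5 + ["c"] * 2
print("Acc   :", frame_accuracy(pred, gt))
print("Edit  :", edit_score(pred, gt))
print("F1@0.5:", f1_at_iou(pred, gt, 0.5))
print("F1@0.9:", f1_at_iou(pred, gt, 0.9))
```

In this toy case the edit score stays at its maximum because the order of segments is correct, while F1 drops as the IoU threshold rises because the boundaries are slightly off. Dataset-level scores are normally aggregated over all test videos rather than computed per video as in this sketch.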

Accuracy on the Breakfast dataset (4-fold cross-validation):

| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| paper | 66.3% | 61.7% | 48.1% | 48.1% | 37.9% |
| paddle | 65.2% | 61.5% | 53.7% | 49.2% | 38.8% |

Accuracy on the 50salads dataset (5-fold cross-validation):

| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| paper | 80.7% | 67.9% | 76.3% | 74.0% | 64.5% |
| paddle | 81.1% | 71.5% | 77.9% | 75.5% | 66.5% |

Accuracy on the gtea dataset (4-fold cross-validation):

| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| paper | 79.2% | 81.4% | 87.5% | 85.4% | 74.6% |
| paddle | 76.9% | 81.8% | 86.4% | 84.7% | 74.8% |

Model weights for gtea

| Test_Data | F1@0.5 | checkpoints |
| :---: | :---: | :---: |
| gtea_split1 | 70.2509 | MSTCN_gtea_split_1.pdparams |
| gtea_split2 | 70.7224 | MSTCN_gtea_split_2.pdparams |
| gtea_split3 | 80.0 | MSTCN_gtea_split_3.pdparams |
| gtea_split4 | 78.1609 | MSTCN_gtea_split_4.pdparams |
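
Under cross-validation, the reported dataset score is the mean over the per-split scores. As a small worked example, averaging the four gtea per-split values listed above reproduces the 74.8% figure in the paddle row of the gtea results table. The dictionary and variable names here are illustrative only.

```python
# Per-split scores copied from the gtea model weights table.
split_f1 = {
    "gtea_split1": 70.2509,
    "gtea_split2": 70.7224,
    "gtea_split3": 80.0,
    "gtea_split4": 78.1609,
}

# The cross-validated score is the plain mean over all splits.
mean_f1 = sum(split_f1.values()) / len(split_f1)
print(f"mean over {len(split_f1)} splits: {mean_f1:.1f}")  # mean over 4 splits: 74.8
```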

Inference

Export inference model

python3.7 tools/export_model.py -c configs/segmentation/ms_tcn/ms_tcn_gtea.yaml \
                                -p data/MSTCN_gtea_split_1.pdparams \
                                -o inference/MSTCN

The command above generates the model architecture file MSTCN.pdmodel and the parameters file MSTCN.pdiparams.

Infer

The input file is a list of the files to run inference on, for example:

S1_Cheese_C1.npy
S1_CofHoney_C1.npy
S1_Coffee_C1.npy
S1_Hotdog_C1.npy
...

Run inference with:

python3.7 tools/predict.py --input_file data/gtea/splits/test.split1.bundle \
                           --config configs/segmentation/ms_tcn/ms_tcn_gtea.yaml \
                           --model_file inference/MSTCN/MSTCN.pdmodel \
                           --params_file inference/MSTCN/MSTCN.pdiparams \
                           --use_gpu=True \
                           --use_tensorrt=False

Example log output:

result write in : ./inference/infer_results/S1_Cheese_C1.txt
result write in : ./inference/infer_results/S1_CofHoney_C1.txt
result write in : ./inference/infer_results/S1_Coffee_C1.txt
result write in : ./inference/infer_results/S1_Hotdog_C1.txt
result write in : ./inference/infer_results/S1_Pealate_C1.txt
result write in : ./inference/infer_results/S1_Peanut_C1.txt
result write in : ./inference/infer_results/S1_Tea_C1.txt
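
Judging from the log above, each entry in the input file list appears to produce a .txt result with the same base name under ./inference/infer_results/. The sketch below only encodes that observed naming pattern. The helper `expected_result_paths` is hypothetical and is not part of tools/predict.py, and the contents of the result files are not described in this document.

```python
from pathlib import Path


def expected_result_paths(file_list_text, result_dir="./inference/infer_results"):
    """Map each non-empty line of a file list to the result path seen in the log."""
    paths = []
    for line in file_list_text.splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines
        paths.append(f"{result_dir}/{Path(name).stem}.txt")
    return paths


# Two entries taken from the example file list.
file_list = "S1_Cheese_C1.npy\nS1_CofHoney_C1.npy\n"
result_paths = expected_result_paths(file_list)
for path in result_paths:
    print(path)
```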

Reference