简体中文 | English
MS-TCN is a classic video action segmentation model, published at CVPR 2019. We optimized the official PyTorch implementation and obtained higher-accuracy results in PaddleVideo.
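For orientation, below is a minimal sketch of the multi-stage dilated-TCN idea from the paper: each stage is a stack of dilated residual 1D convolutions over frame features, and each later stage refines the previous stage's frame-wise predictions. This is illustrative only and is not the PaddleVideo implementation; the stage/layer counts, channel width, feature dimension, and class count are placeholder assumptions.

```python
# Illustrative sketch of the multi-stage TCN idea (not the PaddleVideo code).
import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class DilatedResidualLayer(nn.Layer):
    def __init__(self, dilation, channels):
        super().__init__()
        self.conv_dilated = nn.Conv1D(channels, channels, 3,
                                      padding=dilation, dilation=dilation)
        self.conv_1x1 = nn.Conv1D(channels, channels, 1)

    def forward(self, x):
        out = F.relu(self.conv_dilated(x))
        out = self.conv_1x1(out)
        return x + out                      # residual, keeps temporal length

class SingleStageTCN(nn.Layer):
    def __init__(self, num_layers, channels, in_dim, num_classes):
        super().__init__()
        self.conv_in = nn.Conv1D(in_dim, channels, 1)
        self.layers = nn.LayerList(
            [DilatedResidualLayer(2 ** i, channels) for i in range(num_layers)])
        self.conv_out = nn.Conv1D(channels, num_classes, 1)

    def forward(self, x):                   # x: [N, in_dim, T]
        out = self.conv_in(x)
        for layer in self.layers:
            out = layer(out)
        return self.conv_out(out)           # [N, num_classes, T]

class MultiStageTCN(nn.Layer):
    # placeholder hyper-parameters, for illustration only
    def __init__(self, num_stages=4, num_layers=10, channels=64,
                 in_dim=2048, num_classes=11):
        super().__init__()
        self.stage1 = SingleStageTCN(num_layers, channels, in_dim, num_classes)
        self.stages = nn.LayerList(
            [SingleStageTCN(num_layers, channels, num_classes, num_classes)
             for _ in range(num_stages - 1)])

    def forward(self, x):
        out = self.stage1(x)
        outputs = [out]
        for stage in self.stages:
            out = stage(F.softmax(out, axis=1))   # later stages refine predictions
            outputs.append(out)
        return outputs                      # per-stage frame-wise class scores
```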
MS-TCN can be trained on the 50salads, breakfast, or gtea dataset. For download and preparation, please refer to the Video Action Segmentation dataset documentation.
After preparing the dataset, we can run the training scripts.
# gtea dataset
export CUDA_VISIBLE_DEVICES=3
python3.7 main.py --validate -c configs/segmentation/ms_tcn/ms_tcn_gtea.yaml --seed 1538574472
- Start training with the command line above or a script. No pre-trained model is needed. A video action segmentation model is usually a fully convolutional network, and because videos differ in length, its `DATASET.batch_size` is usually set to 1, i.e. no batch training is performed. Currently only single-sample training is supported (see the sketch below).
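As a rough illustration of the single-sample setting (this is not the PaddleVideo data pipeline; the file name and the `[feature_dim, num_frames]` layout are assumptions), each iteration of a `batch_size=1` loader yields one whole variable-length video:

```python
# Illustrative only: batch_size=1 because every video has a different
# number of frames, so samples are not stacked into larger batches.
import numpy as np
from paddle.io import Dataset, DataLoader

class VideoFeatureDataset(Dataset):
    """Loads pre-extracted frame features, one video per sample."""
    def __init__(self, feature_files):
        self.feature_files = feature_files

    def __getitem__(self, idx):
        # assumed layout: [feature_dim, num_frames], varies per video
        return np.load(self.feature_files[idx]).astype("float32")

    def __len__(self):
        return len(self.feature_files)

loader = DataLoader(VideoFeatureDataset(["S1_Cheese_C1.npy"]), batch_size=1)
for batch in loader:
    feats = batch[0] if isinstance(batch, (list, tuple)) else batch
    print(feats.shape)   # [1, feature_dim, num_frames] for this single video
```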
Test MS-TCN on the dataset with the following script:
python main.py --test -c configs/segmentation/ms_tcn/ms_tcn_gtea.yaml --weights=./output/MSTCN/MSTCN_split_1.pdparams
- The metrics are implemented by computing Acc, Edit, and F1 scores, following the test script eval.py provided by the author of MS-TCN (a sketch of these metrics is given below).
- The datasets are evaluated with the cross-validation protocol from the MS-TCN paper, using the same split definitions as the paper.
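The following is a self-contained sketch of how these three metrics are commonly computed for action segmentation (frame-wise accuracy, segmental edit score, and segmental F1 at an IoU threshold). It follows the standard definitions rather than reproducing eval.py line by line; details such as background-class handling are omitted.

```python
# Sketch of frame accuracy, segmental edit score, and F1@k (illustrative).
import numpy as np

def get_segments(labels):
    """Collapse a frame-wise label sequence into (label, start, end) segments."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))  # end is exclusive
            start = i
    return segments

def frame_accuracy(pred, gt):
    pred, gt = np.asarray(pred), np.asarray(gt)
    return float((pred == gt).mean()) * 100

def edit_score(pred, gt):
    """Segmental edit score: Levenshtein distance over segment label sequences."""
    p = [s[0] for s in get_segments(pred)]
    g = [s[0] for s in get_segments(gt)]
    m, n = len(p), len(g)
    d = np.zeros((m + 1, n + 1))
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if p[i - 1] == g[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return (1.0 - d[m, n] / max(m, n, 1)) * 100

def f1_at_overlap(pred, gt, threshold=0.5):
    """Segments count as true positives when IoU with an unmatched GT segment >= threshold."""
    pred_segs, gt_segs = get_segments(pred), get_segments(gt)
    used, tp = [False] * len(gt_segs), 0
    for pl, ps, pe in pred_segs:
        best_iou, best_j = 0.0, -1
        for j, (gl, gs, ge) in enumerate(gt_segs):
            if gl != pl or used[j]:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            if inter / union > best_iou:
                best_iou, best_j = inter / union, j
        if best_iou >= threshold:
            tp += 1
            used[best_j] = True
    fp = len(pred_segs) - tp
    fn = len(gt_segs) - sum(used)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) * 100 if precision + recall else 0.0
```

For example, `f1_at_overlap(pred, gt, 0.1)`, `f1_at_overlap(pred, gt, 0.25)`, and `f1_at_overlap(pred, gt, 0.5)` correspond to the F1@0.1 / F1@0.25 / F1@0.5 columns reported below.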
Accuracy on the Breakfast dataset (4-fold cross-validation):
| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| --- | --- | --- | --- | --- | --- |
| paper | 66.3% | 61.7% | 48.1% | 48.1% | 37.9% |
| paddle | 65.2% | 61.5% | 53.7% | 49.2% | 38.8% |
Accuracy on the 50salads dataset (5-fold cross-validation):
| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| --- | --- | --- | --- | --- | --- |
| paper | 80.7% | 67.9% | 76.3% | 74.0% | 64.5% |
| paddle | 81.1% | 71.5% | 77.9% | 75.5% | 66.5% |
Accuracy on the gtea dataset (4-fold cross-validation):
| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| --- | --- | --- | --- | --- | --- |
| paper | 79.2% | 81.4% | 87.5% | 85.4% | 74.6% |
| paddle | 76.9% | 81.8% | 86.4% | 84.7% | 74.8% |
Model weights for gtea:
| Test_Data | F1@0.5 | checkpoints |
| --- | --- | --- |
| gtea_split1 | 70.2509 | MSTCN_gtea_split_1.pdparams |
| gtea_split2 | 70.7224 | MSTCN_gtea_split_2.pdparams |
| gtea_split3 | 80.0 | MSTCN_gtea_split_3.pdparams |
| gtea_split4 | 78.1609 | MSTCN_gtea_split_4.pdparams |
python3.7 tools/export_model.py -c configs/segmentation/ms_tcn/ms_tcn_gtea.yaml \
-p data/MSTCN_gtea_split_1.pdparams \
-o inference/MSTCN
The above command generates the model architecture file `MSTCN.pdmodel` and the parameters file `MSTCN.pdiparams`.
- For argument usage, please refer to Model Inference.
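As a hedged illustration (this is not tools/predict.py), the exported pair can also be loaded directly with the Paddle inference API. The feature file name, the `[feature_dim, num_frames]` input layout, and the `[1, num_classes, num_frames]` output layout are assumptions made for this sketch.

```python
# Minimal sketch of loading the exported model with paddle.inference.
# Paths and tensor layouts below are assumptions, not guarantees.
import numpy as np
from paddle.inference import Config, create_predictor

config = Config("inference/MSTCN/MSTCN.pdmodel", "inference/MSTCN/MSTCN.pdiparams")
config.enable_use_gpu(8000, 0)        # initial GPU memory pool (MB), GPU id; omit for CPU
predictor = create_predictor(config)

# assumed: pre-extracted features stored as [feature_dim, num_frames]
feats = np.load("S1_Cheese_C1.npy").astype("float32")[np.newaxis, ...]

input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(feats)
predictor.run()

output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
scores = output_handle.copy_to_cpu()
# assumed output layout [1, num_classes, num_frames] -> frame-wise labels
pred_labels = scores.argmax(axis=1)[0]
print(pred_labels.shape)
```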
The input file is a file list for inference, for example:
S1_Cheese_C1.npy
S1_CofHoney_C1.npy
S1_Coffee_C1.npy
S1_Hotdog_C1.npy
...
python3.7 tools/predict.py --input_file data/gtea/splits/test.split1.bundle \
--config configs/segmentation/ms_tcn/ms_tcn_gtea.yaml \
--model_file inference/MSTCN/MSTCN.pdmodel \
--params_file inference/MSTCN/MSTCN.pdiparams \
--use_gpu=True \
--use_tensorrt=False
Example of logs:
result write in : ./inference/infer_results/S1_Cheese_C1.txt
result write in : ./inference/infer_results/S1_CofHoney_C1.txt
result write in : ./inference/infer_results/S1_Coffee_C1.txt
result write in : ./inference/infer_results/S1_Hotdog_C1.txt
result write in : ./inference/infer_results/S1_Pealate_C1.txt
result write in : ./inference/infer_results/S1_Peanut_C1.txt
result write in : ./inference/infer_results/S1_Tea_C1.txt
- MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation, Y. Abu Farha and J. Gall.