A repo for Pruner
- This repo is based on a fork of Tenset.
- Build and install this repo by following the guide.
- Version information can be found here.
- Register the device abstraction on Pruner as described in abstraction.

For a quick tutorial on using Pruner on an NVIDIA A100, refer to tutorial; the commands below are the main entry points, and a sketch of the underlying tuning flow follows the list.
- Search with Pruner (tuning 2,000 trials)
python3 tune_network.py --network resnet_50 --n-trials 2000 --cost-model pam --target "cuda --model=a100" --psa a100_40
- Search with the MoA-Pruner (tuning 2,000 trials)
python3 tune_network.py --network resnet_50 --n-trials 2000 --cost-model pam-siamese-update --load-model pam_k80_1500.pkl --target "cuda --model=a100" --psa a100_40
- Search with Ansor (tuning 2,000 trials)
# Using TensetMLP code
python3 tune_network.py --network resnet_50 --n-trials 2000 --cost-model mlp --target "cuda --model=a100"
- To pretrain or fine-tune a cost model, refer to tutorial.
- Search with the Pruner w/ finetuned model (tuning 2,000 trials)
python3 tune_network.py --network resnet_50 --n-trials 2000 --cost-model pam-no-update --load-model pam_finetune.pkl --target "cuda --model=a100" --psa a100_40
- Search with the TensetMLP (tuning 2,000 trials)
# Using TensetMLP code
python3 tune_network.py --network resnet_50 --n-trials 2000 --cost-model mlp-no-update --load-model mlp_finetune.pkl --target "cuda --model=a100"
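All of the search commands above drive the same underlying TVM auto-scheduler flow: tuning tasks are extracted from the network, a task scheduler allocates the measurement trials across tasks, and a search policy guided by the selected cost model proposes candidate schedules; `tune_network.py` wires the `--cost-model`, `--load-model`, and `--psa` flags into that flow. The snippet below is a minimal sketch of that generic flow using only the standard TVM auto-scheduler Python API; the network loader, log-file name, and runner settings are illustrative, and the Pruner-specific cost models and device abstraction are omitted.

```python
import tvm
from tvm import relay, auto_scheduler
from tvm.relay import testing

# Stand-in for the script's network loader: a ResNet-50 Relay module.
mod, params = testing.resnet.get_workload(num_layers=50, batch_size=1)
target = tvm.target.Target("cuda")  # device-specific flags are handled by tune_network.py

# Extract tuning tasks and their weights (how often each task appears in the network).
tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)

log_file = "resnet_50.json"  # illustrative log-file name
tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=2000,  # total trials shared across all tasks (matches --n-trials 2000)
    runner=auto_scheduler.LocalRunner(repeat=10, min_repeat_ms=300, timeout=10),
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
)
tuner.tune(tune_option)  # requires a CUDA-capable GPU for measurement

# Compile the network with the best schedules found during tuning.
with auto_scheduler.ApplyHistoryBest(log_file):
    with tvm.transform.PassContext(
        opt_level=3, config={"relay.backend.use_auto_scheduler": True}
    ):
        lib = relay.build(mod, target=target, params=params)
```

In the table below, the online methods update their cost model with the measurements collected during search, while the offline variants (the `*-no-update` cost models) load a pretrained model and keep it fixed.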
End-to-end results on ResNet-50 (A100, tuning 2,000 trials):

| Method | Ansor | Pruner | MoA-Pruner | TensetMLP | Pruner w/ finetuned model |
|---|---|---|---|---|---|
| Update mode | online | online | online | offline | offline |
| Search time (s) | 6,691 | 5,563 | 4,978 | 5,469 | 4,212 |
| Estimated total latency (ms) | 1.592 | 1.476 | 1.457 | 1.621 | 1.469 |
Note: Details are reported in './scripts/res/resnet_50'
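Beyond the summary numbers above, a finished run can be inspected directly from the auto-scheduler JSON log it writes. The sketch below assumes a standard auto-scheduler record file (the file name `resnet_50.json` is hypothetical) and recovers the best measured latency per task; note that the "Estimated total latency" reported by the task scheduler additionally weights each task by how many times it appears in the network, so the plain sum below will generally differ from it.

```python
import numpy as np
from tvm import auto_scheduler

log_file = "resnet_50.json"  # hypothetical name of the JSON log written during tuning

best = {}  # workload_key -> best measured cost in seconds
for inp, res in auto_scheduler.load_records(log_file):
    if res.error_no != 0:  # skip failed or timed-out measurements
        continue
    cost = float(np.mean([c.value for c in res.costs]))
    key = inp.task.workload_key
    best[key] = min(best.get(key, float("inf")), cost)

print("tasks with valid records:", len(best))
print("sum of best per-task latencies: %.3f ms" % (sum(best.values()) * 1e3))
```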
- The tuning curves of ResNet-50 with the different methods are shown as follows.
- For detailed tuning results, refer to E2E_Tuning_Comparison.
- The following figure shows the search time required for Pruner to reach the performance that the other approaches achieve after tuning 2,000 trials on the A100.