Semantic Segmentation Model Auto-Compression Example

1. Introduction
2. Benchmark
3. Auto-compression workflow
4. Inference deployment
5. FAQ

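1. Introduction

This example uses the semantic segmentation model PP-HumanSeg-Lite to show how to automatically compress an Inference deployment model exported from PaddleSeg. Two auto-compression strategies are used: unstructured sparsity with distillation, and quantization with distillation.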

2. Benchmark

Model	Strategy	Total IoU	ARM CPU latency (ms), thread=1	Nvidia GPU latency (ms)	Config file	Inference model
PP-HumanSeg-Lite	Baseline	92.87	56.363	-	-	model
PP-HumanSeg-Lite	Unstructured sparsity + distillation	92.35	37.712	-	config	-
PP-HumanSeg-Lite	Quantization + distillation	92.84	49.656	-	config	model (not the best)
PP-Liteseg	Baseline	77.04	-	1.425	-	model
PP-Liteseg	Quantization-aware training	76.93	-	1.158	config	model
HRNet	Baseline	78.97	-	8.188	-	model
HRNet	Quantization-aware training	78.90	-	5.812	config	model
UNet	Baseline	65.00	-	15.291	-	model
UNet	Quantization-aware training	64.93	-	10.228	config	model
Deeplabv3-ResNet50	Baseline	79.90	-	12.766	-	model
Deeplabv3-ResNet50	Quantization-aware training	79.26	-	8.839	config	model
BiSeNetV2	Baseline	73.17	-	35.61	-	model
BiSeNetV2	Quantization-aware training	73.20	-	15.94	config	model

ARM CPU test environment: Qualcomm Snapdragon 710 (SDM710: 2*A75 (2.2GHz) + 6*A55 (1.7GHz)).
Nvidia GPU test environment:
- Hardware: a single NVIDIA Tesla T4
- Software: CUDA 11.0, cuDNN 8.0, TensorRT 8.0
- Test configuration: batch_size: 40
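
To make the trade-off in the benchmark table easier to read, the latency and Total IoU figures can be converted into a speedup and an accuracy change relative to each model's baseline. The sketch below copies a few rows from the table; the function name `summarize` is only illustrative.

```python
# Figures copied from the benchmark table: (model, strategy, Total IoU, latency in ms).
ROWS = [
    ("PP-HumanSeg-Lite", "Baseline", 92.87, 56.363),
    ("PP-HumanSeg-Lite", "Unstructured sparsity + distillation", 92.35, 37.712),
    ("PP-HumanSeg-Lite", "Quantization + distillation", 92.84, 49.656),
    ("HRNet", "Baseline", 78.97, 8.188),
    ("HRNet", "Quantization-aware training", 78.90, 5.812),
]


def summarize(rows):
    """Return {(model, strategy): (speedup, iou_change)} relative to the baseline row."""
    baselines = {m: (iou, ms) for m, s, iou, ms in rows if s == "Baseline"}
    result = {}
    for model, strategy, iou, ms in rows:
        if strategy == "Baseline":
            continue
        base_iou, base_ms = baselines[model]
        # Speedup > 1 means the compressed model is faster than the baseline.
        result[(model, strategy)] = (round(base_ms / ms, 2), round(iou - base_iou, 2))
    return result


if __name__ == "__main__":
    for (model, strategy), (speedup, d_iou) in summarize(ROWS).items():
        print(f"{model:18s} {strategy:38s} {speedup:.2f}x  IoU {d_iou:+.2f}")
```

Latencies for different hardware columns (ARM CPU vs. Nvidia GPU) should not be compared with each other; the sketch only compares each model against its own baseline.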

The following sections use an open-source dataset to show how to auto-compress PP-HumanSeg-Lite.

3. Auto-compression workflow

3.1 Prepare the environment

PaddlePaddle >= 2.3 (can be downloaded and installed from the Paddle website)
PaddleSlim >= 2.3
PaddleSeg == 2.5.0

Install paddlepaddle:

# CPU
pip install paddlepaddle
# GPU
pip install paddlepaddle-gpu

Install paddleslim:

pip install paddleslim

Get the paddleslim example code:

git clone https://github.com/PaddlePaddle/PaddleSlim.git

Install paddleseg 2.5.0:

pip install paddleseg==2.5.0

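Note: PaddleSeg is installed only so that its Dataloader component can be used directly; model construction is not involved. This example requires PaddleSeg 2.5.0, because the data format returned by the Dataloader differs slightly between PaddleSeg versions.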
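
Because the PaddleSeg version is pinned, it can help to verify the environment before launching a long compression run. The following is a minimal sketch using only the standard library; the helper names (`parse`, `satisfies`, `check_environment`) are hypothetical, and the simple version parser ignores suffixes such as `rc1` or `.post112`.

```python
from importlib import metadata

# Requirements from section 3.1. The first element lists the pip distribution
# names that may provide the package (CPU and GPU builds of PaddlePaddle differ).
REQUIREMENTS = [
    (("paddlepaddle", "paddlepaddle-gpu"), ">=", "2.3"),
    (("paddleslim",), ">=", "2.3"),
    (("paddleseg",), "==", "2.5.0"),
]


def parse(version):
    """Turn '2.5.0' into (2, 5, 0), stopping at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two versions after padding them to the same length with zeros."""
    a, b = parse(installed), parse(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b if op == ">=" else a == b


def check_environment(requirements=REQUIREMENTS):
    """Return a list of problems; an empty list means every requirement is met."""
    problems = []
    for names, op, required in requirements:
        installed = None
        for name in names:
            try:
                installed = metadata.version(name)
                break
            except metadata.PackageNotFoundError:
                continue
        if installed is None:
            problems.append(f"{' or '.join(names)} is not installed (need {op} {required})")
        elif not satisfies(installed, op, required):
            problems.append(f"{names[0]} {installed} found, need {op} {required}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```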

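3.2 Prepare the dataset

You can download an open-source dataset (such as AISegment) or use a custom semantic segmentation dataset. Refer to the PaddleSeg data preparation documentation to check and align the data format.

This example uses the open-source AISegment sample dataset to show how to auto-compress PP-HumanSeg-Lite. The sample dataset is only meant for running the auto-compression workflow end to end quickly; it cannot reproduce the compression results in the benchmark table.

The portrait segmentation sample data can be downloaded with the following commands: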

cd PaddleSlim/example/auto_compression/semantic_segmentation
python ./data/download_data.py mini_humanseg
# After downloading, the data is located in ./data/humanseg/

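Tips:

Datasets used when compressing PP-HumanSeg-Lite
- Datasets: AISegment + PP-HumanSeg14K + an internal dataset. AISegment is an open-source dataset available from the link; PP-HumanSeg14K is a dataset built by PaddleSeg and available through official channels; the internal dataset is not public.
- Sample dataset: for quickly running the portrait segmentation compression and inference workflow; it cannot reproduce the compression results in the benchmark table. Download link
Datasets for PP-Liteseg, HRNet, UNet, and Deeplabv3-ResNet50
- cityscapes: download the full data from the cityscapes website
- Sample dataset: a subset of cityscapes for quickly running the compression and inference workflow; it cannot reproduce the compression results in the benchmark table. Download link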

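3.3 Prepare the inference model

An inference model consists of two files, model.pdmodel and model.pdiparams: the file with the pdmodel suffix is the model file, and the file with the pdiparams suffix is the weights file.

Note: other names such as __model__ and __params__ correspond to model.pdmodel and model.pdiparams, respectively.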
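
A quick way to confirm that a directory holds a complete inference model is to look for the two files described above. This is a minimal sketch with a hypothetical helper name, `find_inference_model`; it accepts either naming scheme and raises an error if the pair is incomplete.

```python
from pathlib import Path


def find_inference_model(model_dir):
    """Return (model_file, params_file) for an inference model directory.

    Accepts the *.pdmodel / *.pdiparams naming and falls back to the
    __model__ / __params__ names mentioned in the note above.
    """
    root = Path(model_dir)
    models = sorted(root.glob("*.pdmodel"))
    params = sorted(root.glob("*.pdiparams"))
    if len(models) == 1 and len(params) == 1:
        return models[0], params[0]
    legacy = (root / "__model__", root / "__params__")
    if all(path.is_file() for path in legacy):
        return legacy
    raise FileNotFoundError(f"no complete inference model found in {root}")


if __name__ == "__main__":
    import sys

    model_file, params_file = find_inference_model(sys.argv[1])
    print("model file: ", model_file)
    print("params file:", params_file)
```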

For a quick start, download the PP-HumanSeg-Lite inference model directly:

wget https://bj.bcebos.com/v1/paddlemodels/PaddleSlim/analysis/ppseg_lite_portrait_398x224_with_softmax.tar.gz
tar -xzf ppseg_lite_portrait_398x224_with_softmax.tar.gz

Alternatively, export the required inference model from PaddleSeg.

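3.4 Run auto-compression and export the model

The auto-compression example is launched through the run.py script, which compresses the model with the paddleslim.auto_compression.AutoCompression interface. First set the model path, dataset path, and the distillation, quantization, sparsity, and training parameters in the config file; the model can then be compressed with unstructured sparsity plus distillation, or with quantization plus distillation.

When only the training parameters are set and a deploy_hardware field is passed in the Global section of the config file, the compression strategy is searched automatically. With Snapdragon 710 (SD710) as the deployment hardware, auto-compression is run as follows: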

# Single-GPU launch
export CUDA_VISIBLE_DEVICES=0
python run.py --config_path='./configs/pp_humanseg/pp_humanseg_auto.yaml' --save_dir='./save_compressed_model'

# Multi-GPU launch
export CUDA_VISIBLE_DEVICES=0,1
python -m paddle.distributed.launch run.py --config_path='./configs/pp_humanseg/pp_humanseg_auto.yaml' --save_dir='./save_compressed_model'
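
The rule described before these commands (only training parameters plus deploy_hardware means the strategy is searched automatically) can be sketched as a small check over a parsed config. Only Global and deploy_hardware come from the text above; the section names Distillation, QuantAware, UnstructurePrune, and TrainConfig, and the placeholder values, are assumptions for illustration and should be checked against the YAML files under ./configs.

```python
# Assumed names of the strategy sections in a config file (not confirmed by this README).
STRATEGY_SECTIONS = ("Distillation", "QuantAware", "UnstructurePrune")


def compression_mode(config):
    """Classify a parsed config dict as manual, auto-search, or unspecified."""
    chosen = [name for name in STRATEGY_SECTIONS if name in config]
    if chosen:
        # The user configured the strategies explicitly.
        return "manual: " + " + ".join(chosen)
    hardware = config.get("Global", {}).get("deploy_hardware")
    if hardware:
        # Only training parameters and a target device were given.
        return f"auto-search for {hardware}"
    # The README does not describe this case, so the sketch does not guess.
    return "unspecified"


if __name__ == "__main__":
    auto_cfg = {
        "Global": {"deploy_hardware": "SD710"},
        "TrainConfig": {"epochs": 1},  # placeholder value
    }
    manual_cfg = {
        "Global": {},
        "Distillation": {},
        "UnstructurePrune": {},
        "TrainConfig": {"epochs": 1},  # placeholder value
    }
    print(compression_mode(auto_cfg))
    print(compression_mode(manual_cfg))
```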

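To configure the sparsity parameters yourself and run unstructured sparsity with distillation training, see the auto-compression hyperparameter documentation for the meaning of each parameter. The commands are: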

# Single-GPU launch
export CUDA_VISIBLE_DEVICES=0
python run.py --config_path='./configs/pp_humanseg/pp_humanseg_sparse.yaml' --save_dir='./save_sparse_model'

# Multi-GPU launch
export CUDA_VISIBLE_DEVICES=0,1
python -m paddle.distributed.launch run.py --config_path='./configs/pp_humanseg/pp_humanseg_sparse.yaml' --save_dir='./save_sparse_model'

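To configure the quantization parameters yourself and run quantization with distillation training, see the auto-compression hyperparameter documentation for the meaning of each parameter. The commands are: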

# Single-GPU launch
export CUDA_VISIBLE_DEVICES=0
python run.py --config_path='./configs/pp_humanseg/pp_humanseg_qat.yaml' --save_dir='./save_quant_model'

# Multi-GPU launch
export CUDA_VISIBLE_DEVICES=0,1
python -m paddle.distributed.launch run.py --config_path='./configs/pp_humanseg/pp_humanseg_qat.yaml' --save_dir='./save_quant_model'

After compression finishes, the compressed inference model is written to save_dir and can be deployed for inference directly.
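
The three pairs of commands above differ only in the config file, the output directory, and whether paddle.distributed.launch is used. When scripting several runs, that pattern can be captured in a small helper. This is a sketch only; `build_launch` is a hypothetical name and not part of PaddleSlim.

```python
import os
import subprocess


def build_launch(config_path, save_dir, gpus):
    """Return (env, argv) for launching run.py on the given GPU ids."""
    if not gpus:
        raise ValueError("at least one GPU id is required")
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpus)}
    argv = ["python"]
    if len(gpus) > 1:
        # Multi-GPU runs go through the distributed launcher, as in the commands above.
        argv += ["-m", "paddle.distributed.launch"]
    argv += ["run.py", f"--config_path={config_path}", f"--save_dir={save_dir}"]
    return env, argv


if __name__ == "__main__":
    env, argv = build_launch(
        "./configs/pp_humanseg/pp_humanseg_qat.yaml", "./save_quant_model", gpus=[0, 1]
    )
    print(env, " ".join(argv))
    # To actually start the run from the example directory:
    # subprocess.run(argv, env={**os.environ, **env}, check=True)
```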

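4. Inference deployment

4.1 Validate performance with Paddle Inference

Quantized models can be accelerated with TensorRT on GPU and with MKLDNN on CPU.

The following fields configure inference:

Parameter	Description
model_path	Directory containing the inference model; it must contain both a .pdmodel file and a .pdiparams file
model_filename	Name of the model file in the model_path directory
params_filename	Name of the parameters file in the model_path directory
dataset	Dataset type; options: `human`, `cityscape`.
dataset_config	Config file for the dataset
image_file	Path to a single image to test; if image_file is set, dataset_config is ignored.
device	Device used for inference; options: `CPU`, `GPU`.
use_trt	Whether to use the TensorRT inference engine; takes effect when device is `GPU`.
use_mkldnn	Whether to enable the `MKL-DNN` acceleration library; takes effect when device is `CPU`.
cpu_threads	Number of CPU threads used for CPU inference; default 10
precision	Inference precision; options: `fp32`, `fp16`, `int8`.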
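
The parameter table maps naturally onto a command-line parser. The sketch below shows one way to declare these options with argparse; it is not the actual parser in paddle_inference_eval.py. Apart from cpu_threads (default 10, from the table), every default here is an assumption, including the file names and the GPU default for device.

```python
import argparse


def str2bool(text):
    """Parse values such as --use_trt=True or --use_mkldnn=False."""
    value = text.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {text!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="Paddle Inference evaluation options (sketch)")
    parser.add_argument("--model_path", required=True, help="directory with the inference model")
    # Assumed default file names, matching the format described in section 3.3.
    parser.add_argument("--model_filename", default="model.pdmodel")
    parser.add_argument("--params_filename", default="model.pdiparams")
    parser.add_argument("--dataset", choices=["human", "cityscape"], default="human")
    parser.add_argument("--dataset_config", default=None, help="dataset config file")
    parser.add_argument("--image_file", default=None, help="single image; overrides dataset_config")
    parser.add_argument("--device", choices=["CPU", "GPU"], default="GPU")  # assumed default
    parser.add_argument("--use_trt", type=str2bool, default=False)
    parser.add_argument("--use_mkldnn", type=str2bool, default=False)
    parser.add_argument("--cpu_threads", type=int, default=10)  # default taken from the table
    parser.add_argument("--precision", choices=["fp32", "fp16", "int8"], default="fp32")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Using a small `str2bool` converter rather than `type=bool` matters here, because `bool("False")` is `True` in Python and would silently enable the option.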

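TensorRT inference:

Environment: to use the TensorRT inference engine, install a Paddle build compiled with WITH_TRT=ON. Download: Python inference library

Once the inference model is ready and the dataset path in dataset_config has been corrected, start the test: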

python paddle_inference_eval.py \
      --model_path=pp_liteseg_qat \
      --dataset='cityscape' \
      --dataset_config=configs/dataset/cityscapes_1024x512_scale1.0.yml \
      --use_trt=True \
      --precision=int8

MKLDNN inference:

python paddle_inference_eval.py \
      --model_path=pp_liteseg_qat \
      --dataset='cityscape' \
      --dataset_config=configs/dataset/cityscapes_1024x512_scale1.0.yml \
      --device=CPU \
      --use_mkldnn=True \
      --precision=int8 \
      --cpu_threads=10
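
For readers who want to see how the device, use_trt, use_mkldnn, cpu_threads, and precision options could translate into a Paddle Inference configuration, here is a hedged sketch. It is not the code of paddle_inference_eval.py: the helper names are hypothetical, the GPU memory pool size and TensorRT engine settings are illustrative values, and INT8-specific setup on CPU is left out.

```python
def pick_backend(device, use_trt=False, use_mkldnn=False):
    """Apply the rules from the parameter table: use_trt only takes effect
    on GPU, and use_mkldnn only takes effect on CPU."""
    if device == "GPU":
        return "tensorrt" if use_trt else "gpu"
    if device == "CPU":
        return "mkldnn" if use_mkldnn else "cpu"
    raise ValueError(f"unknown device: {device!r}")


def build_config(model_file, params_file, device="GPU", use_trt=False,
                 use_mkldnn=False, cpu_threads=10, precision="fp32"):
    """Build a paddle.inference.Config for the chosen backend."""
    # Imported lazily so pick_backend can be used without Paddle installed.
    from paddle.inference import Config, PrecisionType

    precision_map = {
        "fp32": PrecisionType.Float32,
        "fp16": PrecisionType.Half,
        "int8": PrecisionType.Int8,
    }
    config = Config(model_file, params_file)
    backend = pick_backend(device, use_trt, use_mkldnn)
    if backend in ("gpu", "tensorrt"):
        config.enable_use_gpu(200, 0)  # 200 MB initial pool on GPU 0 (illustrative)
        if backend == "tensorrt":
            config.enable_tensorrt_engine(
                workspace_size=1 << 30,
                max_batch_size=1,
                min_subgraph_size=3,
                precision_mode=precision_map[precision],
                use_static=False,
                use_calib_mode=False,
            )
    else:
        config.disable_gpu()
        config.set_cpu_math_library_num_threads(cpu_threads)
        if backend == "mkldnn":
            config.enable_mkldnn()
    return config
```

A predictor would then be created with `paddle.inference.create_predictor(build_config(...))`.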

4.2 Test a single image with Paddle Inference

Test a single image with portrait segmentation:

python paddle_inference_eval.py \
      --model_path=pp_humanseg_qat \
      --dataset='human' \
      --image_file=./data/human_demo.jpg \
      --use_trt=True \
      --precision=int8

Original image
FP32 inference result
INT8 inference result
