- OneDiff Enterprise Installation Guide
- OneDiffx Installation Guide
Before use, please run `python3 -m oneflow --doctor` and check the `enterprise` field in its output. If it shows `enterprise: True`, the environment meets the requirements. If it shows `enterprise: False`, run `pip uninstall oneflow onediff_quant -y` and then reinstall the OneDiff Enterprise version by following the Enterprise installation instructions. You can find the relevant installation instructions through the following link: OneDiff Enterprise Installation Instructions
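As a quick sanity check, the commands above can be combined into a short shell snippet; the `grep` filter is only a convenience and assumes the doctor output prints an `enterprise: ...` line as described above.

```bash
# Show only the enterprise flag from the OneFlow environment report.
python3 -m oneflow --doctor | grep -i enterprise

# If it reports "enterprise: False", remove the community packages first,
# then reinstall following the OneDiff Enterprise Installation Instructions:
#   pip uninstall oneflow onediff_quant -y
```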
Notes:
- Specify the directory for saving graphs using `export COMFYUI_ONEDIFF_SAVE_GRAPH_DIR="/path/to/save/graphs"` (see the example after these notes).
- When quantization is carried out for the first time, OneDiff analyzes the data dependencies and determines the parameters needed for quantization, such as the minimum and maximum values of the data, which requires additional computation time. Once these parameters have been computed, they are cached as `*.pt` files, so subsequent quantization runs can use them directly and finish faster. Information about the quantization results can be found in `cache_dir/quantization_stats.json`.
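A minimal sketch of how these two notes fit together; `/path/to/save/graphs` is a placeholder, and `cache_dir` stands for whichever quantization cache directory your workflow is configured to use.

```bash
# Tell OneDiff where to store compiled graphs, then launch ComfyUI.
export COMFYUI_ONEDIFF_SAVE_GRAPH_DIR="/path/to/save/graphs"
python main.py --gpu-only

# After the first quantized run, the calibration parameters are cached as *.pt files
# and a summary of the quantization results is written next to them.
ls cache_dir/*.pt
cat cache_dir/quantization_stats.json
```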
We compared the performance of the stable-diffusion-xl-base-1.0 model under the three conditions and list the results in the table below. The percentages in parentheses are the reduction in end-to-end time relative to the baseline, e.g. (5.63 - 3.38) / 5.63 ≈ 40.0%.
Accelerator | Baseline (non-optimized) | OneDiff (optimized) | OneDiff Quant (optimized) |
---|---|---|---|
NVIDIA GeForce RTX 3090 | 5.63 s | 3.38 s (~40.0%) | 2.60 s (~53.8%) |
The following table shows the workflow used for each case:
Note that you can download the images on this page and then drag or load them into ComfyUI to get the workflows embedded in them.
Baseline (non-optimized) | OneDiff (optimized) | OneDiff Quant (optimized) |
---|---|---|
*(workflow image)* | *(workflow image)* | *(workflow image)* |
For the model parameters, refer to the Parameter Description below.
```bash
cd ComfyUI
# Download the SDXL base checkpoint into ComfyUI's checkpoints folder.
wget -O models/checkpoints/sd_xl_base_1.0.safetensors https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
# Start ComfyUI with all models kept on the GPU.
python main.py --gpu-only
```
We compared the performance of the stable-diffusion-v1-5 model under the three conditions and list the results in the table below.
Accelerator | Baseline (non-optimized) | OneDiff (optimized) | OneDiff Quant (optimized) |
---|---|---|---|
NVIDIA GeForce RTX 3090 | 3.54 s | 2.13 s (~39.8%) | 1.85 s (~47.7%) |
The following table shows the workflow used for each case:
Note that you can download the images on this page and then drag or load them into ComfyUI to get the workflows embedded in them.
Baseline (non-optimized) | OneDiff (optimized) | OneDiff Quant (optimized) |
---|---|---|
*(workflow image)* | *(workflow image)* | *(workflow image)* |
For the model parameters, refer to the Parameter Description below.
```bash
cd ComfyUI
# Download the SD 1.5 checkpoint into ComfyUI's checkpoints folder.
# Use the "resolve" URL; the "blob" URL returns an HTML page instead of the weights.
wget -O models/checkpoints/v1-5-pruned-emaonly.ckpt https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt
# Start ComfyUI with all models kept on the GPU.
python main.py --gpu-only
```
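Because a Hugging Face `blob` URL returns an HTML page rather than the weights, it is worth sanity-checking the download; the exact size depends on the revision, but the ema-only checkpoint should be on the order of 4 GB rather than a few kilobytes.

```bash
# A tiny file here means the wrong URL (or a failed download) was used.
ls -lh models/checkpoints/v1-5-pruned-emaonly.ckpt
```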
We compared the performance of the stable-diffusion-xl-base-1.0 model under the three conditions and list the results in the table below.
Accelerator | Baseline (non-optimized) | OneDiff (optimized) | OneDiff Quant (optimized) |
---|---|---|---|
NVIDIA A800-SXM4-80GB | 35.54 s | 25.59 s (27.99%) | 22.30 s (37.25%) |
The following table shows the workflow used for each case:
Note that you can download the images on this page and then drag or load them into ComfyUI to get the workflows embedded in them.
Baseline (non-optimized) | OneDiff (optimized) | OneDiff Quant (optimized) |
---|---|---|
*(workflow image)* | *(workflow image)* | *(workflow image)* |
For the model parameters, refer to the Parameter Description below.
```bash
cd ComfyUI
# Download the SDXL base and SVD-XT 1.1 checkpoints into ComfyUI's checkpoints folder.
wget -O models/checkpoints/sd_xl_base_1.0.safetensors https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
wget -O models/checkpoints/svd_xt_1_1.safetensors https://huggingface.co/vdo/stable-video-diffusion-img2vid-xt-1-1/resolve/main/svd_xt_1_1.safetensors
# Start ComfyUI with all models kept on the GPU.
python main.py --gpu-only
```
Parameter Description:
Option | Range | Default | Description |
---|---|---|---|
quantized_conv_percentage | [0, 100] | 100 | Percentage of convolutional layers to quantize; the default of 100 quantizes all of them |
quantized_linear_percentage | [0, 100] | 100 | Percentage of linear layers to quantize; the default of 100 quantizes all of them |
conv_compute_density_threshold | [0, ∞) | 100 | Computational density threshold for quantizing convolutional modules |
linear_compute_density_threshold | [0, ∞) | 300 | Computational density threshold for quantizing linear modules |