Why is the fp8_e4m3 min_scaling_factor divided by 512? #3590

HPC4AI · 2025-01-20T10:03:21Z

Hello, I would like to quantize data from FP16 to FP8 E4M3. I followed the method at https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/experimental/gen_ai/src/quantize/quantize.cu#L629, but I have a question: why is min_scaling_factor computed by dividing by (FP8_E4M3_MAX::value * 512.f)? Could you please explain the basis for choosing 512.f? Thanks.
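To make the arithmetic concrete, here is a minimal Python sketch of the constant in question. It assumes FP8_E4M3_MAX is 448 (the largest finite E4M3 value) and that min_scaling_factor serves as a lower bound on a per-tensor scale of the form amax / FP8_E4M3_MAX. The function name `compute_scale` and that clamping usage are assumptions made for illustration, not taken from the linked file.

```python
# Illustrative sketch only: names mirror the question; the clamping usage is an assumption.
FP8_E4M3_MAX = 448.0  # largest finite FP8 E4M3 value (assumed "fn" variant)

# The constant being asked about: 1 / (FP8_E4M3_MAX * 512)
MIN_SCALING_FACTOR = 1.0 / (FP8_E4M3_MAX * 512.0)


def compute_scale(amax: float) -> float:
    """Hypothetical per-tensor scale: amax / FP8_E4M3_MAX, clamped from below."""
    return max(amax / FP8_E4M3_MAX, MIN_SCALING_FACTOR)


if __name__ == "__main__":
    print(MIN_SCALING_FACTOR)                 # value of the lower bound
    print(MIN_SCALING_FACTOR * FP8_E4M3_MAX)  # amax below which the clamp applies
    print(compute_scale(0.0))                 # an all-zero tensor gets the lower bound
    print(compute_scale(448.0))               # amax equal to FP8_E4M3_MAX gives scale 1
```

Under these assumptions the clamp only takes effect when amax falls below 1/512 = 2^-9, which is also the smallest positive subnormal value representable in E4M3. Whether that is the intended reason for 512.f is not stated in the source, which is what I am hoping to have confirmed.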
