Replies: 3 comments 5 replies
-
In https://www.tensorflow.org/lite/performance/model_optimization there is a guide on how to use fp16- and int16-quantized tflite models. However, all of the examples keep the input/output as float32 or uint8 and quantize parameters INSIDE the model. I have never seen a 16-bit input/output model yet, but they could appear in the future.
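For reference, this is roughly what the fp16 post-training quantization flow in that guide looks like; the saved-model path and output filename below are placeholders. Note that the resulting model still exposes float32 input/output tensors, only the weights are stored as float16:

```python
import tensorflow as tf

# Post-training float16 quantization, following the TFLite
# model-optimization guide. "saved_model_dir" is a placeholder path.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]  # weights become fp16

tflite_fp16_model = converter.convert()

# The converted model keeps float32 input/output tensors;
# the quantization happens inside the model.
with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_fp16_model)
```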
-
I think that fp16 quantization is easier to convert to than 8-bit integer quantization, so it will be a necessary feature. I have an fp16-quantized model, but it failed to run.
Can I use an fp16-quantized model with the build option -DFLOAT16_SUPPORT now?
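One way to check what the pipeline actually has to handle is to inspect the I/O tensor dtypes of the converted model with the TFLite interpreter; a small sketch (the model filename is just an example):

```python
import tensorflow as tf

# Load the fp16-quantized model and inspect its I/O tensor types.
# "model_fp16.tflite" is an example filename.
interpreter = tf.lite.Interpreter(model_path="model_fp16.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_input_details():
    print("input :", detail["name"], detail["dtype"], detail["shape"])
for detail in interpreter.get_output_details():
    print("output:", detail["name"], detail["dtype"], detail["shape"])

# For models produced by the guide above, these dtypes are still float32;
# a true float16 input/output tensor would need fp16 support in the
# surrounding pipeline.
```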
-
Anyway, FP16 support is merged.
-
Q1: should we support it or not?
(IF Q1==yes)
Q2: should this be "binary16" (an arbitrary binary value whose semantics we do NOT care about), or should we treat it as floating-point numbers?
If Q2 == float, it means that, potentially, tensor_transform will do some arithmetic on it (see the sketch below).
If Q2 == binary, it means that tensor_transform won't be able to touch it.
(IF Q2==float)
Q3: Is there a general consensus on its format that we can follow?
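To illustrate the difference between the two options in Q2, here is a NumPy sketch (not tensor_transform code): if the element type is treated as float, arithmetic such as scaling is well defined; if it is treated as opaque binary16, the same buffer can only be passed through byte-for-byte.

```python
import numpy as np

# A small buffer of IEEE 754 binary16 values.
buf = np.array([0.5, 1.5, -2.0], dtype=np.float16).tobytes()

# Q2 == float: interpret the bytes as float16 and do arithmetic,
# which is what tensor_transform would be allowed to do.
as_float = np.frombuffer(buf, dtype=np.float16)
print(as_float * 2.0)   # [ 1.  3. -4.]

# Q2 == binary: treat the same bytes as an opaque 16-bit payload;
# no arithmetic is meaningful, the data can only be copied through.
as_opaque = np.frombuffer(buf, dtype=np.uint16)
print(as_opaque)        # raw bit patterns: [14336 15872 49152]
```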