
Post quantization does not utilize GPU #454

Open
ek9852 opened this issue Jul 8, 2020 · 3 comments
Labels
feature request

Comments

@ek9852

ek9852 commented Jul 8, 2020

System information

  • TensorFlow version (you are using): 2.2.0
  • Are you willing to contribute it (Yes/No): No

Motivation
During post-training quantization, the GPU is idle (confirmed via nvidia-smi), i.e. post-training quantization is not using the GPU to speed things up. It is very slow: it takes more than 60 minutes on a server-grade Xeon (for a test set of 2336 images on our model):

import os
import numpy as np
import tensorflow as tf
from PIL import Image

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
def representative_dataset_gen():
  with tf.io.gfile.GFile(test_set, 'r') as f:
    test_list = f.readlines()
  for i in test_list:
    # Each line lists an image path; the first whitespace-separated token
    # is the filename. Yield a normalized float32 batch of shape (1, 120, 160, 1).
    with Image.open(os.path.join(datasetdir, i.split()[0])) as img:
      yield [np.array(img).reshape(1, 120, 160, 1).astype(np.float32) / 255.0]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8  # or tf.int8
converter.inference_output_type = tf.uint8  # or tf.int8
tflite_quant_model = converter.convert()
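Calibration cost scales roughly linearly with the number of representative samples, so while GPU support is unavailable, one CPU-side mitigation is to calibrate on a subset of the dataset. A minimal sketch (the helper name limit_calibration_samples and the stand-in generator are illustrative, not from this thread):

```python
from itertools import islice

def limit_calibration_samples(gen_fn, max_samples):
    """Wrap a representative-dataset generator function so the converter
    only sees the first max_samples examples during calibration."""
    def limited():
        yield from islice(gen_fn(), max_samples)
    return limited

# Stand-in generator simulating the 2336 per-image yields above.
def fake_dataset_gen():
    for i in range(2336):
        yield [i]  # placeholder for the numpy image batch

limited_gen = limit_calibration_samples(fake_dataset_gen, 100)
assert sum(1 for _ in limited_gen()) == 100
```

In the snippet above, one would then set converter.representative_dataset = limit_calibration_samples(representative_dataset_gen, 100). Fewer calibration samples trade some quantization accuracy for speed, so the subset should still cover the input distribution.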

Describe the feature
Post-training quantization should utilize the GPU to speed things up.

@ek9852 ek9852 added the feature request label Jul 8, 2020
@miaout17

miaout17 commented Jul 9, 2020

A few initial questions:

  • For "test set of 2336 on our model", does it mean 2336 images are used as representative dataset?
  • Do you know how much time it takes to invoke the model?

I don't think the post-training quantization tool supports GPU, but I'm not the expert.
I'll let @suharshs follow from here.

@ek9852
Author

ek9852 commented Jul 9, 2020

For "test set of 2336 on our model", does it mean 2336 images are used as representative dataset?
Yes

Do you know how much time does it take to invoke the model?
0.3 sec on a Google Coral Edge TPU. It should be much faster on my NVIDIA TITAN X GPU,
but post-training quantization does not currently use the GPU.

@suharshs
Contributor

suharshs commented Jul 9, 2020

TensorFlow Lite doesn't currently support non-mobile GPU kernels, and the post-training quantization tool is specific to TensorFlow Lite at the moment. As we work to unify TensorFlow and TensorFlow Lite we will keep this in mind. I will keep this issue open to give you updates as they come.

Thanks!

3 participants