Cannot run inference on the quantized model in int8 on my device #15

Open
deepfind opened this issue Sep 26, 2021 · 1 comment

@deepfind

I want to test the accuracy and the inference time of I-BERT in int8, so I installed transformers and tried to quantize the RoBERTa-base model to generate the weights. I set quant_mode to true and torch_dtype to int8. However, the int8 inference time of I-BERT is about the same as the RoBERTa-base model on my 1080 Ti. Is there a problem with my config.json or with my device? Here is my config.json:
{
  "_name_or_path": "./outputs/checkpoint-1150/",
  "architectures": [
    "IBertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "finetuning_task": "mrpc",
  "force_dequant": "none",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": 0,
    "1": 1
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "0": 0,
    "1": 1
  },
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "ibert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "quant_mode": true,
  "tokenizer_class": "RobertaTokenizer",
  "torch_dtype": "int8",
  "transformers_version": "4.11.0.dev0",
  "type_vocab_size": 1,
  "vocab_size": 50265
}
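
Roughly, the latency measurement looks like the sketch below (the sample input and loop counts are placeholders, and the checkpoint path just mirrors "_name_or_path" above; accuracy is evaluated separately on the MRPC dev set):

import time

import torch
from transformers import AutoTokenizer, IBertForSequenceClassification

path = "./outputs/checkpoint-1150/"  # same checkpoint as in config.json
tokenizer = AutoTokenizer.from_pretrained(path)
model = IBertForSequenceClassification.from_pretrained(path).eval().to("cuda")

# Placeholder input; real runs use MRPC sentence pairs.
inputs = tokenizer("a placeholder sentence", return_tensors="pt").to("cuda")

with torch.no_grad():
    for _ in range(10):  # warm-up so one-time CUDA overhead is excluded
        model(**inputs)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(100):
        model(**inputs)
    torch.cuda.synchronize()  # wait for all kernels before reading the clock
    print(f"avg latency: {(time.perf_counter() - start) / 100 * 1000:.2f} ms")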

@NASUAS

NASUAS commented May 8, 2024

Can you tell me how you tested the accuracy and the inference time of I-BERT in int8? Would you mind sharing your test code?
