Bert, Transformers

sarakaram · April 13, 2021, 10:02pm

Can AIMET quantize non-computer-vision models such as BERT and other Transformers, which are built from dense layers rather than conv layers?

akhobare · April 16, 2021, 2:37am

Are you using PyTorch models?
If so, we would need to add some support. For example, we recently added a custom quantized implementation for RNN/LSTM layers, which could serve as a template for supporting Transformer layers.

Any chance you are interested in contributing and adding this support to AIMET? We would love it.

sarakaram · April 16, 2021, 5:01pm

Hi @akhobare,

Thanks for your response. I am using TF. BERT and Transformers are not sequential models; they take the word embeddings simultaneously and process them mainly with FC layers (no conv layers or RNN blocks present). Can AIMET (TF) quantize the inputs and FC layers?

akhobare · April 20, 2021, 3:31pm

In general, AIMET TensorFlow should detect the FC layers and add quantization sim ops in the right places. But we have not tried with a BERT architecture.
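For readers unfamiliar with the term, here is a rough sketch of what a quantization sim op does around an FC layer: it quantizes a tensor to a fixed-bitwidth grid and immediately dequantizes it, so the float graph sees the rounding error. This is plain Python for illustration only and is not AIMET code; the helper names (`fake_quantize`, `dense`, `simulated_quantized_dense`) and the simple min/max range choice are assumptions made for this example.

```python
def fake_quantize(values, bitwidth=8):
    """Quantize-dequantize a list of floats on a uniform grid (illustrative only)."""
    lo, hi = min(values), max(values)
    levels = 2 ** bitwidth - 1
    # Step size of the grid; fall back to 1.0 when all values are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    out = []
    for v in values:
        q = round((v - lo) / scale)      # integer grid index
        q = max(0, min(levels, q))       # clamp to the representable range
        out.append(q * scale + lo)       # map back to float
    return out


def dense(x, weights, bias):
    """Plain FC layer: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def simulated_quantized_dense(x, weights, bias, bitwidth=8):
    """FC layer with fake-quantized weights and fake-quantized output."""
    n_out = len(bias)
    # Quantize all weights of the layer with one shared range (per-tensor).
    flat_q = fake_quantize([w for row in weights for w in row], bitwidth)
    weights_q = [flat_q[r * n_out:(r + 1) * n_out] for r in range(len(weights))]
    # Compute in float, then simulate quantization of the output activation.
    return fake_quantize(dense(x, weights_q, bias), bitwidth)
```

Comparing `simulated_quantized_dense(...)` against `dense(...)` on the same inputs gives a feel for how much error a given bitwidth introduces in a single FC layer.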

You can give this a shot; please report back your findings. You can visualize the graph held in the QuantizationSimModel's session (sim.session) using TensorBoard to see where the quantization sim ops were added.
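As a quick complement to TensorBoard, the ops in the graph can also be listed by type. The sketch below is a small helper that works on anything exposing `.name` and `.type` attributes, which is what TensorFlow's `graph.get_operations()` returns. The helper name `summarize_quant_ops` is made up for this example, and the op type string `"QcQuantize"` is an assumption about how AIMET TensorFlow labels its inserted ops; check the op types your own graph reports before relying on it.

```python
from collections import Counter

# Assumption: op type used for the inserted quantization sim ops.
QUANT_OP_TYPE = "QcQuantize"


def summarize_quant_ops(operations, quant_op_type=QUANT_OP_TYPE):
    """Count ops by type and collect the names of quantization sim ops.

    `operations` is any iterable of objects with `.name` and `.type`,
    e.g. the result of `session.graph.get_operations()` in TensorFlow.
    """
    ops = list(operations)
    type_counts = Counter(op.type for op in ops)
    quant_op_names = [op.name for op in ops if op.type == quant_op_type]
    return type_counts, quant_op_names


# Possible usage, assuming TensorFlow is imported as `tf` and `sim` is an
# already constructed QuantizationSimModel:
#
#   counts, names = summarize_quant_ops(sim.session.graph.get_operations())
#   print(counts.most_common(10))
#   print(names)
#
#   # Write the graph so it can be opened with `tensorboard --logdir ./quantsim_logs`
#   tf.compat.v1.summary.FileWriter("./quantsim_logs", sim.session.graph).close()
```

If the FC layers show up as `MatMul` ops but the list of quantization op names is empty or does not cover them, that is useful information to report back.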

@quic_ssiddego - tagging you, so you can guide further based on what @sarakaram finds.
