The tutorial mentions that 8-bit quantization can reduce the memory footprint by 4x. Could you please elaborate on how to visualize this compression? Comparing the models provided in the AIMET model zoo, for instance MobileNet, the original checkpoint takes about 80 MB and the quantized checkpoint also takes about 80 MB (not 80 MB / 4). Thank you in advance.
Looking at the size of the checkpoint is not the right way to estimate the memory footprint of the model on device.
The model parameters are tensors. In the original model these are FP32 tensors, with each element taking 32 bits of memory. In an INT8 quantized model, the same tensors are represented as 8-bit integers, so each element takes only 8 bits. On device, tensors are generally stored in packed form, meaning INT8 values are not padded up to a wider type. Hopefully this explains the 4x reduction in model footprint on device.
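A quick back-of-the-envelope sketch of this 4x reduction (the parameter count below is a hypothetical round number, not an exact MobileNet figure):

```python
import numpy as np

# Hypothetical parameter count for illustration only
num_params = 4_200_000

# Bytes per element: FP32 = 4 bytes, packed INT8 = 1 byte
fp32_bytes = num_params * np.dtype(np.float32).itemsize
int8_bytes = num_params * np.dtype(np.int8).itemsize

print(f"FP32 footprint: {fp32_bytes / 1e6:.1f} MB")   # 16.8 MB
print(f"INT8 footprint: {int8_bytes / 1e6:.1f} MB")   # 4.2 MB
print(f"Reduction: {fp32_bytes / int8_bytes:.0f}x")   # 4x
```

The 4x ratio falls directly out of the element sizes (32 bits vs. 8 bits); a checkpoint that still stores weights as FP32 will not show this reduction, which is why both files are ~80 MB.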
Which runtime and target are you using to run your model? The runtime software should provide metrics that estimate or measure the model's on-device memory footprint.