- How do the AIMET compression techniques compare to the TFLite ones outlined here (https://www.tensorflow.org/lite/performance/model_optimization)? Can the two (AIMET and TFLite techniques) be used in conjunction? Are the gains additive, especially when the final model has to be deployed as TFLite?
- Can AIMET be installed in Google Colab?
Thanks for the question.
AIMET includes structured pruning compression techniques like Spatial SVD and Channel Pruning. These techniques compress the model to reduce inference latency on-target. Looking at the link you pointed to, TF Model Optimization seems to include unstructured pruning. So, in my opinion, the compression techniques in AIMET don’t overlap with the ones in the TF Model Optimization library. Furthermore, you should be able to take the model that results from AIMET compression and run it via the TFLite inference runtime. We have not tried this out yet, but conceptually it should work. If you try it, please let us know; we would appreciate it.
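To give a feel for why Spatial SVD shrinks a layer, here is a rough weight-count sketch. It assumes the usual formulation, in which a k_h x k_w convolution is replaced by a k_h x 1 convolution with `rank` output channels followed by a 1 x k_w convolution. The layer sizes and the rank below are made-up illustration values, biases are ignored, and the function names are hypothetical, not AIMET APIs.

```python
def conv_params(c_in, c_out, k_h, k_w):
    """Weight count of a standard 2D convolution (bias ignored)."""
    return c_in * c_out * k_h * k_w


def spatial_svd_params(c_in, c_out, k_h, k_w, rank):
    """Weight count after splitting the layer into a (k_h x 1) convolution
    with `rank` output channels followed by a (1 x k_w) convolution."""
    first = c_in * rank * k_h    # kernel shape: (c_in, rank, k_h, 1)
    second = rank * c_out * k_w  # kernel shape: (rank, c_out, 1, k_w)
    return first + second


# Illustrative numbers only: a 3x3 convolution with 64 input and 64 output channels.
original = conv_params(64, 64, 3, 3)
compressed = spatial_svd_params(64, 64, 3, 3, rank=32)
ratio = compressed / original

print(f"original={original} compressed={compressed} ratio={ratio:.3f}")
```

A smaller rank gives a smaller layer but a coarser approximation of the original kernel, which is why accuracy usually has to be recovered afterwards.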
Installing AIMET in Google Colab should be possible, though we have not tried it. You will need to install all the required libraries. Follow the instructions in the installation guide and let us know if you encounter any issues. Also, if and when you get this to work, we would love it if you could contribute the recipe back to AIMET.
Thanks for your response.
We will try that out and let you know.
Thanks for your response. We have got this to work in Colab. We will share the recipe after verification.
To verify the generated Python libraries, we are planning the following test: we have a Keras classification model, and we want to compress it and compare it before and after compression on the same 10 input images. Let us know if you can provide some pointers and guidelines for this verification.
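One way such a before-and-after check could be structured is sketched below. It is framework-agnostic: `predict_before` and `predict_after` are hypothetical callables that return class scores for one input (for example, thin wrappers around the original and the compressed model), and the toy lambdas at the bottom only stand in for real models.

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def compare_predictions(predict_before, predict_after, inputs, labels=None):
    """Run both models on the same inputs and summarize how far they diverge."""
    before = [list(predict_before(x)) for x in inputs]
    after = [list(predict_after(x)) for x in inputs]
    top1_before = [argmax(s) for s in before]
    top1_after = [argmax(s) for s in after]

    report = {
        # Fraction of inputs on which both models pick the same class.
        "top1_agreement": sum(b == a for b, a in zip(top1_before, top1_after)) / len(inputs),
        # Largest per-class score difference over all inputs.
        "max_abs_diff": max(
            abs(p - q) for b, a in zip(before, after) for p, q in zip(b, a)
        ),
    }
    if labels is not None:
        report["accuracy_before"] = sum(p == y for p, y in zip(top1_before, labels)) / len(labels)
        report["accuracy_after"] = sum(p == y for p, y in zip(top1_after, labels)) / len(labels)
    return report


# Toy stand-ins for the two models: each maps a scalar "image" to two class scores.
toy_before = lambda x: [x, 1 - x]
toy_after = lambda x: [x - 0.15, 1 - x + 0.15]

report = compare_predictions(toy_before, toy_after, [0.1, 0.4, 0.6, 0.9], labels=[1, 1, 0, 0])
print(report)
```

With only 10 images, top-1 agreement and the per-class score difference are more informative than accuracy alone, since a single flipped prediction moves accuracy by 10 points.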
Thanks @gmacintern1. Appreciate your help.
For Keras models, we have some pointers. Please see https://github.com/quic/aimet/blob/develop/Docs/api_docs/convert_tf_sess_to_keras.rst
You can generate the corresponding html by running ‘cmake; make doc’ once you have a build environment.
After that, I would start with the Spatial SVD compression scheme. You can find the APIs for that here
Let me know if you have more questions.
Thanks for the pointers. Using the instructions in the links above and the “spatial_svd_auto_model” example, we are able to evaluate our Keras model from within the spatial_svd_auto_model() test function. The framework generates options for various compression ratios and evaluates the generated session/graph. However, the results are not very encouraging: the accuracy of our model drops from ~90% to ~55%, even for target compression ratios between 0.5 and 0.95.
Can you provide some pointers to improve the accuracy further?
After applying AIMET compression, did you try fine-tuning the model? This would involve training the compressed model for a few more epochs. A general guideline is to resume from the learning rate you ended up with at the end of the original training. But you might need to search the hyperparameters to get optimal results.
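A minimal sketch of that search is shown below. It assumes you supply your own `fine_tune_and_eval(lr)` callable (a hypothetical name) that fine-tunes a fresh copy of the compressed model at the given learning rate for a few epochs and returns validation accuracy. The scaling factors are arbitrary starting points, and the stand-in function at the bottom only mimics a real training run.

```python
def candidate_learning_rates(final_lr, factors=(1.0, 0.5, 0.1)):
    """Learning rates to try, starting from the rate the original training ended on."""
    return [final_lr * f for f in factors]


def search_fine_tune_lr(fine_tune_and_eval, final_lr, factors=(1.0, 0.5, 0.1)):
    """Fine-tune once per candidate rate and return the best rate with all scores."""
    results = {}
    for lr in candidate_learning_rates(final_lr, factors):
        # Each call is expected to start from the same compressed checkpoint.
        results[lr] = fine_tune_and_eval(lr)
    best_lr = max(results, key=results.get)
    return best_lr, results


# Stand-in for a real fine-tuning run; it pretends accuracy peaks near lr = 5e-4.
def fake_fine_tune_and_eval(lr):
    return 1.0 - abs(lr - 5e-4) * 100


best_lr, results = search_fine_tune_lr(fake_fine_tune_and_eval, final_lr=1e-3)
print(best_lr, results)
```

Restarting every trial from the same compressed checkpoint keeps the comparison fair; otherwise later candidates would benefit from the epochs spent on earlier ones.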
Also, could you share what kind of model you are using?
Thanks. We did not fine-tune the model; we will try that next. Can you please point us to some code examples on how to invoke per-layer fine-tuning from within the framework?
It is a simple, tiny model: a customized Visual Wake Words model based on ResNet. Let us know if you need more details.