AI Model Efficiency Toolkit (AIMET) Forum

Compress Keras (TF) model

Dear AIMET Team,

I have studied all of your resources and can’t find clear instructions on how to compress a model built with Keras on the TensorFlow backend. The docs posted here (Using AIMET with Keras Model — AI Model Efficiency Toolkit Documentation: ver tf-gpu_1.16.0) do not provide precise instructions. The description is vague, and it is unclear to me whether I have to retrain my existing model using AIMET methods and then compress it, or whether I can compress the trained model I already have.

To be clear:

I have set up the necessary environment on my Ubuntu 18.04.5 machine:
Python 3.6.9.
TF 1.15
AIMET 1.14

I have created and trained a TF model using Keras like so:

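# imports (assumed: the tf.keras API shipped with TF 1.15)
from tensorflow import keras
from tensorflow.keras import layers

# train_features, train_labels, val_features, val_labels and mae()
# are defined elsewhere and not shown in this post

# (i) define compiler func
def build_and_compile_model():
    model = keras.Sequential([
        layers.Dense(4, activation='relu'),
        layers.Dense(8, activation='relu'),
        layers.Dense(4, activation='relu'),
        layers.Dense(3, activation='relu'),
        layers.Dense(2, activation='relu'),
    ])
    # NOTE: the compile call did not survive in the post as captured; the loss
    # and optimizer here are assumptions (MAE is the metric used in step iv)
    model.compile(loss='mean_absolute_error', optimizer='adam')
    return model

# (ii) compile
model = build_and_compile_model()

# (iii) train
history = model.fit(
    train_features, train_labels,
    verbose=2, epochs=500)

# (iv) evaluate model on validation data
val_predictions = model.predict(val_features).flatten()
print(f'MAE: {mae(val_labels, val_predictions)}')

# (v) save
model.save('tf_keras_model')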

Now that I have the trained model, what exact steps do I have to take to:

  1. Compress this model
  2. Evaluate the compressed model on the same validation data that I used to evaluate the uncompressed model in step (iv) above

Hi Danil,

For compression and evaluation, you can refer to the convert_tf_session_to_keras_model() example. The following steps are needed:

  1. Extract the TF backend session from your Keras model.
  2. Compress the model. More instructions are found at AIMET TensorFlow Compression API — AI Model Efficiency Toolkit Documentation: ver tf-gpu_1.16.0
  3. Save the compressed session using the save_tf_session_single_gpu() API and load it back via load_tf_sess_variables_to_keras_single_gpu(). Both calls are needed: AIMET operates on TF sessions, so you can’t use the original Keras model for evaluation. Together they create a new subclassed Keras model based on the compressed session.
  4. Recompile the new subclassed Keras model, then evaluate it and compare the result to the pre-compression metrics.

The example also trains the model on both a single GPU and multiple GPUs. You can ignore the training steps if you only want to compress and evaluate your model.
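For the comparison in step 4, something like the following could work. It is only a minimal sketch in plain Python, independent of AIMET and TensorFlow: the names mean_absolute_error and compare_mae, the 5% tolerance, and the sample numbers are all hypothetical placeholders, not part of any AIMET API.

```python
def mean_absolute_error(labels, predictions):
    """Average absolute difference between labels and predictions."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute MAE of an empty sequence")
    total = sum(abs(y - p) for y, p in zip(labels, predictions))
    return total / len(labels)


def compare_mae(baseline_mae, compressed_mae, max_relative_increase=0.05):
    """Return (relative_increase, acceptable) for the compressed model.

    max_relative_increase is an assumed tolerance; choose a value that
    fits your own accuracy budget.
    """
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive for a relative comparison")
    relative_increase = (compressed_mae - baseline_mae) / baseline_mae
    return relative_increase, relative_increase <= max_relative_increase


if __name__ == "__main__":
    # Placeholder values standing in for val_labels and the predictions of
    # the original and the compressed model on the same validation data.
    val_labels = [1.0, 2.0, 3.0]
    original_predictions = [1.5, 2.0, 2.0]
    compressed_predictions = [1.6, 2.0, 2.0]

    baseline = mean_absolute_error(val_labels, original_predictions)
    compressed = mean_absolute_error(val_labels, compressed_predictions)
    increase, acceptable = compare_mae(baseline, compressed)
    print(f"baseline MAE: {baseline}, compressed MAE: {compressed}")
    print(f"relative increase: {increase:.2%}, acceptable: {acceptable}")
```

In practice you would pass in the flattened outputs of model.predict() from the original model and from the reloaded, recompiled model, both computed on the same validation features.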
