Fine-tuning with TensorFlow

cvipdnn · April 2, 2021, 1:35pm

After running fine-tuning, the accuracy became worse, and I don't know why.
My network is very simple, just one Dense layer, and the training set is MNIST. After training, the weights didn't change, but the bias changed significantly.

1) This is the model.

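import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

IMG_SIZEX, IMG_SIZEY = 28, 28  # MNIST images are 28x28

inputs = Input((IMG_SIZEX, IMG_SIZEY))
x = Flatten()(inputs)
output = Dense(10, activation='softmax')(x)
model = Model(inputs, output)
optimizer = Adam(learning_rate=0.001)
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=False)
model.compile(optimizer=optimizer, loss=loss_fn, metrics=['accuracy'])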

2) The fine-tuning code is below.

def training_helper(sim, generator):
    g = sim.session.graph
    sess = sim.session
    with g.as_default():
        x = sim.session.graph.get_tensor_by_name("input_2:0")
        y = g.get_tensor_by_name("dense_1/Softmax:0")
        # fc1_w = g.get_tensor_by_name("/ReadVariableOp:0")

        ce = g.get_tensor_by_name("loss/dense_1_loss/softmax_cross_entropy_with_logits/Reshape:0")
        # Using Adam optimizer
        train_step = tf.compat.v1.train.AdamOptimizer(1e-3, name="TempAdam").minimize(ce)
        graph_eval.initialize_uninitialized_vars(sess)
        # Input data for MNIST
        # 10 passes over 600 batches of size 100
        for j in range(10):
            for i in range(600):
                inputdata = x_train[i*100:i*100+100, :, :]
                outputdata = y_train[i*100:i*100+100]
                outputdata = tf.keras.utils.to_categorical(outputdata, 10)

                sess.run(train_step, feed_dict={x: inputdata, y: outputdata})
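
To put numbers on the observation that the weights stayed the same while the bias moved, the parameters can be compared before and after fine-tuning. The following is a minimal sketch and not part of the code above: `max_abs_change` and the sample values are hypothetical, and for a plain Keras model the flattened values could come from `model.get_weights()`.

```python
def max_abs_change(before, after):
    """Largest absolute element-wise difference between two equal-length sequences."""
    before, after = list(before), list(after)
    if len(before) != len(after):
        raise ValueError("sequences must have the same length")
    return max((abs(a - b) for a, b in zip(before, after)), default=0.0)


# Hypothetical flattened parameter values captured before and after fine-tuning.
kernel_before = [0.10, -0.20, 0.30]
kernel_after = [0.10, -0.20, 0.30]
bias_before = [0.00, 0.05]
bias_after = [0.40, -0.35]

print("kernel change:", max_abs_change(kernel_before, kernel_after))
print("bias change:", max_abs_change(bias_before, bias_after))
```

A kernel change of exactly zero alongside a large bias change would suggest that the training step is not reaching the kernel at all, rather than that the learning rate is merely too small.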

akhobare · April 2, 2021, 3:01pm

Thanks for the question @cvipdnn,

I think you will need to experiment with the learning rate. One rule of thumb: the starting learning rate for fine-tuning should be on the same order as the ending learning rate from training.

So, if you started training at a learning rate of 0.001 and reduced it over time to, say, 0.00001, then use 0.00001 as a starting guideline for fine-tuning. Some intuition needs to be applied on top of this to decide whether to raise or lower the rate.
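
As a rough illustration of this rule of thumb, the sketch below computes the ending rate of a training schedule and reuses it as the fine-tuning starting point. The exponential-decay schedule and all of its parameters are assumptions made for illustration; the thread does not say which schedule was used.

```python
def exponential_decay(initial_lr, decay_rate, decay_steps, step):
    """Learning rate after `step` steps of a smooth exponential decay."""
    return initial_lr * decay_rate ** (step / decay_steps)


# Hypothetical training schedule: start at 1e-3 and shrink 10x every 10,000 steps.
initial_lr = 1e-3
decay_rate = 0.1
decay_steps = 10_000
total_steps = 20_000

ending_lr = exponential_decay(initial_lr, decay_rate, decay_steps, total_steps)

# Rule of thumb: begin fine-tuning near the rate at which training ended.
fine_tune_lr = ending_lr
print(f"ending training lr: {ending_lr:.1e}, fine-tuning start lr: {fine_tune_lr:.1e}")
```

From that starting point the rate can be nudged up or down depending on how the fine-tuning loss behaves.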

With too high a learning rate during fine-tuning, we have seen models have difficulty converging.
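
A toy example can make the convergence issue concrete. The sketch below is not the model from this thread and uses plain gradient descent rather than Adam: it minimises the hypothetical one-dimensional loss f(w) = 0.5 * c * w^2, where each update multiplies w by (1 - lr * c), so the iterates shrink only when lr < 2 / c.

```python
def run_gradient_descent(lr, curvature=10.0, w0=1.0, steps=50):
    """Minimise f(w) = 0.5 * curvature * w**2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        grad = curvature * w  # derivative of f at w
        w = w - lr * grad
    return w


small_lr_result = run_gradient_descent(lr=0.01)  # update factor 1 - 0.1 = 0.9
large_lr_result = run_gradient_descent(lr=0.25)  # update factor 1 - 2.5 = -1.5

print("lr=0.01 -> |w| =", abs(small_lr_result))
print("lr=0.25 -> |w| =", abs(large_lr_result))
```

With the smaller rate the parameter decays toward the minimum at zero, while the larger rate overshoots further on every step.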

cvipdnn · April 2, 2021, 4:42pm

Thanks. I referred to the sample code under nightlyTest/tensorflow and rewrote my code, and it worked well. Three suggestions:

1) In the documentation, point users to this sample code.
2) TensorFlow 1.x is not easy to use; TensorFlow 2.x support would be very helpful.
3) So far, only bit widths down to 4 bits are supported. How about 2 bits?

Thanks.

akhobare · April 16, 2021, 2:34am

Thanks @cvipdnn

For #1 - can you please fix the docs and create a pull request?

#2 - I know the plan is to move to TensorFlow 2.x over the next few months.

#3 - There is nothing fundamentally preventing the use of QuantSim with 2 bits. There may be some sanity checks that you will need to work around. However, we have not seen great results with bitwidths (bw) below 4, which is the reason the sanity checks were added.
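
To give a feel for why very low bitwidths are hard, the sketch below rounds values onto a uniform grid with 2^bw levels and reports the worst-case rounding error. It is a generic illustration under simple assumptions (min/max range, hypothetical weights) and is not how QuantSim computes its encodings; `fake_quantize` and `max_error` are names made up for this example.

```python
def fake_quantize(values, bitwidth):
    """Round values onto a uniform grid with 2**bitwidth levels spanning their range."""
    lo, hi = min(values), max(values)
    levels = 2 ** bitwidth
    if hi == lo:
        return list(values)
    scale = (hi - lo) / (levels - 1)  # spacing between neighbouring grid points
    return [lo + round((v - lo) / scale) * scale for v in values]


def max_error(values, bitwidth):
    """Largest absolute difference between the values and their quantized versions."""
    quantized = fake_quantize(values, bitwidth)
    return max(abs(v - q) for v, q in zip(values, quantized))


# Hypothetical weights: 101 evenly spaced values in [-1, 1].
weights = [-1.0 + 0.02 * i for i in range(101)]

for bw in (8, 4, 2):
    print(f"{bw} bits: {2 ** bw} levels, max error {max_error(weights, bw):.4f}")
```

At 2 bits only four levels are available, so the grid spacing covers a third of the whole value range and the rounding error grows accordingly.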
