Fine-tuning with TensorFlow

I ran fine-tuning, but the accuracy got worse after training, and I don't know why.
My network is very simple, just one layer, and the training set is MNIST. After training, the weights didn't change, but the bias changed significantly.

  1. This is the model:

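    # Imports assumed to come from tf.keras; the original post omitted them.
    import tensorflow as tf
    from tensorflow.keras.layers import Input, Flatten, Dense
    from tensorflow.keras.models import Model
    from tensorflow.keras.optimizers import Adam

    inputs = Input((IMG_SIZEX, IMG_SIZEY))
    x = Flatten()(inputs)
    output = Dense(10, activation='softmax')(x)
    model = Model(inputs, output)
    optimizer = Adam(lr=0.001)
    loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=False)
    model.compile(optimizer=optimizer, loss=loss_fn, metrics=['accuracy'])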

  2. The fine-tuning code is below:

    def training_helper(sim, generator):
        g = sim.session.graph
        sess = sim.session
        with g.as_default():
            x = sim.session.graph.get_tensor_by_name("input_2:0")
            y = g.get_tensor_by_name("dense_1/Softmax:0")
            # fc1_w = g.get_tensor_by_name("/ReadVariableOp:0")

            ce = g.get_tensor_by_name("loss/dense_1_loss/softmax_cross_entropy_with_logits/Reshape:0")
            # Using Adam optimizer
            train_step = tf.compat.v1.train.AdamOptimizer(1e-3, name="TempAdam").minimize(ce)
            graph_eval.initialize_uninitialized_vars(sess)
            # Input data for MNIST
            # 10 epochs of 600 iterations each, with a batch size of 100
            for j in range(10):
                for i in range(600):
                    inputdata = x_train[i*100:i*100+100, :, :]
                    outputdata = y_train[i*100:i*100+100]
                    outputdata = tf.keras.utils.to_categorical(outputdata, 10)

                    sess.run(train_step, feed_dict={x: inputdata, y: outputdata})

Thanks for the question @cvipdnn,

I think you will need to experiment with the learning rate. One rule of thumb: the starting learning rate for fine-tuning should be on the same order as the ending learning rate from training.

So if you started training at a learning rate of 0.001 and reduced it over time to, say, 0.00001, use 0.00001 as the starting guideline for fine-tuning. Apply some intuition on top of this to decide whether to raise or lower that rate.

With too high a learning rate during fine-tuning, we have seen models have difficulty converging.
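As a rough illustration of the rule of thumb above, here is a minimal sketch that computes the learning rate at the end of a training run and reuses it as the fine-tuning starting point. The exponential decay schedule, the helper names, and the step counts are all assumptions made for the example; they are not taken from the original training setup.

```python
def exponential_decay(initial_lr, decay_rate, decay_steps, step):
    """Learning rate after `step` steps of continuous exponential decay."""
    return initial_lr * decay_rate ** (step / decay_steps)


def fine_tune_start_lr(initial_lr, decay_rate, decay_steps, total_steps):
    """Use the learning rate reached at the end of training as the
    starting learning rate for fine-tuning."""
    return exponential_decay(initial_lr, decay_rate, decay_steps, total_steps)


# Hypothetical training run: start at 1e-3, divide by 10 every 3000 steps,
# and train for 6000 steps (for example, 10 epochs of 600 batches).
start_lr = fine_tune_start_lr(initial_lr=1e-3, decay_rate=0.1,
                              decay_steps=3000, total_steps=6000)
print(f"suggested fine-tuning learning rate: {start_lr:.0e}")
```

The resulting value would then replace the hard-coded `1e-3` passed to `tf.compat.v1.train.AdamOptimizer` in the fine-tuning code, and can be nudged up or down from there.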

Thanks. I referred to the sample code under nightlyTest/tensorflow and rewrote my code, and it worked well. Three suggestions:

  1. Point users to this sample code in the documentation.
  2. TensorFlow 1.x is not easy to use; TensorFlow 2.x support would be much more helpful.
  3. So far only bit widths down to 4 bits are supported. How about 2 bits?

Thanks.

Thanks @cvipdnn

For #1 - can you please fix the docs and create a pull request?

#2 - I know :) The plan is to move to TensorFlow 2.x over the next few months.

#3 - There is nothing fundamentally preventing the use of QuantSim with 2 bits. There may be some sanity checks that you will need to work around. However, we have not seen great results with bit widths below 4, which is the reason for the sanity checks.
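To give some intuition for why very low bit widths tend to hurt, here is a small sketch of plain uniform quantization on a made-up set of weights. This is not QuantSim's quantizer or any AIMET API; the function names and the weight values are hypothetical and only illustrate how the rounding error grows as the number of levels shrinks.

```python
def quantize_uniform(values, bitwidth):
    """Round each value to the nearest of 2**bitwidth evenly spaced
    levels spanning [min(values), max(values)].
    Assumes the values are not all identical."""
    lo, hi = min(values), max(values)
    steps = 2 ** bitwidth - 1            # number of intervals between lo and hi
    step = (hi - lo) / steps
    return [lo + round((v - lo) / step) * step for v in values]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical weights: 1001 evenly spaced values in [-1, 1].
weights = [i / 500 - 1 for i in range(1001)]

errors = {bw: mean_squared_error(weights, quantize_uniform(weights, bw))
          for bw in (2, 4, 8)}
for bw, err in errors.items():
    print(f"{bw}-bit quantization MSE: {err:.2e}")
```

With only four levels at 2 bits, each weight can land far from its nearest level, so the error is much larger than at 4 or 8 bits. Real models and quantization schemes will behave differently in detail.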