We want to inquire about the stochastic rounding method.
What exactly is AIMET's stochastic rounding method? Is it different from AdaRound?
Any sample code would also be helpful.
Hi @herokuvps - stochastic rounding is, as the name suggests, a rounding method that randomly chooses whether to round each tensor value up or down. It is meant to be used during QAT.
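To illustrate the idea (this is just a minimal sketch of unbiased stochastic rounding in plain PyTorch, not AIMET's internal implementation):

```python
import torch

def stochastic_round(x: torch.Tensor) -> torch.Tensor:
    """Round each element up or down at random, with the probability of
    rounding up equal to the fractional part, so the result is unbiased
    in expectation."""
    floor = torch.floor(x)
    frac = x - floor                                  # fractional part in [0, 1)
    return floor + (torch.rand_like(x) < frac).to(x.dtype)

# Example: values already scaled to the integer quantization grid
w = torch.randn(4, 4) * 10
w_q = stochastic_round(w)
```

Because the rounding direction is re-sampled every time, the same value can land on different grid points across forward passes, which is why this is only used during training rather than for a fixed post-training quantization.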
To be fair, we have not seen compelling results with this scheme, and I would suggest avoiding it.
AdaRound has some similarities to stochastic rounding, the big difference being that AdaRound is a standalone post-training technique that learns how to round each weight. So in a way, AdaRound tries to chase the optimal rounding that stochastic rounding would only hit by chance. We have seen excellent results with AdaRound.
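Here is a rough sketch of how AdaRound is typically applied with the AIMET PyTorch API. Treat the import paths, parameter defaults, and the placeholder model/data as assumptions; they may differ across AIMET releases, so please check the API docs for the version you have installed.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from aimet_common.defs import QuantScheme
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters

# Placeholder model and calibration data; substitute your trained FP32
# model and a loader over a small set of real, unlabeled samples.
from torchvision.models import resnet18
model = resnet18().eval()
dummy_input = torch.randn(1, 3, 224, 224)
calib_loader = DataLoader(TensorDataset(torch.randn(64, 3, 224, 224)),
                          batch_size=16)

# AdaRound learns the per-weight rounding decision from a few batches
# of calibration data.
params = AdaroundParameters(data_loader=calib_loader, num_batches=4)

# Returns a model with the learned rounding applied to its weights and
# writes the parameter encodings to `path` for later use with
# QuantizationSimModel.
ada_model = Adaround.apply_adaround(
    model, dummy_input, params,
    path='./adaround_artifacts',
    filename_prefix='resnet18',
    default_param_bw=8,
    default_quant_scheme=QuantScheme.post_training_tf_enhanced)
```

After this step you would normally create a QuantizationSimModel from `ada_model`, load the exported encodings, and evaluate or fine-tune as usual.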