Answer by Chillston for How does stochastic gradient descent undo the...
I think this phrasing is a bit misleading. If I understand this passage correctly, another way to put it would be: Applying batch normalization distorts the true data distribution: An arbitrarily...
How does stochastic gradient descent undo the normalization done by the batch...
I want to understand the handshake between SGD (or mini-batch GD) and batch normalization. Below is an explanation quoted from this Medium article. However, I am confused about the denormalization by the...