What are the optimization algorithms typically used to train a neural network?

Gradient descent, usually in its stochastic or mini-batch form, is the most commonly used training algorithm. Momentum is a common way to augment gradient descent: each step uses an exponentially decaying accumulation of past gradients rather than the current gradient alone, which lets the algorithm move more smoothly towards the minimum. RMSProp automatically adapts the learning rate for each parameter by dividing the gradient by a running average of its recent magnitude, which helps reduce oscillation on the way to a minimum. Adam (Adaptive Moment Estimation) combines RMSProp and momentum.
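To make the differences concrete, here is a minimal sketch of the four update rules applied to a single parameter. The toy loss f(w) = (w - 3)^2, the function names, and the hyperparameter values are all assumptions chosen for illustration; real frameworks apply the same ideas per parameter over mini-batches, and their implementations differ in details.

```python
import math

def grad(w):
    # Gradient of the assumed toy loss f(w) = (w - 3)^2, minimized at w = 3.
    return 2.0 * (w - 3.0)

def gradient_descent(w, lr=0.1, steps=200):
    # Plain gradient descent: step against the current gradient.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def momentum(w, lr=0.1, beta=0.9, steps=200):
    # Momentum: step along a decaying accumulation of past gradients.
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)
        w -= lr * v
    return w

def rmsprop(w, lr=0.01, rho=0.9, eps=1e-8, steps=2000):
    # RMSProp: divide the gradient by a running RMS of recent gradients.
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = rho * s + (1 - rho) * g * g
        w -= lr * g / (math.sqrt(s) + eps)
    return w

def adam(w, lr=0.01, b1=0.9, b2=0.999, eps=1e-8, steps=2000):
    # Adam: momentum-style first moment plus RMSProp-style second moment,
    # each corrected for the bias caused by starting the averages at zero.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

if __name__ == "__main__":
    for name, fn in [("gradient descent", gradient_descent),
                     ("momentum", momentum),
                     ("rmsprop", rmsprop),
                     ("adam", adam)]:
        print(name, fn(0.0))
```

Starting from w = 0, all four should end up close to 3. With a fixed learning rate, RMSProp tends to hover within roughly one step size of the minimum rather than settling exactly on it, which is one reason learning-rate decay is often used alongside it.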


