Definition
A momentum-based optimization method that first applies the accumulated momentum to the parameters and then computes the gradient at that anticipated (look-ahead) position.
Detailed Explanation
Nesterov Accelerated Gradient (NAG) modifies traditional momentum optimization by evaluating the gradient not at the current parameters but at an approximation of their future position, obtained by applying the current velocity. This look-ahead lets the velocity be corrected before overshooting occurs and typically gives better convergence than standard gradient descent with momentum.
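As an illustration, a minimal NumPy sketch of one NAG step is shown below; the names (nag_update, grad_fn, lr, momentum) and the default hyperparameter values are assumptions chosen for the example, not a reference implementation.

```python
import numpy as np

def nag_update(params, velocity, grad_fn, lr=0.01, momentum=0.9):
    """One Nesterov Accelerated Gradient step (illustrative sketch).

    params   -- current parameter vector
    velocity -- current velocity (accumulated momentum)
    grad_fn  -- callable returning the gradient at a given point
    """
    # Look ahead: evaluate the gradient at the anticipated future position,
    # not at the current parameters.
    lookahead = params + momentum * velocity
    grad = grad_fn(lookahead)

    # Standard momentum update, but driven by the look-ahead gradient.
    velocity = momentum * velocity - lr * grad
    params = params + velocity
    return params, velocity

# Example: minimize f(x) = 0.5 * ||x||^2, whose gradient is simply x.
x = np.array([5.0, -3.0])
v = np.zeros_like(x)
for _ in range(200):
    x, v = nag_update(x, v, grad_fn=lambda p: p)
print(x)  # approaches the minimum at the origin
```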
Use Cases
Deep neural network training, convex optimization problems, large-scale machine learning