AdaGrad

[ˈædəˌɡræd]
Machine Learning
Last updated: December 9, 2024

Definition

An optimization algorithm that adapts the learning rate to each parameter individually, performing smaller updates for frequently occurring features and larger updates for infrequent ones.

Detailed Explanation

AdaGrad (Adaptive Gradient Algorithm) adapts the learning rate for each parameter by scaling it in inverse proportion to the square root of the sum of all historical squared gradients for that parameter. This per-parameter scaling performs well on sparse data, where rarely active features receive larger updates. However, because the accumulated sum only grows, the effective learning rate shrinks monotonically and can become vanishingly small over time, effectively stopping learning.
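In a standard formulation, with base learning rate η, per-parameter accumulator G, and a small ε for numerical stability (some presentations place ε inside the square root), the update is:

$$G_t = G_{t-1} + g_t \odot g_t, \qquad \theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{G_t} + \epsilon} \odot g_t$$

The following is a minimal sketch of this update in Python with NumPy. The function name `adagrad_update` and the hyperparameter values are illustrative assumptions rather than any particular library's API; the rule it implements is the one above.

```python
import numpy as np

def adagrad_update(params, grads, accum, lr=1.0, eps=1e-8):
    """One AdaGrad step: each parameter's step is scaled by the
    inverse square root of its accumulated squared gradients."""
    accum += grads ** 2                      # G_t = G_{t-1} + g_t * g_t
    # eps guards against division by zero before gradients accumulate.
    params -= lr * grads / (np.sqrt(accum) + eps)
    return params, accum

# Toy usage: minimize f(x) = x^2 starting from x = 5.
x = np.array([5.0])
accum = np.zeros_like(x)
for _ in range(500):
    grad = 2.0 * x                           # gradient of x^2
    x, accum = adagrad_update(x, grad, accum)
print(x)  # close to the optimum at 0
```

Note how each step's size shrinks as the accumulator grows: this self-normalization is why AdaGrad tolerates a relatively large base learning rate, and also why the effective rate decays toward zero on long runs.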

Use Cases

Natural language processing tasks, sparse data optimization, and online convex optimization.

Related Terms