Definition
An optimization algorithm that adapts the learning rate to each parameter, performing smaller updates for parameters tied to frequently occurring features and larger updates for those tied to infrequent ones.
Detailed Explanation
AdaGrad (Adaptive Gradient Algorithm) adapts the learning rate for each parameter individually, dividing the global learning rate by the square root of the accumulated sum of that parameter's squared gradients (plus a small constant for numerical stability). Parameters associated with sparse, infrequent features therefore receive larger effective updates, which is why the method performs well on sparse data. However, because the accumulator only grows, the effective learning rate shrinks monotonically and can become vanishingly small over time, effectively stopping learning.
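A minimal sketch of one AdaGrad update in NumPy; the function and variable names (adagrad_step, grad_sq_sum, eps) are illustrative, not part of any particular library:

```python
import numpy as np

def adagrad_step(params, grads, grad_sq_sum, lr=0.01, eps=1e-8):
    """One AdaGrad update: scale each parameter's step by the inverse
    square root of its accumulated squared gradients."""
    grad_sq_sum += grads ** 2  # accumulator only ever grows
    # Per-parameter effective learning rate: lr / (sqrt(sum of squared grads) + eps)
    params -= lr * grads / (np.sqrt(grad_sq_sum) + eps)
    return params, grad_sq_sum

# Toy usage: the first parameter sees large gradients every step, so its
# effective learning rate shrinks fastest; the rarely-updated ones keep
# larger effective steps.
params = np.zeros(3)
grad_sq_sum = np.zeros(3)
for grads in [np.array([1.0, 0.1, 0.0]), np.array([1.0, 0.0, 0.1])]:
    params, grad_sq_sum = adagrad_step(params, grads, grad_sq_sum)
```

Because grad_sq_sum never decreases, repeated large gradients drive the denominator up and the effective step size down, which is the "learning stops" behavior described above.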
Use Cases
Natural language processing tasks, Sparse data optimization, Online convex optimization