Definition
An extension of AdaGrad that adapts per-parameter learning rates using a decaying window of past gradient updates, preventing the effective learning rate from shrinking monotonically toward zero.
Detailed Explanation
AdaDelta adapts its step sizes over time by restricting the effective window of accumulated past gradients to a fixed size rather than summing over the entire history as AdaGrad does. It maintains exponentially decaying averages of both squared gradients and squared parameter updates, and scales each update by the ratio of their root-mean-square values, which removes the need to hand-tune a global learning rate. This keeps the step size from monotonically decreasing the way it does in AdaGrad.
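
As a rough illustration, the sketch below applies one AdaDelta step in NumPy following the update rule from Zeiler's 2012 paper. The function name adadelta_update, the state dictionary, and the defaults rho=0.95 and eps=1e-6 are illustrative choices, not part of this entry.

import numpy as np

def adadelta_update(x, grad, state, rho=0.95, eps=1e-6):
    """One AdaDelta step; `state` holds the two running averages."""
    # Decaying average of squared gradients: E[g^2] <- rho*E[g^2] + (1-rho)*g^2
    state["avg_sq_grad"] = rho * state["avg_sq_grad"] + (1 - rho) * grad**2
    # Scale the gradient by the ratio of RMS values; no global learning rate needed
    rms_update = np.sqrt(state["avg_sq_update"] + eps)
    rms_grad = np.sqrt(state["avg_sq_grad"] + eps)
    delta = -(rms_update / rms_grad) * grad
    # Decaying average of squared parameter updates
    state["avg_sq_update"] = rho * state["avg_sq_update"] + (1 - rho) * delta**2
    return x + delta, state

# Hypothetical usage: minimize f(x) = sum(x^2), whose gradient is 2x
x = np.array([1.0, -2.0])
state = {"avg_sq_grad": np.zeros_like(x), "avg_sq_update": np.zeros_like(x)}
for _ in range(200):
    x, state = adadelta_update(x, 2 * x, state)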
Use Cases
Deep learning model training, Online machine learning, Robust optimization tasks
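
For deep learning training, AdaDelta is usually used through a framework's built-in optimizer rather than implemented by hand; for example, PyTorch provides torch.optim.Adadelta. A minimal usage sketch follows, where the linear model and random batch are placeholders:

import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # placeholder model
optimizer = torch.optim.Adadelta(model.parameters(), rho=0.9, eps=1e-6)

x, y = torch.randn(32, 10), torch.randn(32, 1)  # placeholder batch
loss = nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()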
