Momentum

Intermediate

Uses an exponential moving average of gradients to speed convergence and reduce oscillation.


Why It Matters

Momentum is important in optimizing machine learning models because it accelerates convergence and improves stability during training. By reducing oscillations, it allows models to learn more efficiently, which is especially beneficial in complex problems like deep learning. This technique is widely adopted in various applications, enhancing the performance of AI systems across industries.

Momentum is an optimization technique that accelerates the convergence of gradient descent algorithms by incorporating an exponentially decaying average of past gradients. The update rule can be expressed as v(t) = βv(t-1) + (1 - β)∇L(θ(t)), where v(t) is the velocity, β is the momentum coefficient (typically between 0.5 and 0.9, with 0.9 a common default), and ∇L(θ(t)) is the gradient of the loss function at iteration t. The parameter update then follows θ(t+1) = θ(t) - ηv(t), where η is the learning rate. This method smooths out oscillations in the parameter updates, particularly in ravines of the loss landscape, leading to faster convergence. Momentum is most often applied on top of Stochastic Gradient Descent (SGD with momentum) and also serves as a building block in optimizers such as Adam.
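The two equations above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production optimizer; the function name, hyperparameter values, and toy objective (minimizing θ², whose gradient is 2θ) are all illustrative choices.

```python
def momentum_step(theta, v, grad, lr=0.01, beta=0.9):
    """One momentum update using the EMA form from the definition:

    v(t)     = beta * v(t-1) + (1 - beta) * grad
    theta(t+1) = theta(t) - lr * v(t)
    """
    v = beta * v + (1 - beta) * grad      # exponentially decaying gradient average
    theta = theta - lr * v                # step in the direction of the velocity
    return theta, v

# Toy example: minimize f(theta) = theta^2, gradient 2*theta, from theta = 5.0
theta, v = 5.0, 0.0
for _ in range(500):
    theta, v = momentum_step(theta, v, grad=2 * theta, lr=0.1, beta=0.9)
# theta converges toward the minimum at 0
```

Because the velocity averages successive gradients, components that flip sign from step to step (oscillations across a ravine) cancel, while components that point consistently in one direction accumulate, which is the source of the speed-up.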

