Domain: Optimization
Adam: A popular optimizer that combines momentum with per-parameter adaptive step sizes via exponentially decayed first and second moment estimates of the gradient.
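A minimal sketch of a single Adam update, assuming NumPy arrays; the function name and default values are illustrative, and `t` is the 1-indexed step count used for bias correction:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad            # first moment: EMA of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2       # second moment: EMA of squared gradients
    m_hat = m / (1 - beta1 ** t)                  # bias-correct the zero initialization (t >= 1)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)   # per-parameter adaptive step
    return w, m, v
```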
Direct preference optimization (DPO): A preference-based training method that optimizes a policy directly from pairwise comparisons, without an explicit RL loop.
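A hedged sketch of the DPO loss for one preference pair, assuming the inputs are sequence log-likelihoods under the trainable policy and a frozen reference model; names and the `beta` default are illustrative:

```python
import numpy as np

def dpo_loss(pi_logp_chosen, pi_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit reward margin: how far the policy has shifted toward the
    # chosen response, measured relative to the frozen reference.
    margin = beta * ((pi_logp_chosen - ref_logp_chosen)
                     - (pi_logp_rejected - ref_logp_rejected))
    return np.logaddexp(0.0, -margin)   # numerically stable -log(sigmoid(margin))
```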
Empirical risk minimization: Minimizing the average loss on the training data; can overfit when the data is limited or biased.
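The empirical risk is just the average of a pointwise loss over the finite training set; a minimal sketch, with all names illustrative:

```python
def empirical_risk(loss_fn, predict, dataset):
    # Average pointwise loss over (input, label) pairs in the training set.
    return sum(loss_fn(predict(x), y) for x, y in dataset) / len(dataset)
```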
Gradient descent: An iterative method that updates parameters in the direction of the negative gradient to minimize a loss.
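A minimal sketch of the update loop, with an illustrative one-dimensional example; the learning rate and step count are arbitrary:

```python
def gradient_descent(w, grad_fn, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad_fn(w)   # move against the gradient
    return w

# Example: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
gradient_descent(0.0, lambda w: 2 * (w - 3))   # converges to ~3.0
```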
Hyperparameters: Configuration choices that govern training or architecture and are not (typically) learned directly.
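An illustrative configuration; the specific names and values are arbitrary examples, not recommendations:

```python
# Set before training; not updated by the optimizer.
config = {
    "learning_rate": 3e-4,   # optimizer step size
    "batch_size": 64,        # examples per update
    "num_layers": 12,        # architecture choice
    "weight_decay": 0.01,    # regularization strength
}
```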
Log loss (cross-entropy): Penalizes confident wrong predictions heavily; the standard objective for classification and language modeling.
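A minimal sketch for a single example with a predicted probability vector, showing how a confident wrong prediction is penalized far more than a confident correct one:

```python
import numpy as np

def cross_entropy(probs, true_class):
    # Negative log-probability assigned to the correct class.
    return -np.log(probs[true_class])

cross_entropy(np.array([0.7, 0.2, 0.1]), 0)     # ~0.36: confident and correct
cross_entropy(np.array([0.01, 0.98, 0.01]), 0)  # ~4.61: confident and wrong
```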
Mean squared error (MSE): The average of the squared residuals; a common regression objective.
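A one-line sketch with a worked example:

```python
import numpy as np

def mse(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)   # average of squared residuals

mse(np.array([2.5, 0.0]), np.array([3.0, -0.5]))   # (0.25 + 0.25) / 2 = 0.25
```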
Momentum: Uses an exponential moving average of gradients to speed convergence and reduce oscillation.
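A minimal sketch of one momentum update, written in the EMA convention used in the definition above; note that another common convention accumulates `beta * velocity + grad` instead, which differs only by a rescaling of the learning rate:

```python
def momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    velocity = beta * velocity + (1 - beta) * grad   # EMA of recent gradients
    return w - lr * velocity, velocity               # step along the smoothed direction
```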
Objective function: A scalar measure optimized during training, typically the expected loss over the data, sometimes with regularization terms.
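An illustrative objective combining a data term with an L2 regularizer, assuming the data term has already been computed (e.g., by `empirical_risk` above); `lam` is an arbitrary regularization strength:

```python
import numpy as np

def objective(w, avg_data_loss, lam=1e-4):
    return avg_data_loss + lam * np.sum(w ** 2)   # data term plus L2 penalty on weights
```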
Reinforcement learning from human feedback (RLHF): Trains a reward model on human preference data, then optimizes the policy against that reward.
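A hedged sketch of the reward-model stage only: fit scalar rewards to pairwise preferences with a Bradley-Terry style loss. The subsequent policy-optimization stage (commonly PPO against the fitted reward, with a KL penalty to a reference policy) is omitted here, and all names are illustrative:

```python
import numpy as np

def reward_model_loss(r_chosen, r_rejected):
    # Push the chosen response's reward above the rejected one's:
    # numerically stable -log(sigmoid(r_chosen - r_rejected)).
    return np.logaddexp(0.0, -(r_chosen - r_rejected))
```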