Results for "training loss"
Loss landscape: The shape of the loss function over parameter space.
Parameters (weights): The learned numeric values of a model, adjusted during training to minimize a loss function.
Objective function: A scalar measure optimized during training, typically expected loss over the data, sometimes with regularization terms.
Empirical risk minimization: Minimizing average loss on the training data; can overfit when the data is limited or biased.
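Putting the two previous entries together, a common regularized training objective has the standard form below (generic notation, not tied to any one source):

```latex
J(\theta) \;=\; \underbrace{\frac{1}{n}\sum_{i=1}^{n} \ell\big(f_\theta(x_i),\, y_i\big)}_{\text{empirical risk}} \;+\; \underbrace{\lambda\,\Omega(\theta)}_{\text{regularization}}
```

Minimizing the first term alone is empirical risk minimization; the weight \lambda trades off fit against model complexity.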
Gradient descent: An iterative method that updates parameters in the direction of the negative gradient to minimize loss.
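A minimal sketch of the update rule on a toy quadratic loss L(w) = (w - 3)^2; the learning rate and step count are arbitrary illustrative choices:

```python
def grad(w):
    return 2.0 * (w - 3.0)  # dL/dw for L(w) = (w - 3)^2

w = 0.0    # initial parameter value
lr = 0.1   # learning rate (step size)
for _ in range(50):
    w -= lr * grad(w)  # step in the direction of the negative gradient

print(round(w, 4))  # converges toward the minimizer w = 3
```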
Quantization: Reducing the numeric precision of weights/activations to speed up inference and reduce memory, with acceptable accuracy loss.
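A minimal sketch of one simple scheme, symmetric post-training int8 quantization of a single weight tensor; production schemes (per-channel scales, zero points, calibration data) are more involved:

```python
import numpy as np

w = np.random.randn(4, 4).astype(np.float32)

scale = np.abs(w).max() / 127.0          # map the largest magnitude onto the int8 range
w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

w_deq = w_q.astype(np.float32) * scale   # dequantize to compare against the original
print(np.abs(w - w_deq).max())           # small rounding error: the "accuracy loss"
```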
Hessian: The matrix of second derivatives describing the local curvature of the loss.
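In standard notation, for a loss L over parameters \theta, the Hessian and its role in the local (second-order Taylor) picture of the loss are:

```latex
H_{ij} = \frac{\partial^2 L(\theta)}{\partial \theta_i \,\partial \theta_j},
\qquad
L(\theta + \delta) \approx L(\theta) + \nabla L(\theta)^\top \delta + \tfrac{1}{2}\,\delta^\top H\,\delta .
```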
Global minimum: The lowest possible loss attainable over the parameter space.
Catastrophic forgetting: Loss of old knowledge when learning new tasks.
Value at risk (VaR): The maximum loss expected under normal conditions at a given confidence level.
Epoch: One complete pass through the training dataset.
Privacy attacks: Attacks that infer whether specific records were in the training data (membership inference) or reconstruct sensitive training examples.
Training pipeline: The end-to-end process for model training.
Training cost: The compute, time, and monetary resources consumed by model training.
Loss function: A function measuring prediction error (and sometimes calibration) that guides gradient-based optimization.
Cross-entropy loss: Penalizes confident wrong predictions heavily; the standard loss for classification and language modeling.
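A minimal sketch for a single three-class example, showing how the penalty grows with confident mistakes; the probabilities are made-up illustrative values:

```python
import math

def cross_entropy(probs, true_class):
    # Negative log-probability assigned to the correct class.
    return -math.log(probs[true_class])

print(cross_entropy([0.70, 0.20, 0.10], 0))  # confident and right:  ~0.36
print(cross_entropy([0.40, 0.30, 0.30], 0))  # uncertain:            ~0.92
print(cross_entropy([0.01, 0.98, 0.01], 0))  # confident and wrong:  ~4.61
```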
Semi-supervised learning: Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions such as smoothness or cluster structure.
Multi-task learning: Training one model on multiple tasks simultaneously to improve generalization through shared structure.
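In its simplest standard form, the multi-task objective is a weighted sum of per-task losses over shared parameters; the task weights w_t are a design choice:

```latex
L_{\text{multi}}(\theta) \;=\; \sum_{t=1}^{T} w_t\, L_t(\theta)
```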
Meta-learning: Methods that learn training procedures or initializations so models can adapt quickly to new tasks with little data.
Distribution shift: A mismatch between training and deployment data distributions that can degrade model performance.
Hyperparameters: Configuration choices that are not learned directly (or not typically learned) and that govern training or architecture.
Overfitting: When a model fits noise or idiosyncrasies of the training data and performs poorly on unseen data.
Underfitting: When a model cannot capture the underlying structure, performing poorly on both training and test data.
Generalization: How well a model performs on new data drawn from the same (or a similar) distribution as the training data.
Train/validation/test split: Separating data into training (fit), validation (tune), and test (final estimate) sets to avoid leakage and optimism bias.
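A minimal sketch of a shuffled 60/20/20 split by index; the ratios and random seed are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
idx = rng.permutation(n)       # shuffle once, up front, to avoid ordering bias

train_idx = idx[:600]          # fit model parameters
val_idx   = idx[600:800]       # tune hyperparameters / decide when to stop
test_idx  = idx[800:]          # touch once, for the final performance estimate
```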
Data leakage: When information from evaluation data improperly influences training, inflating reported performance.
Stochastic gradient descent (SGD): A gradient method that uses random minibatches for efficient training on large datasets.
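A minimal runnable sketch of minibatch SGD on least-squares linear regression; the data, learning rate, batch size, and epoch count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(5)
lr, batch_size = 0.05, 32
for epoch in range(20):
    order = rng.permutation(len(X))          # fresh shuffle each epoch
    for start in range(0, len(X), batch_size):
        b = order[start:start + batch_size]  # one random minibatch
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)  # gradient of mean squared error
        w -= lr * grad

print(np.round(w, 2))  # close to true_w
```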
Early stopping: Halting training when validation performance stops improving, to reduce overfitting.
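A minimal sketch of the patience-counter logic; the validation loss here is simulated (it improves, then plateaus) purely so the loop is runnable:

```python
import random

random.seed(0)

def validation_loss(epoch):
    # Stand-in for evaluating on held-out data: improves until it plateaus.
    return max(0.2, 1.0 - 0.05 * epoch) + 0.01 * random.random()

best_loss, patience, bad_epochs = float("inf"), 3, 0
for epoch in range(100):
    loss = validation_loss(epoch)
    if loss < best_loss - 1e-4:           # meaningful improvement
        best_loss, bad_epochs = loss, 0   # reset the patience counter
        # (in practice, also checkpoint the model here)
    else:
        bad_epochs += 1
        if bad_epochs >= patience:        # stalled for `patience` epochs in a row
            break

print(epoch, round(best_loss, 3))
```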
ReLU: The activation max(0, x); improves gradient flow and training speed in deep nets.
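The function and its derivative in a few lines; note the zero gradient on the negative side and unit gradient on the positive side:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)  # subgradient: 0 for x <= 0, 1 for x > 0

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```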
Normalization layers: Techniques that stabilize and speed up training by normalizing activations; LayerNorm is common in Transformers.
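A minimal layer-normalization sketch over the last axis, with the learnable scale (gamma) and shift (beta) shown at their usual initialization of ones and zeros:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    gamma = np.ones(x.shape[-1])    # learnable scale, initialized to 1
    beta = np.zeros(x.shape[-1])    # learnable shift, initialized to 0
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
print(layer_norm(x))  # each row now has mean ~0 and unit variance
```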