Reliance on signals irrelevant to the task, such as spurious correlations in the training data.
Performance drop when a model trained in simulation is deployed in the real world.
Differences between the population a model was trained on and the patient population it serves after deployment; a form of distribution shift.
Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
A structured collection of examples used to train/evaluate models; quality, bias, and coverage often dominate outcomes.
Popular optimizer combining momentum and per-parameter adaptive step sizes via first/second moment estimates.
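The first/second moment update can be sketched in pure Python (a minimal scalar-parameter sketch; `adam_step`, its argument layout, and the defaults are illustrative, not a library API):

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update over a list of scalar parameters (illustrative sketch)."""
    new_theta, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(theta, grad, m, v):
        mi = beta1 * mi + (1 - beta1) * g       # first moment: momentum-style average
        vi = beta2 * vi + (1 - beta2) * g * g   # second moment: per-parameter scale
        m_hat = mi / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = vi / (1 - beta2 ** t)
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
        new_theta.append(p); new_m.append(mi); new_v.append(vi)
    return new_theta, new_m, new_v
```

With moment state initialized to zero, the bias correction makes the very first step roughly `lr` in magnitude regardless of gradient scale.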
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
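A minimal scaled dot-product sketch of the idea, with queries, keys, and values all taken directly from the input rows (no learned projections; pure Python, illustrative only):

```python
import math

def softmax(xs):
    mx = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - mx) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(X, d):
    """Each token attends over all tokens of the same sequence X (list of rows)."""
    out = []
    for q in X:  # query = a row of X itself
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)  # attention weights over the sequence
        out.append([sum(wi * v[j] for wi, v in zip(w, X)) for j in range(d)])
    return out
```

Each output row is a convex combination of the input rows, weighted by query-key similarity.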
Nonlinear functions enabling networks to approximate complex mappings; ReLU variants dominate modern DL.
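Two of the common ReLU-family functions, as a quick sketch (scalar versions; the `alpha` default is illustrative):

```python
def relu(x):
    """Zero out negatives; identity on positives."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but passes a small slope through negatives to keep gradients alive."""
    return x if x > 0 else alpha * x
```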
Converting text into discrete units (tokens) for modeling; subword tokenizers balance vocabulary size and coverage.
An RNN variant using gates to mitigate vanishing gradients and capture longer context.
The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.
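The coverage tradeoff can be illustrated with a toy whitespace encoder (a hypothetical sketch; real subword tokenizers split rarer strings into smaller pieces rather than mapping them to a single unknown id):

```python
def encode(text, vocab, unk="<unk>"):
    """Map whitespace tokens to ids; out-of-vocabulary strings fall back to <unk>."""
    return [vocab.get(tok, vocab[unk]) for tok in text.split()]
```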
A model that assigns probabilities to sequences of tokens; often trained by next-token prediction.
Generates sequences one token at a time, conditioning on past tokens.
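The decoding loop can be sketched as follows (greedy decoding; the bigram lookup table standing in for a model is purely hypothetical):

```python
def generate(next_token, prompt, max_new=5, eos="<eos>"):
    """Greedy autoregressive decoding: each new token conditions on the full prefix."""
    seq = list(prompt)
    for _ in range(max_new):
        tok = next_token(seq)  # the "model" scores the prefix and returns one token
        seq.append(tok)
        if tok == eos:
            break
    return seq

# Toy stand-in for a trained model: a bigram lookup table (hypothetical).
bigram = {"a": "b", "b": "<eos>"}
def toy_next_token(seq):
    return bigram.get(seq[-1], "<eos>")
```

Real systems replace the lookup with a neural network forward pass and often sample from the predicted distribution instead of taking the argmax.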
Model-generated content that is fluent but unsupported by evidence or incorrect; mitigated by grounding and verification.
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.
Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.
Human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.
Measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.
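One standard agreement statistic is Cohen's kappa, which corrects raw agreement for chance; a minimal two-labeler sketch:

```python
def cohens_kappa(a, b):
    """Agreement between two labelers' label lists, corrected for chance agreement."""
    n = len(a)
    labels = set(a) | set(b)
    p_obs = sum(x == y for x, y in zip(a, b)) / n            # observed agreement
    p_exp = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)  # chance agreement
    if p_exp == 1.0:
        return 1.0  # degenerate case: both labelers are constant and identical
    return (p_obs - p_exp) / (1 - p_exp)
```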
Standardized documentation describing intended use, performance, limitations, data, and ethical considerations.
Practices for operationalizing ML: versioning, CI/CD, monitoring, retraining, and reliable production management.
Converts logits to probabilities by exponentiation and normalization; common in classification and LMs.
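The exponentiate-and-normalize step in pure Python (the max-subtraction is the standard numerical-stability trick and does not change the result):

```python
import math

def softmax(logits):
    """Map real-valued logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]  # shift by max to avoid overflow
    total = sum(exps)
    return [e / total for e in exps]
```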
System for running consistent evaluations across tasks, versions, prompts, and model settings.
System design where humans validate or guide model outputs, especially for high-stakes decisions.
Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.
AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.
A concept class is PAC-learnable if a learner can, with high probability, output an approximately correct hypothesis from a finite sample.
Converting audio speech into text, often using encoder-decoder or transducer architectures.
Reduction in uncertainty achieved by observing a variable; used in decision trees and active learning.
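For a decision-tree split, information gain is the parent entropy minus the size-weighted entropy of the child partitions; a minimal sketch over label lists:

```python
import math

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_gain(labels, groups):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)
```

A split that perfectly separates two balanced classes yields a gain of 1 bit.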