Representation Learning
Intermediate
Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.
Representation learning is like teaching a computer to understand the essence of data without needing someone to explain every detail. Imagine trying to recognize different animals in pictures. Instead of manually pointing out features like fur color or size, a representation learning model can discover those features on its own, directly from the raw images.
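The analogy can be made concrete with the simplest representation learner, principal component analysis. The sketch below (pure Python, all data synthetic and illustrative) learns a single feature for 2-D points lying near the line y = 2x, and that one learned coordinate captures nearly all of the structure:

```python
import math, random

# Minimal sketch: PCA via power iteration as the simplest form of
# representation learning. Each 2-D point is compressed to one learned
# feature (its coordinate along the top principal direction).
random.seed(0)
data = []
for _ in range(200):
    x = random.gauss(0, 1)
    data.append((x, 2 * x + random.gauss(0, 0.05)))

mx = sum(p[0] for p in data) / len(data)
my = sum(p[1] for p in data) / len(data)
centered = [(p[0] - mx, p[1] - my) for p in data]

# 2x2 covariance matrix of the centered data
n = len(centered)
cxx = sum(a * a for a, b in centered) / n
cxy = sum(a * b for a, b in centered) / n
cyy = sum(b * b for a, b in centered) / n

# Power iteration for the dominant eigenvector (the learned direction)
v = (1.0, 0.0)
for _ in range(100):
    w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
    norm = math.hypot(*w)
    v = (w[0] / norm, w[1] / norm)

codes = [a * v[0] + b * v[1] for a, b in centered]   # 1-D representation
explained = sum(c * c for c in codes) / (n * (cxx + cyy))
print(f"variance captured by one learned feature: {explained:.3f}")
```

Deep models do the same thing nonlinearly and at far higher dimension, but the goal is identical: a compact code that preserves the salient structure.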
Benchmark: A dataset plus metric suite for comparing models; can be gamed or misaligned with real-world goals.
Evaluation harness: System for running consistent evaluations across tasks, versions, prompts, and model settings.
Red teaming: Stress-testing models for failures, vulnerabilities, policy violations, and harmful behaviors before release.
Confidential computing: Methods to protect the model and data during inference (e.g., trusted execution environments) from operators or attackers.
Multimodal models: Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.
Object detection: Identifying and localizing objects in images, often with confidence scores and bounding boxes.
Variance: Error due to sensitivity to fluctuations in the training dataset.
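This entry refers to the variance term of the bias-variance decomposition: how much an estimator fluctuates across different training samples. A toy sketch (synthetic data, illustrative numbers) shows that the sample mean fitted on small datasets varies far more across resamples than one fitted on large datasets:

```python
import random, statistics

# Minimal sketch: the sample mean of n draws from N(0, 1) has variance
# 1 / n, so estimators trained on larger samples fluctuate less.
random.seed(0)

def estimates(n, trials=2000):
    """Refit the estimator (a sample mean) on many independent datasets."""
    return [statistics.fmean(random.gauss(0, 1) for _ in range(n))
            for _ in range(trials)]

var_small = statistics.pvariance(estimates(5))    # roughly 1/5
var_large = statistics.pvariance(estimates(50))   # roughly 1/50
print(var_small, var_large)
```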
Speech recognition: Converting spoken audio into text, often using encoder-decoder or transducer architectures.
Cross-entropy: Measures the divergence between the true and predicted probability distributions.
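This entry describes cross-entropy, H(p, q) = -sum_i p_i log q_i: the expected code length (in nats) when events follow p but are encoded under model q. A minimal sketch with illustrative distributions:

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i); minimized when q matches p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]          # true distribution
q = [0.9, 0.1]          # model's predicted distribution
print(cross_entropy(p, p))  # the entropy of p itself: ln 2, about 0.693
print(cross_entropy(p, q))  # larger, because q mismatches p
```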
Text-to-speech: Generating speech audio from text, with control over prosody, speaker identity, and style.
KL divergence: Measures how one probability distribution diverges from another.
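This entry describes the Kullback-Leibler divergence, D_KL(p || q) = sum_i p_i log(p_i / q_i). It is zero exactly when p = q and is asymmetric; a minimal sketch with illustrative distributions:

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q): extra nats paid for modeling p with q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))   # about 0.511
print(kl_divergence(q, p))   # about 0.368 -- note the asymmetry
```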
Maximum likelihood estimation: Estimating parameters by maximizing the likelihood of the observed data.
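For Gaussian data the maximum-likelihood estimates have a closed form: the sample mean and the 1/n sample variance. The toy sketch below (illustrative data) checks that perturbing the MLE away from those values can only lower the log-likelihood:

```python
import math

# Closed-form Gaussian MLE on a tiny illustrative dataset.
data = [2.1, 1.9, 2.4, 2.0, 1.6]
n = len(data)
mu_hat = sum(data) / n
var_hat = sum((x - mu_hat) ** 2 for x in data) / n   # biased 1/n variance

def log_likelihood(mu, var):
    return sum(-0.5 * math.log(2 * math.pi * var)
               - (x - mu) ** 2 / (2 * var) for x in data)

# Any other mean gives a lower likelihood than the MLE:
print(log_likelihood(mu_hat, var_hat) >= log_likelihood(mu_hat + 0.5, var_hat))  # True
```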
Bayesian inference: Updating beliefs about parameters using observed evidence and prior distributions.
MAP estimation: Bayesian parameter estimation using the mode of the posterior distribution.
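The two entries above (posterior updating, and taking the posterior's mode) can be illustrated with the conjugate Beta-Bernoulli model; all numbers below are illustrative:

```python
# Beta(a, b) prior on a coin's bias; after h heads and t tails the
# posterior is Beta(a + h, b + t). Its mode -- the MAP estimate -- is
# (a + h - 1) / (a + h + b + t - 2) when both exponents exceed 1.
a, b = 2.0, 2.0          # prior: mild belief the coin is fair
h, t = 7, 3              # observed evidence: 7 heads, 3 tails

a_post, b_post = a + h, b + t
map_estimate = (a_post - 1) / (a_post + b_post - 2)
posterior_mean = a_post / (a_post + b_post)
print(map_estimate, posterior_mean)   # about 0.667 and 0.643
```

The MAP estimate and the posterior mean differ slightly; both are pulled toward the prior relative to the raw frequency 7/10.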
Convex optimization: Optimization problems in which any local minimum is also a global minimum.
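A consequence worth seeing: on a convex objective such as f(x) = (x - 3)^2, plain gradient descent reaches the same global minimum from any starting point, because there are no spurious local minima to get stuck in. A minimal sketch:

```python
def minimize(x, lr=0.1, steps=200):
    """Gradient descent on f(x) = (x - 3)^2, whose gradient is 2(x - 3)."""
    for _ in range(steps):
        x -= lr * 2 * (x - 3)
    return x

print(minimize(-10.0))   # about 3.0
print(minimize(+10.0))   # about 3.0 -- same minimum from the other side
```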
Saddle point: A point where the gradient is zero but that is neither a maximum nor a minimum; common in deep networks.
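The classic example is f(x, y) = x^2 - y^2: the gradient vanishes at the origin, yet the function rises along one axis and falls along the other. A minimal sketch:

```python
def f(x, y):
    return x ** 2 - y ** 2

def grad(x, y):
    return (2 * x, -2 * y)

print(grad(0.0, 0.0))          # (0.0, -0.0): a stationary point
print(f(0.1, 0.0) > f(0, 0))   # True: f climbs along the x-axis
print(f(0.0, 0.1) < f(0, 0))   # True: f descends along the y-axis
```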
Flat minimum: A wide basin in the loss landscape, often correlated with better generalization.
Gradient clipping: Limiting gradient magnitude to prevent exploding gradients.
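A common variant clips by the global L2 norm: if the gradient's norm exceeds a threshold, the whole vector is rescaled so its direction is preserved and only its magnitude shrinks. A minimal sketch:

```python
import math

def clip_by_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

g = [3.0, 4.0]                   # norm 5.0
clipped = clip_by_norm(g, 1.0)   # rescaled to norm 1.0, about [0.6, 0.8]
print(clipped)
```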
Hessian: The matrix of second derivatives describing the local curvature of the loss.
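The Hessian can be approximated numerically by finite differences; for a quadratic it is constant, which makes a good sanity check. A minimal sketch on f(x, y) = x^2 + 3xy + 2y^2, whose exact Hessian is [[2, 3], [3, 4]]:

```python
def f(v):
    x, y = v
    return x ** 2 + 3 * x * y + 2 * y ** 2

def hessian(f, v, h=1e-4):
    """Central-difference estimate of the matrix of second derivatives."""
    n = len(v)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            vpp = list(v); vpp[i] += h; vpp[j] += h
            vpm = list(v); vpm[i] += h; vpm[j] -= h
            vmp = list(v); vmp[i] -= h; vmp[j] += h
            vmm = list(v); vmm[i] -= h; vmm[j] -= h
            H[i][j] = (f(vpp) - f(vpm) - f(vmp) + f(vmm)) / (4 * h * h)
    return H

print(hessian(f, [1.0, 2.0]))   # approximately [[2, 3], [3, 4]]
```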
Residual connection: Allows gradients to bypass layers, enabling very deep networks.
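The mechanism is simply computing x + F(x) instead of F(x): even if the learned transform F contributes nothing, the block falls back to the identity, so signal (and gradient) still passes through. A toy scalar sketch with an illustrative ReLU-style layer:

```python
def layer(x, weight):
    """A toy 'layer': scale the input, then a ReLU-style cut-off."""
    return max(0.0, weight * x)

def residual_block(x, weight):
    # Output is input plus transform, so the identity path always survives.
    return x + layer(x, weight)

print(residual_block(1.5, 0.0))   # 1.5: pure identity when the layer is dead
print(residual_block(1.5, 2.0))   # 4.5: identity plus the learned transform
```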
Model capacity: The range of functions a model can represent.
Emergent abilities: Capabilities that appear only beyond certain model sizes.
Memory-augmented agents: Extending agents with long-term memory stores.
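A long-term memory store can be sketched as below. The MemoryStore class and its keyword-overlap retrieval are hypothetical simplifications; real systems typically rank stored entries by embedding similarity instead:

```python
class MemoryStore:
    """Hypothetical long-term memory: write observations, query later."""

    def __init__(self):
        self.entries = []

    def add(self, text):
        self.entries.append(text)

    def recall(self, query, k=1):
        """Return the k stored entries sharing the most words with query."""
        q = set(query.lower().split())
        scored = sorted(self.entries,
                        key=lambda e: -len(q & set(e.lower().split())))
        return scored[:k]

memory = MemoryStore()
memory.add("user prefers metric units")
memory.add("user's favorite language is Rust")
print(memory.recall("which units does the user prefer"))
```

The agent consults `recall` before acting, letting facts observed long ago inform decisions beyond the context window.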
Policy gradient methods: Optimizing policies directly via gradient ascent on expected reward.
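This is the idea behind the REINFORCE estimator: nudge the policy parameters by reward times the score function, R * d log pi(a) / d theta. A toy two-armed-bandit sketch (payoffs and hyperparameters are illustrative):

```python
import math, random

# Two-armed bandit: arm 1 pays 1.0, arm 0 pays 0.2. The policy is a
# sigmoid over a single logit theta; REINFORCE pushes theta toward
# the better-paying arm via gradient ascent on expected reward.
random.seed(0)
theta = 0.0
lr = 0.5
rewards = [0.2, 1.0]

def p1(theta):
    return 1.0 / (1.0 + math.exp(-theta))   # probability of picking arm 1

for _ in range(2000):
    a = 1 if random.random() < p1(theta) else 0
    r = rewards[a]
    grad_logp = a - p1(theta)               # d log pi(a) / d theta
    theta += lr * r * grad_logp             # score-function update

print(p1(theta))   # close to 1.0: the policy learned to pick the better arm
```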
Multi-agent systems: Multiple agents interacting cooperatively or competitively.
Tool use: Models trained to decide when to call external tools.
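The harness side of tool calling can be sketched as below. The JSON schema and the TOOLS registry here are hypothetical; real provider APIs differ in the details, but the pattern (parse the model's structured call, dispatch to a registered function, return the result) is the same:

```python
import json

# Hypothetical registry mapping tool names to plain Python callables.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}

def dispatch(model_output: str):
    """Parse a model-emitted JSON tool call and execute it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))      # 5
print(dispatch('{"name": "upper", "arguments": {"text": "tools"}}'))   # TOOLS
```

The result would then be fed back to the model so it can continue the conversation with the tool's answer in context.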
Emergent coordination: Coordination arising among agents without explicit programming.
Accountability: Ensuring decisions can be explained and traced.
Right to explanation: A legal or policy requirement to explain AI decisions.
Gradient inversion: Recovering training data from gradients.
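For a single example through a linear layer with a bias, the attack is exact: the weight gradient is the residual times the input, and the bias gradient is the residual itself, so whoever sees the gradients recovers the private input by division. A minimal sketch with illustrative numbers:

```python
# Loss 0.5 * (w.x + b - y)^2 for one private example (x, y).
# Then grad_w = residual * x and grad_b = residual, so x = grad_w / grad_b.
w, b = [0.3, -0.2], 0.1
x_private, y = [4.0, 7.0], 1.0

residual = sum(wi * xi for wi, xi in zip(w, x_private)) + b - y
grad_w = [residual * xi for xi in x_private]   # what a server would receive
grad_b = residual

x_recovered = [gw / grad_b for gw in grad_w]
print(x_recovered)   # approximately [4.0, 7.0]: the private example, reconstructed
```

Deep networks require iterative optimization rather than a one-line division, but this is why naively sharing gradients (e.g., in federated learning) is not automatically privacy-preserving.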