Results for "risk-based regulation"
Differential privacy: a formal framework guaranteeing that a computation's outputs reveal little about any single individual's data contribution.
Rademacher complexity: measures a hypothesis class's ability to fit random noise; used to bound generalization error.
Normalizing flows: exact-likelihood generative models built from invertible transformations.
Actor-critic methods: combine value estimation (critic) with policy learning (actor).
Score-based generative models: learn the score ∇ log p(x) and use it for generative sampling.
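The privacy framework described above is differential privacy; its classic instantiation is the Laplace mechanism, which adds noise calibrated to a query's sensitivity. A minimal sketch (function name and parameter values are illustrative, not from any particular library):

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value plus Laplace noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5  # uniform in [-0.5, 0.5)
    # Inverse-CDF sample from Laplace(0, scale); a production version
    # would guard against u == -0.5 exactly
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise
```

Smaller epsilon means more noise and a stronger privacy guarantee; sensitivity is the most any one individual's record can change the true answer.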
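For intuition, the score has a closed form for a Gaussian, which is what score networks approximate for arbitrary data distributions. A one-line sketch (function name is illustrative):

```python
def gaussian_score(x, mu=0.0, sigma=1.0):
    """Score of a Gaussian: d/dx log N(x; mu, sigma^2) = -(x - mu) / sigma**2."""
    return -(x - mu) / sigma ** 2
```

The score always points toward regions of higher density (here, toward the mean), which is why following it, plus noise, yields samples.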
Reactive agent: a simple agent that maps inputs directly to actions, without internal state or planning.
Dynamic resource allocation.
Closed-loop control: a continuous loop adjusting actions based on state feedback.
Algorithm computing control actions.
In-context learning: achieving task performance by providing a small number of examples inside the prompt, without weight updates.
Transformer: an architecture based on self-attention and feedforward layers; the foundation of modern LLMs and many multimodal models.
Semantic retrieval: retrieval based on embedding similarity rather than keyword overlap, capturing paraphrases and related concepts.
Reinforcement learning from human feedback (RLHF): uses preference data to train a reward model, then optimizes the policy against it.
Direct preference optimization (DPO): a preference-based training method that optimizes the policy directly from pairwise comparisons, without an explicit RL loop.
SHAP: a feature-attribution method grounded in cooperative game theory, used to explain predictions in tabular settings.
Nucleus (top-p) sampling: samples from the smallest set of tokens whose probabilities sum to at least p, adapting the set size to the context.
Pruning: removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
Object detection: identifying and localizing objects in images, typically with confidence scores and bounding boxes.
Text-to-speech (TTS): generating speech audio from text, with control over prosody, speaker identity, and style.
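A toy sketch of embedding-similarity retrieval over a tiny in-memory corpus; in practice the embeddings come from a trained encoder, but the ranking step is just nearest-neighbor search under cosine similarity (function names, vectors, and doc ids here are all hypothetical):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query_vec, corpus):
    """Return the doc id whose embedding is most similar to the query."""
    return max(corpus, key=lambda item: cosine(query_vec, item[1]))[0]
```

Because similarity is measured in embedding space, a paraphrased query can land near a document that shares no keywords with it.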
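The pairwise objective this describes is commonly written as -log σ(β·Δ), where Δ compares how much more the policy prefers the chosen response over the rejected one, relative to a frozen reference model. A per-pair sketch (function name and β value are illustrative):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin))."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy and reference agree exactly, the margin is zero and the loss is log 2; the loss falls as the policy learns to rank the chosen response higher than the reference does.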
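The game-theoretic quantity underneath is the Shapley value: a feature's average marginal contribution over all orderings in which features can be added. A brute-force sketch, feasible only for a handful of features (names are illustrative; real SHAP libraries use much faster approximations):

```python
import itertools

def shapley_values(value_fn, n_features):
    """Exact Shapley values by averaging marginal contributions
    over all feature orderings (tractable only for small n)."""
    contrib = [0.0] * n_features
    perms = list(itertools.permutations(range(n_features)))
    for perm in perms:
        present = set()
        for f in perm:
            before = value_fn(frozenset(present))
            present.add(f)
            contrib[f] += value_fn(frozenset(present)) - before
    return [c / len(perms) for c in contrib]
```

For an additive value function the Shapley values recover each feature's payoff exactly, which is a useful sanity check.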
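Selecting that smallest set (the "nucleus") can be sketched as a sort-and-accumulate pass over the probability vector (function name and values are illustrative):

```python
def top_p_filter(probs, p=0.9):
    """Return ids of the smallest set of tokens whose probabilities sum to >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= p:
            break
    return kept
```

Sampling then renormalizes over the kept ids: in a peaked distribution the nucleus is tiny, while in a flat one it grows, which is what makes the method adaptive.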
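A sketch of the unstructured variant, magnitude pruning, on a flat weight list (names and thresholding details are illustrative; ties at the threshold may zero slightly more than the requested fraction):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # k-th smallest absolute value becomes the pruning threshold
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

Structured pruning instead removes whole neurons, heads, or channels, which trades some accuracy for speedups on ordinary hardware.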
Agent loop: a continuous cycle of observation, reasoning, action, and feedback.
Mixture-of-experts routing: a gating network chooses which experts process each token.
Planner-executor pattern: separates planning from execution in agent architectures.
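A sketch of the top-k softmax gating such routers typically use for one token (function name and k are illustrative):

```python
import math

def route(expert_logits, k=2):
    """Pick the top-k experts for one token and renormalize their gate weights."""
    topk = sorted(range(len(expert_logits)),
                  key=lambda i: expert_logits[i], reverse=True)[:k]
    exps = [math.exp(expert_logits[i]) for i in topk]
    z = sum(exps)
    return [(i, e / z) for i, e in zip(topk, exps)]
```

Only the selected experts run on that token, so compute per token stays roughly constant even as the total parameter count grows with the number of experts.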
Detecting unauthorized model outputs or data leaks.
Energy-based models: define an energy landscape over configurations rather than explicit normalized probabilities.
Boltzmann machine: a probabilistic energy-based neural network with hidden variables.
SLAM (Simultaneous Localization and Mapping): jointly estimating a robot's pose and a map of its environment.
Particle filter: a Monte Carlo method for sequential state estimation.
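A sketch of one predict-weight-resample cycle of a bootstrap particle filter in one dimension; the random-walk motion model, Gaussian observation model, and noise scales are illustrative assumptions:

```python
import math
import random

def particle_filter_step(particles, observation, rng, obs_std=1.0, motion_std=0.5):
    """One predict-weight-resample cycle of a bootstrap particle filter (1-D)."""
    # Predict: propagate each particle through a random-walk motion model
    moved = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the observation under each particle
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in moved]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample: draw a new particle set in proportion to the weights
    return rng.choices(moved, weights=weights, k=len(moved))
```

Repeating the cycle concentrates the particle cloud around states consistent with the observations, and the cloud's spread doubles as an uncertainty estimate.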
Plateaus: flat high-dimensional regions of the loss landscape that slow training.
Swarm intelligence: distributed simple agents whose local interactions produce emergent collective intelligence.
Real-time systems: systems that guarantee bounded response times.