Results for "controlled updates"
A/B test: a controlled experiment comparing variants by random assignment to estimate the causal effect of changes.
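Random assignment makes a simple difference in means an unbiased estimate of the causal effect. A minimal sketch, using made-up conversion rates (10% vs. 14%) purely for illustration:

```python
import random

random.seed(0)

def simulate_user(variant):
    # Hypothetical conversion rates: 10% for A, 14% for B (illustrative only).
    rate = 0.10 if variant == "A" else 0.14
    return 1 if random.random() < rate else 0

# Randomly assign each user to a variant, then record a binary outcome.
assignments = [random.choice("AB") for _ in range(100_000)]
outcomes = [simulate_user(v) for v in assignments]

def mean(xs):
    return sum(xs) / len(xs)

rate_a = mean([y for v, y in zip(assignments, outcomes) if v == "A"])
rate_b = mean([y for v, y in zip(assignments, outcomes) if v == "B"])
effect = rate_b - rate_a  # difference in means estimates the causal effect
```

Because assignment is random, the two groups differ only by chance, so `effect` converges to the true 4-point lift as the sample grows.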
Federated learning: training across many devices or silos without centralizing raw data; the server aggregates model updates, not data.
Plant: the physical system being controlled (in control theory).
Concept drift: the relationship between inputs and outputs changes over time, requiring monitoring and model updates.
In-context (few-shot) learning: achieving task performance by providing a small number of examples inside the prompt, without weight updates.
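The "examples inside the prompt" part is just text: labeled demonstrations followed by the new input. A minimal sketch of such a prompt (the task and labels are hypothetical):

```python
# A few-shot sentiment prompt: two labeled examples, then a new input.
# The model is expected to continue the pattern with "positive" or
# "negative" -- no weights are updated at any point.
prompt = (
    "Review: 'Great product!' -> positive\n"
    "Review: 'Broke after a day.' -> negative\n"
    "Review: 'Works exactly as described.' ->"
)
```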
Actor-critic: combines value estimation (the critic) with policy learning (the actor).
Trust region methods: restricting policy updates to a safe region around the current policy so no single step changes behavior too drastically.
On-policy learning: learning only from data generated by the current policy.
Causal inference: a framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.
Experiment tracking: logging hyperparameters, code versions, data snapshots, and results to reproduce and compare experiments.
Confidential computing: methods to protect the model and data during inference (e.g., trusted execution environments) from operators and attackers.
Propensity score: the probability of treatment assignment given covariates.
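Propensity scores let you reweight observational data so treated and untreated groups become comparable. A minimal inverse-propensity-weighting sketch, with made-up outcomes and propensities (in practice the propensities would themselves be estimated from covariates):

```python
# Each tuple is (outcome, treated, propensity e(x) = P(T=1 | x)).
# Numbers are illustrative only.
data = [
    (5.0, 1, 0.8),
    (4.0, 1, 0.5),
    (3.0, 0, 0.8),
    (2.0, 0, 0.5),
]

n = len(data)
# Horvitz-Thompson / IPW estimate of the average treatment effect:
# weight treated outcomes by 1/e(x) and untreated ones by 1/(1 - e(x)).
ate = sum(t * y / e - (1 - t) * y / (1 - e) for y, t, e in data) / n
```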
Interventional model: models the effects of interventions (do(X = x)) rather than mere observations.
Instrumental variable: a variable enabling causal inference despite confounding.
Canary (gradual) rollout: incrementally deploying new models to a growing fraction of traffic to reduce risk.
Output constraints: explicit constraints on model output (format, tone).
Model change management: governance process for proposing, reviewing, and auditing changes to deployed models.
Simulation environment: an artificial environment for training and testing agents.
Synthetic sensor data: artificial sensor data generated in simulation.
Clinical decision support: AI systems assisting clinicians with diagnosis or treatment decisions.
Sandboxing: isolating AI systems from external resources to limit the actions they can take.
Differential privacy: a formal privacy framework ensuring outputs do not reveal much about any single individual’s data contribution.
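One standard way to achieve differential privacy is the Laplace mechanism: add noise scaled to the query's sensitivity divided by the privacy budget ε. A minimal stdlib-only sketch (the count query and its numbers are hypothetical):

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    # Laplace mechanism: add Laplace(scale = sensitivity / epsilon) noise.
    # Noise is drawn via inverse-transform sampling from a uniform draw.
    scale = sensitivity / epsilon
    u = rng.random() - 0.5
    return true_value - scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

rng = random.Random(42)
# Hypothetical count query: adding/removing one person changes the count
# by at most 1, so sensitivity = 1.
true_count = 1234
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Smaller ε means stronger privacy but noisier answers; here the noise has scale 2, so the released count is typically within a few units of the truth.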
Online learning: learning where data arrives sequentially and the model updates continuously, often under changing distributions.
Parameters (weights): the learned numeric values of a model, adjusted during training to minimize a loss function.
Loss function: a function measuring prediction error (and sometimes calibration), guiding gradient-based optimization.
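A concrete example is mean squared error, a common loss for regression. A minimal sketch:

```python
def mse(y_true, y_pred):
    # Mean squared error: average of squared prediction errors.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Errors are -0.5, 0.0, 1.0 -> squared: 0.25, 0.0, 1.0 -> mean ~= 0.4167.
loss = mse([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```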
Gradient descent: an iterative method that updates parameters in the direction of the negative gradient to minimize the loss.
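The whole method is one repeated update, `w -= lr * grad(w)`. A minimal sketch on a one-dimensional quadratic whose minimum is at w = 3 (the function is chosen purely for illustration):

```python
def grad(w):
    # Gradient of f(w) = (w - 3)^2 is 2 * (w - 3).
    return 2 * (w - 3.0)

w = 0.0
lr = 0.1
for _ in range(100):
    w -= lr * grad(w)  # step in the direction of the negative gradient
# w converges toward the minimizer at 3.0
```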
Momentum: uses an exponential moving average of gradients to speed convergence and reduce oscillation.
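Classical momentum replaces the raw gradient with a running average ("velocity"). A minimal sketch on the same kind of toy quadratic (objective and constants are illustrative):

```python
def grad(w):
    # Gradient of f(w) = (w - 3)^2 is 2 * (w - 3).
    return 2 * (w - 3.0)

w, velocity = 0.0, 0.0
lr, beta = 0.1, 0.9  # beta controls how long past gradients persist
for _ in range(200):
    # Exponential moving average of past gradients (classical momentum).
    velocity = beta * velocity + (1 - beta) * grad(w)
    w -= lr * velocity
# w approaches the minimizer at 3.0, with the averaging damping oscillation
```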
Learning rate: controls the size of parameter updates; too high diverges, too low trains slowly or gets stuck.
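Both failure modes are easy to see on f(w) = w², where the update is w ← (1 − 2·lr)·w and the iteration is stable only when lr < 1. A minimal sketch with illustrative rates:

```python
def grad(w):
    # Gradient of f(w) = w^2 is 2w; the update contracts iff |1 - 2*lr| < 1.
    return 2 * w

def run(lr, steps=50, w0=1.0):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

w_good = run(lr=0.1)  # |1 - 0.2| = 0.8 < 1: shrinks toward the minimum at 0
w_bad = run(lr=1.1)   # |1 - 2.2| = 1.2 > 1: the iterate grows without bound
```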
Vanishing gradients: gradients shrink as they propagate back through layers, slowing learning in early layers; mitigated by ReLU, residual connections, and normalization.
Exploding gradients: gradients grow too large, causing divergence; mitigated by gradient clipping, normalization, and careful initialization.
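Of those mitigations, clipping is simple enough to sketch directly: rescale the gradient vector whenever its norm exceeds a threshold, so update sizes stay bounded. A minimal global-norm version:

```python
import math

def clip_by_norm(grads, max_norm):
    # Global-norm gradient clipping: if the gradient's L2 norm exceeds
    # max_norm, rescale the whole vector so its norm equals max_norm;
    # otherwise leave it unchanged. Direction is preserved either way.
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grads]
    return grads

clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)  # norm 5 -> rescaled to 1
```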