Results for "expected return"
Diffusion model trained to remove noise step by step.
Repeating temporal patterns.
System that independently pursues goals over time.
Interleaving reasoning and tool use.
Average value under a distribution.
Measure of spread around the mean.
Sampling from an easier distribution and reweighting the samples.
Restricting updates to safe regions.
Assigning a role or identity to the model.
Applying learned patterns incorrectly.
Loss of old knowledge when learning new tasks.
Quality degradation when a model is trained on its own outputs.
Predicted probabilities do not reflect the true likelihood of correctness.
European regulation classifying AI systems by risk.
RL without an explicit dynamics model.
RL using learned or known environment models.
Patient agreement to AI-assisted care.
Fast approximation of costly simulations.
Signals indicating dangerous behavior.
A conceptual framework describing expected error as the sum of systematic error (squared bias), sensitivity to the training data (variance), and irreducible noise.
Measures a model’s ability to fit random noise; used to bound generalization error.
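The entries above on average value under a distribution, spread around the mean, and sampling from an easier distribution with reweighting can be illustrated together. The following is a minimal sketch, not a reference implementation: the target distribution N(0, 1), the proposal N(0, 2^2), the test function x^2, and all function names are assumptions chosen for illustration.

```python
import math
import random


def mean(xs):
    """Sample average: an estimate of the expected value."""
    return sum(xs) / len(xs)


def variance(xs):
    """Population variance: average squared distance from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def importance_sampling_estimate(f, n, rng):
    """Estimate E_p[f(X)] for p = N(0, 1) using samples from q = N(0, 2^2).

    Each sample is reweighted by p(x) / q(x) so that the average is
    unbiased for the expectation under p, even though x is drawn from q.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 2.0)  # draw from the proposal q
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # weight p/q
        total += f(x) * w
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = importance_sampling_estimate(lambda x: x * x, 100_000, rng)
print(estimate)  # should be close to 1, the variance of N(0, 1)
```

In this toy setting the proposal is wider than the target, which keeps the weights bounded; a proposal narrower than the target could give a much noisier estimate.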