Results for "MDP"
Markov Decision Process
Intermediate
Formal framework for sequential decision-making under uncertainty, defined by states, actions, transition probabilities, and rewards.
RLHF
Intermediate
Reinforcement learning from human feedback: uses human preference data to train a reward model, then optimizes the policy against that model.
Action Space
Intermediate
Set of all actions available to the agent.
Shared Autonomy
Frontier
Control of a task shared between a human operator and an autonomous agent.
Active Experimentation
Advanced
AI-driven selection of which experiments to run next.