Results for "hidden objectives"
Ensuring learned behavior matches the intended objective.
Maintaining alignment under new conditions.
Model behaves well during training but not in deployment.
Governance of model changes.
Coordinating models, tools, and logic.
Limiting inference usage.
The physical system being controlled.
Agents optimize collective outcomes.
Ensuring AI allows shutdown.
Tendency to gain control or resources.
Intelligence and goals are independent.
Goals useful regardless of final objective.
Research on ensuring AI remains safe.
Designing AI to cooperate with humans and each other.
Risk threatening humanity’s survival.