Results for "aligned agents"
Mixed-motive interaction: a combination of cooperation and competition.
Multi-agent system: multiple agents interacting cooperatively or competitively.
Swarm intelligence: distributed agents producing emergent intelligence.
Coordination failure: agents fail to coordinate on an optimal outcome.
Mechanism design: designing systems where rational agents behave as desired.
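A classic mechanism-design example is the second-price (Vickrey) auction: the highest bidder wins but pays the second-highest bid, which makes truthful bidding a dominant strategy for rational agents. A minimal sketch (function and agent names are illustrative, not from the source):

```python
def vickrey_auction(bids):
    """bids: dict mapping agent name -> bid. Returns (winner, price paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    # Winner pays the second-highest bid, not their own bid.
    price = ranked[1][1] if len(ranked) > 1 else 0
    return winner, price

winner, price = vickrey_auction({"alice": 10, "bob": 7, "carol": 4})
# winner == "alice", price == 7
```

Because the price is set by others' bids, no agent can gain by misreporting its valuation; the mechanism makes honest behavior the rational choice.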
Model drift: a shift in model outputs over time.
AI alignment: ensuring AI systems pursue intended human goals.
Misalignment: the model optimizes objectives misaligned with human values.
Outer alignment: correctly specifying the goals given to a system.
Mesa-optimization: a learned subsystem that optimizes its own objective.
Deceptive alignment: the model behaves well during training but not deployment.
Corrigibility: the willingness of a system to accept correction or shutdown.
Model governance: oversight and control of model changes.
Warning signs: signals indicating dangerous behavior.
On-policy learning: learning only from data generated by the current policy.
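The on-policy constraint can be shown with a two-armed bandit sketch: value estimates are updated only from actions the current (epsilon-greedy) policy itself selects, never from a replay of other policies' data. The environment and constants here are illustrative:

```python
import random

random.seed(0)
values = {"a": 0.0, "b": 0.0}   # running value estimates per action
counts = {"a": 0, "b": 0}
REWARD = {"a": 1.0, "b": 0.0}   # hypothetical environment payoffs

def policy(eps=0.1):
    """Epsilon-greedy: mostly exploit the current estimates, sometimes explore."""
    if random.random() < eps:
        return random.choice(list(values))
    return max(values, key=values.get)

for _ in range(100):
    act = policy()              # data comes from the current policy...
    counts[act] += 1
    # ...and the update uses exactly that freshly collected experience.
    values[act] += (REWARD[act] - values[act]) / counts[act]
```

After training, the estimate for the rewarding arm dominates; the key property is that every update traces back to an action the live policy chose.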
Memory augmentation: extending agents with long-term memory stores.
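A minimal sketch of such a memory store, assuming a naive keyword-overlap retriever (a real system would typically use embeddings; all names here are hypothetical):

```python
memory_store = []

def remember(note):
    """Append an observation to the agent's long-term store."""
    memory_store.append(note)

def recall(query, k=2):
    """Return the k stored notes sharing the most words with the query."""
    words = set(query.lower().split())
    scored = sorted(memory_store,
                    key=lambda n: len(words & set(n.lower().split())),
                    reverse=True)
    return scored[:k]

remember("user prefers metric units")
remember("project deadline is Friday")
remember("user speaks French")

print(recall("which units does the user prefer?"))
```

The agent writes across sessions and reads back only what is relevant to the current query, rather than carrying its whole history in context.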
Autonomous agent: a system that independently pursues goals over time.
Task decomposition: breaking goals down into manageable sub-tasks.
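Decomposition can be sketched recursively: a goal is either primitive (directly executable) or expands into sub-tasks that are planned in turn. The decomposition table below is hypothetical:

```python
# Hypothetical decomposition table: goal -> ordered sub-tasks.
DECOMPOSITION = {
    "write report": ["gather data", "draft text", "review draft"],
    "gather data": ["query database"],
}

def plan(goal):
    """Flatten a goal into an ordered list of primitive actions."""
    subtasks = DECOMPOSITION.get(goal)
    if subtasks is None:        # primitive task: execute directly
        return [goal]
    steps = []
    for sub in subtasks:
        steps.extend(plan(sub))
    return steps

print(plan("write report"))
# ['query database', 'draft text', 'review draft']
```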
Reactive agent: a simple agent responding directly to inputs.
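A reactive agent is essentially a fixed mapping from percept to action, with no internal state or planning. A minimal sketch with a hypothetical rule table:

```python
# Percept -> action rules; no memory, no lookahead.
RULES = {"obstacle": "turn", "clear": "forward", "goal": "stop"}

def reactive_agent(percept):
    """Respond directly to the current input; fall back to a safe default."""
    return RULES.get(percept, "wait")

print(reactive_agent("obstacle"))  # turn
```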
Blackboard architecture: agents communicate via shared state.
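In this style of coordination, agents never call each other directly; each reads the shared state, contributes when its precondition is met, and writes back. A sketch with hypothetical agent roles:

```python
# Shared state ("blackboard"): the only communication channel.
blackboard = {"task": "translate 'bonjour'", "draft": None, "final": None}

def translator(bb):
    if bb["task"] and bb["draft"] is None:
        bb["draft"] = "hello"            # writes its contribution

def reviewer(bb):
    if bb["draft"] and bb["final"] is None:
        bb["final"] = bb["draft"].capitalize()

for agent in (translator, reviewer):     # each agent only touches shared state
    agent(blackboard)

print(blackboard["final"])  # Hello
```

Because coupling runs through the shared state rather than direct messages, agents can be added or removed without rewiring the others.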
Instrumental convergence: the tendency for agents to pursue instrumental subgoals such as resource acquisition regardless of their final goal.
Emergent competition: competition that arises without explicit design.
Algorithmic collusion: AI systems tacitly coordinating prices.
Information asymmetry: some agents know more than others.
Cooperative AI: designing AI to cooperate with humans and with other AI systems.
Simulation environment: an artificial environment for training and testing agents.
Norm emergence: the emergence of conventions among agents.
AI agent: a system that perceives state, selects actions, and pursues goals, often combining LLM reasoning with tools and memory.
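The perceive-select-act cycle in that definition can be sketched as a minimal loop; the policy stub below stands in for LLM reasoning, and `apply` stands in for tool use, so all components are hypothetical:

```python
def apply(env, action):
    """Stand-in for a tool call / environment step."""
    env["state"] = "goal" if action == "finish" else "working"
    return env["state"]

def policy(state, memory):
    """Stand-in for LLM reasoning over the current state and memory."""
    return "finish" if len(memory) >= 2 else "work"

def agent_loop(env, policy, max_steps=5):
    memory = []
    state = env["state"]
    for _ in range(max_steps):
        action = policy(state, memory)   # select an action
        state = apply(env, action)       # act on the environment
        memory.append((action, state))   # record the step in memory
        if state == "goal":              # goal reached: stop pursuing
            break
    return memory

trace = agent_loop({"state": "start"}, policy)
```

The loop terminates either when the goal state is reached or after a step budget, a common safeguard against unbounded autonomous runs.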
Observability: the ability to infer a system's internal state from its telemetry, crucial for monitoring AI services and agents.
Emergent coordination: coordination arising without explicit programming.