Results for "bounded behavior"
Groups adopting extreme positions.
Tendency of an agent to gain control or resources.
Agents copy others’ actions.
Ensuring learned behavior matches intended objective.
The text (and possibly other modalities) given to an LLM to condition its output behavior.
A high-priority instruction layer setting overarching behavior constraints for a chat model.
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Learning a state-to-action mapping directly from demonstrations.
Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.
Inferring a reward function from observed behavior.
Distributed agents producing emergent intelligence.
Collective behavior without central control.
Emergence of conventions among agents.
Signals indicating dangerous behavior.
Local surrogate explanation method approximating model behavior near a specific input.
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
Hidden behavior activated by specific triggers, causing targeted mispredictions or undesired outputs.
Mathematical framework for controlling dynamic systems.
Model behaves well during training but not in deployment.
The physical system being controlled.
Equations governing how system states change over time.
Modeling interactions with the environment.
Human-like understanding of physical behavior.
Mechanics of price formation.
Modeling chemical systems computationally.
System-level behavior arising from interactions.
Research aimed at ensuring AI systems remain safe.
Decisions dependent on others’ actions.
A mismatch between training and deployment data distributions that can degrade model performance.
The learned numeric values of a model adjusted during training to minimize a loss function.
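
As a rough illustration of the last entry (numeric values adjusted during training to minimize a loss function), the sketch below fits a single parameter by gradient descent on a mean squared error. The function name `fit_slope`, the toy data, and the learning rate are assumptions made for this example only and do not come from the glossary.

```python
def fit_slope(xs, ys, lr=0.01, steps=500):
    """Fit the single parameter w in y = w * x by gradient descent.

    Hypothetical helper for illustration: the loss is the mean squared
    error between w * x and y over the given points.
    """
    w = 0.0  # the model's only parameter, starting from an arbitrary value
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Move the parameter a small step against the gradient.
        w -= lr * grad
    return w


# Toy data generated from y = 2x, so the loss is minimized at w = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = fit_slope(xs, ys)
```

Real models hold many such values and usually rely on automatic differentiation rather than a hand-written gradient, but the update rule has the same shape.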