Results for "learned objectives"
Learned subsystem that optimizes its own objective.
Ensuring learned behavior matches intended objective.
Model exploits poorly specified objectives.
Configuration choices that govern training or architecture and are not (typically) learned directly.
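A minimal sketch of the distinction above (all names and values invented for illustration): hyperparameters are fixed configuration choices, while parameters such as weights are updated by training.

```python
# Hyperparameters: chosen before training, not updated by it.
hyperparameters = {
    "learning_rate": 1e-3,  # governs training dynamics
    "num_layers": 4,        # governs architecture
    "batch_size": 32,
}

# Learned parameters: updated during training, e.g. by gradient descent.
weights = [0.0] * 8
for step in range(100):
    gradient = [0.01] * 8  # placeholder gradient for the sketch
    weights = [w - hyperparameters["learning_rate"] * g
               for w, g in zip(weights, gradient)]
```

In practice the hyperparameters themselves can be tuned by an outer search loop, but they are still not fit by the gradient updates that fit the weights.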
Early architecture using learned gates for skip connections.
Maximizing the measured reward without fulfilling the real goal.
Tendency for agents to acquire resources regardless of their final goal.
Model optimizes objectives misaligned with human values.
Correctly specifying goals.
Willingness of system to accept correction or shutdown.
Agents have opposing objectives.
Model behaves well during training but not during deployment.
The internal space where learned representations live; operations here often correlate with semantics or generative factors.
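A toy sketch of how operations in a latent space can correlate with semantics (all vectors here are invented): the classic word-embedding example where king - man + woman lands near queen.

```python
import numpy as np

# Invented 3-d embeddings, constructed so the analogy works exactly.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.1, 0.0]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.8, 1.0]),
}

# Vector arithmetic in the latent space.
result = embeddings["king"] - embeddings["man"] + embeddings["woman"]

def nearest(vec, table):
    """Return the key whose embedding is closest in Euclidean distance."""
    return min(table, key=lambda k: np.linalg.norm(table[k] - vec))

print(nearest(result, embeddings))  # with these toy vectors: queen
```

Real learned latent spaces only approximate this behavior; the linearity of semantic directions is an empirical observation, not a guarantee.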
A parameterized mapping from inputs to outputs; includes architecture + learned parameters.
Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.
Applying learned patterns in contexts where they do not hold.
RL using learned or known environment models.
Learned model of environment dynamics.
Multiple agents interacting cooperatively or competitively.
Coordination arising without explicit programming.
Generator produces limited variety of outputs.
Decomposing goals into sub-tasks.
Maintaining alignment under new conditions.
Governance of model changes.
Coordinating models, tools, and logic.
Limiting inference usage.
The physical system being controlled.
Agents optimize collective outcomes.
Ensuring AI allows shutdown.
Tendency to gain control/resources.