Search: learned objectives

Hyperparameters Intermediate

Configuration choices that are set rather than learned directly (or are not typically learned) and that govern training or architecture.

Optimization

Specification Gaming Advanced

A failure mode in which a model exploits a poorly specified objective.

AI Safety & Alignment

Value Misalignment Advanced

A model optimizes objectives that are misaligned with human values.

AI Safety & Alignment

Competitive Game Advanced

A game in which agents have opposing objectives.

Agents & Autonomy

Latent Space Intermediate

The internal space where learned representations live; operations here often correlate with semantics or generative factors.

Foundations & Theory

Model Intermediate

A parameterized mapping from inputs to outputs; it comprises an architecture and its learned parameters.

Foundations & Theory

Parameters Intermediate

The learned numeric values of a model, adjusted during training to minimize a loss function.

Foundations & Theory
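
As an illustration of parameters being adjusted to minimize a loss, here is a minimal sketch, assuming a one-parameter linear model and plain gradient descent on mean squared error. The function name `train_slope`, the learning rate, and the step count are hypothetical choices for this example, not part of the glossary entry.

```python
def train_slope(xs, ys, lr=0.05, steps=200):
    """Fit y ~ w * x by gradient descent on mean squared error.

    `w` is the single learned parameter; `lr` and `steps` are
    hyperparameters chosen by hand for this illustration.
    """
    w = 0.0  # initial parameter value
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Move the parameter against the gradient to reduce the loss.
        w -= lr * grad
    return w


if __name__ == "__main__":
    # Data generated by y = 2x, so the learned slope should approach 2.
    print(train_slope([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Here `w` is what the entry calls a parameter, while `lr` and `steps` are set by hand rather than learned.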

Computational Learning Theory Intermediate

A theoretical framework analyzing what classes of functions can be learned, how efficiently, and with what guarantees.

AI Economics & Strategy

Highway Network Intermediate

An early architecture that uses learned gates to control skip connections.

AI Economics & Strategy
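
The gating idea can be sketched as follows, assuming the commonly cited highway formulation y = T(x) * H(x) + (1 - T(x)) * x, where T is a sigmoid "transform gate". The helper names (`affine`, `highway_layer`), the tanh nonlinearity, and the weight layout are assumptions made for this illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Compute W @ x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer: a learned gate mixes a transform with the input."""
    h = [math.tanh(z) for z in affine(w_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(z) for z in affine(w_t, b_t, x)]    # transform gate T(x) in (0, 1)
    # Gate near 1 passes the transform; gate near 0 carries the input through.
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # A strongly negative gate bias closes the gate, so the input is carried unchanged.
    print(highway_layer([0.5, -1.0], identity, [0.0, 0.0], zeros, [-50.0, -50.0]))
```

With the gate closed the layer behaves like an identity skip connection; with it open the layer behaves like an ordinary nonlinear layer. In a trained network the gate weights `w_t` and `b_t` are learned along with the rest of the parameters.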

Inner Alignment Advanced

Ensuring that a model's learned behavior matches the intended objective.

AI Safety & Alignment

Mesa-Optimizer Advanced

A learned subsystem that itself acts as an optimizer, pursuing its own objective.

AI Safety & Alignment

Overgeneralization Intermediate

Applying learned patterns to cases where they do not hold.

Model Failure Modes

Model-Based RL Advanced

Reinforcement learning that uses a learned or known model of the environment.

Reinforcement Learning

World Model Frontier

A learned model of environment dynamics.

World Models & Cognition
