Results for "learned objectives"


46 results

Mesa-Optimizer Advanced

Learned subsystem that optimizes its own objective.

AI Safety & Alignment
Inner Alignment Advanced

Ensuring learned behavior matches intended objective.

AI Safety & Alignment
Specification Gaming Advanced

Model exploits poorly specified objectives.

AI Safety & Alignment
Hyperparameters Intermediate

Configuration choices that govern training or architecture and are typically set beforehand rather than learned.

Optimization
Highway Network Intermediate

Early architecture using learned gates for skip connections.

AI Economics & Strategy
Reward Hacking Advanced

Maximizing reward without fulfilling real goal.

AI Safety & Alignment
Instrumental Convergence Advanced

Tendency for agents to pursue similar subgoals, such as acquiring resources, regardless of final goal.

AI Safety & Alignment
Value Misalignment Advanced

Model optimizes objectives misaligned with human values.

AI Safety & Alignment
Outer Alignment Advanced

Correctly specifying goals.

AI Safety & Alignment
Corrigibility Advanced

Willingness of system to accept correction or shutdown.

AI Safety & Alignment
Competitive Game Advanced

Agents have opposing objectives.

Agents & Autonomy
Deceptive Alignment Advanced

Model behaves well during training but not in deployment.

AI Safety & Alignment
Latent Space Intermediate

The internal space where learned representations live; operations here often correlate with semantics or generative factors.

Foundations & Theory
Model Intermediate

A parameterized mapping from inputs to outputs; includes architecture + learned parameters.

Foundations & Theory
Representation Learning Intermediate

Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.

Machine Learning
Overgeneralization Intermediate

Applying learned patterns incorrectly.

Model Failure Modes
Model-Based RL Advanced

RL using learned or known environment models.

Reinforcement Learning
World Model Frontier

Learned model of environment dynamics.

World Models & Cognition
Multi-Agent System Intermediate

Multiple agents interacting cooperatively or competitively.

AI Economics & Strategy
Emergent Coordination Intermediate

Coordination arising without explicit programming.

AI Economics & Strategy
Mode Collapse Advanced

Generator produces limited variety of outputs.

Diffusion & Generative Models
Hierarchical Planning Advanced

Decomposing goals into sub-tasks.

Agents & Autonomy
Robust Alignment Advanced

Maintaining alignment under new conditions.

AI Safety & Alignment
Change Management Intermediate

Governance of model changes.

Governance & Ethics
Model Orchestration Intermediate

Coordinating models, tools, and logic.

AI Economics & Strategy
Token Budgeting Intermediate

Limiting inference usage.

AI Economics & Strategy
Plant Intermediate

The physical system being controlled.

Foundations & Theory
Cooperative Game Advanced

Agents optimize collective outcomes.

Agents & Autonomy
Shutdown Problem Advanced

Ensuring AI allows shutdown.

AI Safety & Alignment
Power-Seeking Behavior Advanced

Tendency to gain control/resources.

AI Safety & Alignment
