Results for "sub-goals"


26 results

Instrumental Goals Advanced

Goals useful regardless of final objective.

AI Safety & Alignment
Hierarchical Planning Advanced

Decomposing goals into sub-tasks.

Agents & Autonomy
Decomposition Prompt Intro

Breaking tasks into sub-steps.

Prompting & Instructions
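As the Decomposition Prompt entry above describes, a task can be broken into sub-steps directly in the prompt. A minimal sketch of such a prompt template (the task string and step count are illustrative assumptions, not from any specific system):

```python
# Build a decomposition prompt that asks the model to split a task
# into independently completable sub-steps. The task and the 3-5
# range are hypothetical examples.
task = "Plan a product launch"
prompt = (
    f"Task: {task}\n"
    "Break this task into 3-5 numbered sub-steps, "
    "each small enough to complete independently.\n"
    "Steps:\n1."
)
print(prompt)
```

Ending the prompt with "1." nudges the model to continue the numbered list rather than restate the task.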
Planning Intermediate

Methods for breaking goals into steps; can be classical (A*, STRIPS) or LLM-driven with tool calls.

Foundations & Theory
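The Planning entry above mentions classical search such as A*. A minimal sketch of A* on a small grid, assuming unit move costs and a Manhattan-distance heuristic (the 5x5 grid and helper names are illustrative):

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Classical A* search: expand nodes in order of f(n) = g(n) + h(n)."""
    open_heap = [(h(start), 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None  # goal unreachable

# 4-connected 5x5 grid with unit step cost.
def grid_neighbors(p):
    x, y = p
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5:
            yield (nx, ny), 1

manhattan = lambda p: abs(p[0] - 4) + abs(p[1] - 4)
path = a_star((0, 0), (4, 4), grid_neighbors, manhattan)
# shortest path: 8 moves, so 9 nodes including start and goal
```

With an admissible heuristic like Manhattan distance on a unit-cost grid, A* returns an optimal path; LLM-driven planners replace the search with model-generated step sequences.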
Orthogonality Thesis Advanced

Intelligence and goals are independent.

AI Safety & Alignment
Outer Alignment Advanced

Correctly specifying goals.

AI Safety & Alignment
Dropout Intermediate

Randomly zeroing activations during training to reduce co-adaptation and overfitting.

Foundations & Theory
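The Dropout entry above can be made concrete with a sketch of "inverted" dropout, the common formulation in which surviving activations are rescaled at training time so no scaling is needed at inference:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each activation with probability p,
    and scale survivors by 1/(1-p) so the expected value is unchanged."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p  # True = keep this activation
    return x * mask / (1.0 - p)

x = np.ones(1000)
y = dropout(x, p=0.5)
# roughly half the entries are zeroed; the rest are scaled up to 2.0
```

Because each unit is randomly silenced, units cannot rely on specific co-activated partners, which is the co-adaptation the entry refers to.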
Slow Takeoff Advanced

Incremental capability growth.

AI Safety & Alignment
Autonomous Agent Advanced

System that independently pursues goals over time.

Agents & Autonomy
Alignment Problem Advanced

Ensuring AI systems pursue intended human goals.

AI Safety & Alignment
Instrumental Convergence Advanced

Tendency for agents with different final goals to converge on similar sub-goals, such as acquiring resources and preserving themselves.

AI Safety & Alignment
Intent Recognition Frontier

Inferring human goals from behavior.

World Models & Cognition
Agent Intermediate

A system that perceives state, selects actions, and pursues goals—often combining LLM reasoning with tools and memory.

Agents & Autonomy
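The Agent entry above describes a perceive-act loop in pursuit of a goal. A toy sketch of that loop, using a hypothetical thermostat agent rather than an LLM-based one:

```python
class ThermostatAgent:
    """Toy agent: perceives the current temperature, selects an
    action, and pursues a goal temperature over time."""
    def __init__(self, goal_temp):
        self.goal = goal_temp

    def act(self, temp):
        # Simple policy with a 1-degree dead band around the goal.
        if temp < self.goal - 1:
            return "heat"
        if temp > self.goal + 1:
            return "cool"
        return "idle"

def simulate(agent, temp, steps=20):
    """Run the perceive-act loop against a trivial environment model."""
    for _ in range(steps):
        action = agent.act(temp)
        temp += {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
    return temp

final = simulate(ThermostatAgent(21.0), temp=15.0)
```

An LLM-based agent replaces the hand-written policy in `act` with model reasoning and replaces the environment update with tool calls, but the loop structure is the same.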
Alignment Intermediate

Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.

Foundations & Theory
Benchmark Intermediate

A dataset + metric suite for comparing models; can be gamed or misaligned with real-world goals.

Evaluation & Benchmarking
Prompt Injection Intermediate

Attacks that manipulate model instructions (especially via retrieved content) to override system goals or exfiltrate data.

Foundations & Theory
Multi-Agent System Intermediate

Multiple agents interacting cooperatively or competitively.

AI Economics & Strategy
Deliberative Agent Advanced

Agent reasoning about future outcomes.

Agents & Autonomy
Reward Hacking Advanced

Maximizing the reward signal without fulfilling the intended goal.

AI Safety & Alignment
Value Misalignment Advanced

Model optimizes objectives misaligned with human values.

AI Safety & Alignment
Inner Alignment Advanced

Ensuring learned behavior matches intended objective.

AI Safety & Alignment
Mesa-Optimizer Advanced

Learned subsystem that optimizes its own objective.

AI Safety & Alignment
Deceptive Alignment Advanced

Model appears aligned during training while pursuing a different objective once deployed.

AI Safety & Alignment
Alignment Tax Advanced

Tradeoff between safety and performance.

AI Safety & Alignment
Self-Model Frontier

Internal representation of the agent itself.

AGI & General Intelligence
Alignment Research Intermediate

Research aimed at keeping AI systems safe and aligned with human intent.

Governance & Ethics

Welcome to AI Glossary

The free, self-building AI dictionary. Help us keep it free—click an ad once in a while!

Search

Type any question or keyword into the search bar at the top.

Browse

Tap a letter in the A–Z bar to browse terms alphabetically, or filter by domain, industry, or difficulty level.

3D WordGraph

Fly around the interactive 3D graph to explore how AI concepts connect. Click any word to read its full definition.