Results for "safety science"
Tradeoff between safety and performance.
Accelerating safety relative to capabilities.
The field of building systems that perform tasks associated with human intelligence—perception, reasoning, language, planning, and decision-making—via algorithms.
Automated detection/prevention of disallowed outputs (toxicity, self-harm, illegal instructions, etc.).
Systems where failure causes physical harm.
Restricting distribution of powerful models.
Research ensuring AI remains safe.
Mechanism to disable an AI system.
Hard constraints preventing unsafe actions.
Central system to store model versions, metadata, approvals, and deployment state.
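A registry like this can be sketched as a small in-memory store; a minimal sketch, assuming illustrative field names (version string, metadata dict, approval flag, deployment pointer), not a real registry API:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelVersion:
    """One registered model version with its metadata and approval state."""
    version: str
    metadata: dict
    approved: bool = False


@dataclass
class Registry:
    """In-memory registry: versions, approvals, and the deployed pointer."""
    versions: dict = field(default_factory=dict)
    deployed: Optional[str] = None

    def register(self, version: str, metadata: dict) -> None:
        self.versions[version] = ModelVersion(version, metadata)

    def approve(self, version: str) -> None:
        self.versions[version].approved = True

    def deploy(self, version: str) -> None:
        # Gate deployment on approval state, mirroring the approvals
        # tracked in the definition above.
        if not self.versions[version].approved:
            raise ValueError("cannot deploy an unapproved version")
        self.deployed = version


reg = Registry()
reg.register("v1", {"dataset": "demo"})
reg.approve("v1")
reg.deploy("v1")
print(reg.deployed)  # v1
```

The key design point is that deployment state is a pointer into the version store, so an unapproved version can never become the deployed one.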
A theoretical framework analyzing what classes of functions can be learned, how efficiently, and with what guarantees.
Sequential data indexed by time.
Field combining mechanics, control, perception, and AI to build autonomous machines.
Learning by minimizing prediction error.
Intelligence emerges from interaction with the physical world.
Closed loop linking sensing and acting.
Robots learning via exploration and growth.
AI applied to scientific problems.
AI discovering new compounds/materials.
Agents optimize collective outcomes.
No agent benefits from unilateral deviation.
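The "no unilateral deviation" condition can be checked directly over a small payoff table; a minimal sketch using a hypothetical two-player Prisoner's Dilemma (actions 0 = cooperate, 1 = defect):

```python
# payoffs[player][(a0, a1)] -> utility for `player` under that action profile.
payoffs = [
    {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1},  # player 0
    {(0, 0): 3, (0, 1): 5, (1, 0): 0, (1, 1): 1},  # player 1
]


def is_nash(profile):
    """True iff no player gains by unilaterally switching their own action."""
    for player in (0, 1):
        current = payoffs[player][profile]
        for alt in (0, 1):
            deviated = list(profile)
            deviated[player] = alt
            if payoffs[player][tuple(deviated)] > current:
                return False
    return True


print(is_nash((1, 1)))  # mutual defection: True (the equilibrium)
print(is_nash((0, 0)))  # mutual cooperation: False (defecting pays 5 > 3)
```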
Early signals disproportionately influence outcomes.
Groups adopting extreme positions.
Mathematical guarantees of system behavior.
Sudden jump to superintelligence.
Risk threatening humanity’s survival.
Stepwise reasoning patterns that can improve multi-step tasks; often handled implicitly or summarized for safety/privacy.
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
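The reward-model step of RLHF is commonly trained with a Bradley-Terry pairwise loss on preference data; a minimal sketch, where the two scalar rewards are hypothetical reward-model scores for the chosen and rejected responses:

```python
import math


def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).

    The loss is small when the reward model ranks the human-preferred
    (chosen) response above the rejected one, and grows as the ranking
    inverts.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))


print(preference_loss(2.0, 0.0))  # small: correct ranking with a wide margin
print(preference_loss(0.0, 2.0))  # large: the rejected response scored higher
```

Minimizing this loss over many preference pairs fits the reward model that the policy is then optimized against.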
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.
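A guardrail of this kind can be sketched as a validator chain run over candidate outputs; a minimal sketch, where the two rules (a blocklist regex and a JSON-shape check) are hypothetical examples, not a real policy:

```python
import json
import re


def blocklist_rule(text: str) -> bool:
    """Reject outputs matching a (hypothetical) disallowed-content pattern."""
    return not re.search(r"how to make a weapon", text, re.IGNORECASE)


def json_rule(text: str) -> bool:
    """Require the output to parse as JSON (a structured-output check)."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def validate(text, rules):
    """Return (ok, failed_rule_names) for a candidate model output."""
    failed = [rule.__name__ for rule in rules if not rule(text)]
    return (not failed, failed)


print(validate('{"answer": 42}', [blocklist_rule, json_rule]))  # (True, [])
print(validate("not json", [blocklist_rule, json_rule]))  # (False, ['json_rule'])
```

Failed rule names can drive the downstream action: block the output, repair it, or re-prompt the model.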