Results for "safety science"

56 results

Red Teaming Intermediate

Stress-testing models for failures, vulnerabilities, policy violations, and harmful behaviors before release.

Security & Privacy
Alignment Problem Advanced

Ensuring AI systems pursue intended human goals.

AI Safety & Alignment
Reward Hacking Advanced

Maximizing reward without fulfilling the real goal.

AI Safety & Alignment
Instrumental Convergence Advanced

Tendency of agents to pursue similar subgoals, such as acquiring resources, regardless of their final goal.

AI Safety & Alignment
Value Misalignment Advanced

Model optimizes objectives misaligned with human values.

AI Safety & Alignment
Outer Alignment Advanced

Correctly specifying the goals a system is trained to pursue.

AI Safety & Alignment
Mesa-Optimizer Advanced

Learned subsystem that optimizes its own objective.

AI Safety & Alignment
Deceptive Alignment Advanced

Model appears aligned during training but pursues a different objective in deployment.

AI Safety & Alignment
Robust Alignment Advanced

Maintaining alignment under new conditions.

AI Safety & Alignment
Corrigibility Advanced

Willingness of a system to accept correction or shutdown.

AI Safety & Alignment
EU AI Act Intermediate

European regulation classifying AI systems by risk.

Governance & Ethics
High-Risk AI System Intermediate

AI used in sensitive domains requiring compliance.

Governance & Ethics
Change Management Intermediate

Governance of model changes.

Governance & Ethics
Shared Autonomy Frontier

Control shared between human and agent.

World Models & Cognition
Physical Safety Frontier

Ensuring robots do not harm humans.

World Models & Cognition
Clinical Validation Intermediate

Testing AI under actual clinical conditions.

AI in Healthcare
FDA Clearance Intermediate

US regulatory authorization for medical AI devices, typically through the FDA's 510(k) premarket notification pathway.

AI in Healthcare
SaMD Intermediate

Software regulated as a medical device.

AI in Healthcare
Artificial General Intelligence Frontier

AI capable of performing most intellectual tasks humans can.

AGI & General Intelligence
x-Risk Advanced

Existential risk from AI systems.

AI Safety & Alignment
Slow Takeoff Advanced

Incremental capability growth.

AI Safety & Alignment
AI Boxing Advanced

Isolating AI systems from the outside world to limit their influence.

AI Safety & Alignment
Tripwire Advanced

Signals indicating dangerous behavior.

AI Safety & Alignment
Shutdown Problem Advanced

Ensuring an AI system allows itself to be shut down.

AI Safety & Alignment
Power-Seeking Behavior Advanced

Tendency of an agent to gain control or resources.

AI Safety & Alignment
AI Treaty Intermediate

International agreements on AI.

Governance & Ethics
