Results for "safety science"
Chain-of-Thought
Intermediate
Stepwise reasoning patterns that can improve performance on multi-step tasks; the reasoning is often kept implicit or summarized for safety or privacy reasons.
Alignment Tax
Advanced
The extra cost, such as reduced performance, incurred by making a system safer; a tradeoff between safety and performance.
Differential Progress
Intermediate
Accelerating progress on safety relative to progress on capabilities.
Safety Filter
Intermediate
Automated detection and prevention of disallowed outputs (toxicity, self-harm, illegal instructions, etc.).
Safety Envelope
Frontier
Hard constraints preventing unsafe actions.
Physical Safety
Frontier
Ensuring robots do not harm humans.
Safety-Critical System
Advanced
A system whose failure can cause physical harm.
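
As a rough illustration of the Safety Envelope entry above, the sketch below clamps a requested command to fixed limits before it is executed, so that no request can leave the allowed range. The names `Envelope`, `clamp`, and `enforce`, and the speed and force limits, are hypothetical and chosen only for this example; a real system would need far more than a clamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical hard limits; the fields and units are assumptions."""
    max_speed: float  # metres per second
    max_force: float  # newtons


def clamp(value: float, limit: float) -> float:
    """Restrict value to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, value))


def enforce(envelope: Envelope, speed: float, force: float) -> tuple[float, float]:
    """Return the command actually allowed, whatever was requested."""
    return clamp(speed, envelope.max_speed), clamp(force, envelope.max_force)


if __name__ == "__main__":
    limits = Envelope(max_speed=1.0, max_force=50.0)
    # A request outside the envelope is cut back to the limits.
    print(enforce(limits, speed=2.5, force=-80.0))
    # A request inside the envelope passes through unchanged.
    print(enforce(limits, speed=0.3, force=10.0))
```

The limits are applied after the controller has chosen an action, which is what makes them hard constraints in the sense of the entry: they hold regardless of what is requested upstream.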