Results for "safety"
Safety Filter
Intermediate
Automated detection and prevention of disallowed outputs (toxicity, self-harm, illegal instructions, etc.).
A safety filter acts like a bouncer at a club, checking whether people can enter based on certain rules. In AI, it checks the outputs generated by a model to make sure they are safe and appropriate. For example, if an AI is asked to write a story, the safety filter will ensure the story does not contain harmful or inappropriate content before it is shown to the user.
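The idea above can be sketched in a few lines. This is a minimal, hypothetical keyword-based filter for illustration only; the blocklist terms and function name are assumptions, and real safety filters use trained classifiers rather than keyword matching.

```python
import re

# Hypothetical blocklist for illustration; real systems use ML classifiers.
BLOCKLIST = {"toxic-phrase", "disallowed-instruction"}

def safety_filter(output: str) -> bool:
    """Return True if the model output passes the filter (is allowed)."""
    tokens = set(re.findall(r"[\w-]+", output.lower()))
    # Block the output if any token appears on the blocklist.
    return not (tokens & BLOCKLIST)

print(safety_filter("A friendly bedtime story."))        # allowed
print(safety_filter("This contains a toxic-phrase."))    # blocked
```

The filter sits between the model and the user: outputs that fail the check are withheld or replaced, much as the bouncer analogy suggests.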
- Software regulated as a medical device.
- AI capable of performing most intellectual tasks humans can.
- Existential risk from AI systems.
- Incremental capability growth.
- Isolating AI systems.
- Signals indicating dangerous behavior.
- Ensuring AI allows shutdown.
- Tendency to gain control/resources.
- International agreements on AI.
- Restricting distribution of powerful models.
- Research ensuring AI remains safe.