Alignment
IntermediateEnsuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Full Definition
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Keywords
Domains
Related Terms
Concept Map
See how Alignment connects to other concepts.
Open Knowledge Graph