Minimizing average loss on training data; can overfit when data is limited or biased.
Automated detection/prevention of disallowed outputs (toxicity, self-harm, illegal instructions, etc.).
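
As a rough illustration of the first entry above (minimizing average loss on training data, commonly called empirical risk minimization), here is a minimal sketch in Python. The one-parameter model `y = w * x`, the squared-error loss, the function names, and the data points are all assumptions chosen for illustration; none of them come from the entries themselves.

```python
def average_loss(w, data):
    """Mean squared error of the hypothetical model y = w * x on `data`."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def fit(data):
    """Closed-form minimizer of average_loss for the model y = w * x."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


# Made-up training and held-out points, for illustration only.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
held_out = [(4.0, 7.5), (5.0, 10.8)]

w_hat = fit(train)
train_loss = average_loss(w_hat, train)
held_out_loss = average_loss(w_hat, held_out)
```

Comparing `train_loss` with `held_out_loss` is one simple way to look for the overfitting the entry mentions: the fit is chosen to make the training loss small, which says nothing by itself about the loss on data the model has not seen.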