Results for "ground truth"
SFT
Intermediate
Fine-tuning on (prompt, response) pairs to align a model with instruction-following behaviors.
Data Labeling
Intermediate
Human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.
Exposure Bias
Intermediate
Differences between training and inference conditions.