Reward Model
Intermediate
A model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.
Full Definition
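A reward model is a model trained to predict human preferences (or a utility score) for candidate outputs. In RLHF-style pipelines, it scores the outputs of the model being tuned, and that score serves as the reward signal for optimization.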
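As an illustration of what "trained to predict human preferences" can mean in practice, the sketch below assumes a pairwise setup: annotators pick the better of two candidate outputs, and the reward model is penalized when it scores the rejected output above the chosen one. The Bradley-Terry-style loss and the function name `pairwise_preference_loss` are assumptions made for this example; the definition above does not prescribe a particular training objective.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the chosen output is preferred.

    Assumes a Bradley-Terry-style model (an illustrative choice):
        P(chosen preferred) = sigmoid(reward_chosen - reward_rejected)
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)).
    # Branch on the sign of the margin so exp() never receives a large
    # positive argument and overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scalar scores a reward model might assign to two candidates.
agrees_with_human = pairwise_preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_human = pairwise_preference_loss(reward_chosen=0.0, reward_rejected=2.0)

# The loss is smaller when the model ranks the human-preferred output higher.
print(agrees_with_human < disagrees_with_human)
```

Minimizing this loss over many labeled comparisons pushes the model to assign higher scores to outputs that people prefer, which is what makes the score usable as a reward signal.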