Results for "decoding"
Beam Search
Intermediate
Search algorithm for generation that keeps the k highest-scoring partial sequences (the beam) at each step; can find higher-likelihood outputs than greedy decoding but tends to reduce diversity.
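A minimal sketch of the idea, assuming a hypothetical `next_log_probs(tokens)` callback that returns a token-to-log-probability mapping. The toy table, the `<s>` and `</s>` markers, and all function names are illustrative assumptions, not part of any real library.

```python
import math

def beam_search(next_log_probs, start, beam_width, max_len, eos="</s>"):
    """Keep the beam_width highest-scoring partial sequences at each step."""
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == eos:
                candidates.append((tokens, score))  # finished; carry over unchanged
                continue
            for tok, lp in next_log_probs(tokens).items():
                candidates.append((tokens + [tok], score + lp))
        # Prune to the best beam_width candidates by cumulative log-probability.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(tokens[-1] == eos for tokens, _ in beams):
            break
    return beams

# Hypothetical toy model: a fixed table of next-token probabilities.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}

def toy_next_log_probs(tokens):
    return {t: math.log(p) for t, p in TABLE[tokens[-1]].items()}

greedy = beam_search(toy_next_log_probs, "<s>", beam_width=1, max_len=3)
wide = beam_search(toy_next_log_probs, "<s>", beam_width=2, max_len=3)
```

In this toy table a beam of width 1 (greedy decoding) commits to "a" first and ends on a lower-scoring sequence, while a beam of width 2 keeps "b" alive long enough to find the higher-likelihood continuation.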
Logits
Intermediate
Raw, unnormalized model scores before softmax converts them to probabilities; adjusted during decoding (for example, by temperature scaling) and calibration.
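As a rough illustration, logits can be turned into probabilities with a softmax, and dividing them by a temperature first is one common way they are adjusted at decoding time. The example values and the function name below are assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, optionally scaled by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                   # illustrative raw scores for three tokens
probs = softmax(logits)                    # temperature 1.0: plain softmax
sharp = softmax(logits, temperature=0.5)   # lower temperature sharpens the distribution
```

A temperature below 1 concentrates probability on the highest-scoring token; a temperature above 1 flattens the distribution.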
Key-Value Cache
Intermediate
Stores the attention keys and values computed for earlier tokens so they are not recomputed at every step of autoregressive decoding.
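A toy sketch of the caching idea for a single attention head, written in plain Python. The `project_key` and `project_value` functions are hypothetical stand-ins for learned projections, and the embeddings are made-up values; the point is only that each token is projected once when a cache is used.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention for one query over all given positions."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]

calls = {"cached": 0, "uncached": 0}  # counts key/value projections per strategy

def project_key(x):    # hypothetical stand-in for a learned key projection
    return [2.0 * xi for xi in x]

def project_value(x):  # hypothetical stand-in for a learned value projection
    return [xi + 1.0 for xi in x]

embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# With a cache: project only the newest token, reuse everything stored so far.
cache_keys, cache_values, cached_out = [], [], []
for x in embeddings:
    cache_keys.append(project_key(x))
    cache_values.append(project_value(x))
    calls["cached"] += 1
    cached_out.append(attend(x, cache_keys, cache_values))

# Without a cache: re-project the whole prefix at every step.
uncached_out = []
for t in range(len(embeddings)):
    prefix = embeddings[: t + 1]
    keys = [project_key(p) for p in prefix]
    values = [project_value(p) for p in prefix]
    calls["uncached"] += len(prefix)
    uncached_out.append(attend(embeddings[t], keys, values))
```

Both loops produce the same attention outputs, but the cached version performs one projection per token instead of one per token per step.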
Top-k
Intermediate
Sampling method that restricts each step to the k highest-probability tokens, renormalizes over them, and samples from that set, which excludes unlikely tokens.
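One way this might look in code, using only the standard library. The logits and the `top_k_sample` name are illustrative assumptions; real decoders typically apply this to a full vocabulary-sized tensor.

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from the k highest-scoring entries only."""
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the kept logits; choices() normalizes them.
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

rng = random.Random(0)                 # seeded for repeatability
logits = [3.0, 2.5, 0.1, -1.0, -4.0]   # illustrative scores for five tokens
samples = [top_k_sample(logits, k=2, rng=rng) for _ in range(200)]
only_best = [top_k_sample(logits, k=1, rng=rng) for _ in range(20)]
```

With k=2, only indices 0 and 1 can ever be drawn; with k=1 the procedure reduces to greedy selection of the single best token.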