Raw model outputs before they are converted to probabilities (typically by a softmax); manipulated during decoding and calibration.
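These raw outputs are commonly called logits. As a minimal sketch of how they become probabilities, and of one common way they are adjusted during decoding and calibration, the example below applies a softmax with a temperature parameter. The function name `softmax`, the `temperature` argument, and the sample values are illustrative assumptions, not taken from any particular library.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; temperature rescales the logits first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # illustrative values, not real model output

probs = softmax(logits)                     # plain softmax
sharper = softmax(logits, temperature=0.5)  # T < 1 concentrates mass on the top score
flatter = softmax(logits, temperature=2.0)  # T > 1 spreads mass more evenly
```

Dividing by a temperature leaves the ranking of the scores unchanged and only changes how peaked the resulting distribution is, which is why the same operation can serve both sampling and calibration.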
Stores the attention keys and values of past tokens so they are not recomputed at every step of autoregressive decoding.
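This structure is commonly called a KV (key-value) cache. The toy sketch below, for a single attention head, shows the idea: each decoding step appends only the new token's key and value, and the new query attends over everything stored so far. The `KVCache` class, the `attend` helper, and the hand-written vectors are hypothetical and for illustration only; a real model would produce queries, keys, and values with learned projections.

```python
import math

class KVCache:
    """Toy single-head cache: one key and one value vector per past token."""
    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)

def attend(query, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

# Illustrative per-token (query, key, value) vectors.
tokens = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0], [3.0, 4.0]),
    ([1.0, 1.0], [1.0, 1.0], [5.0, 6.0]),
]

cache = KVCache()
cached_outputs = []
for query, key, value in tokens:
    cache.append(key, value)  # only the new token's key/value is added
    cached_outputs.append(attend(query, cache.keys, cache.values))

# Reference: rebuild the key/value lists from scratch at every step.
recomputed_outputs = [
    attend(tokens[t][0],
           [k for _, k, _ in tokens[: t + 1]],
           [v for _, _, v in tokens[: t + 1]])
    for t in range(len(tokens))
]
```

Both paths give the same outputs. In this toy the saving is invisible because the keys and values are given directly; in a real model the cache avoids rerunning the key and value projections for every earlier token at each step, at the cost of memory that grows with sequence length.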