Results for "long context"
Mechanism that computes context-aware mixtures of representations; parallelizes well across sequence positions and captures long-range dependencies.
Maximum number of tokens the model can attend to in one forward pass; constrains long-document reasoning.
Extending agents with long-term memory stores.
An RNN variant using gates to mitigate vanishing gradients and capture longer context.
Predicts masked tokens in a sequence, enabling bidirectional context; often used for embeddings rather than generation.
Systematic differences in model outcomes across groups; arise from data, labels, and deployment context.
Samples from the smallest set of tokens whose probabilities sum to p, adapting set size by context.
Mechanisms for retaining context across turns/sessions: scratchpads, vector memories, structured stores.
Using markers to isolate context segments.
Techniques to handle longer documents without the quadratic cost of full self-attention.
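
The sampling entry above (the smallest set of tokens whose probabilities sum to p) can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names `nucleus` and `sample_top_p` are hypothetical, the input is assumed to be an already-normalized list of probabilities, and the cutoff is taken as "cumulative probability reaches at least p".

```python
import random


def nucleus(probs, p):
    """Return indices of the smallest set of tokens whose probabilities sum to at least p.

    Assumes `probs` is a normalized probability list (an assumption of this sketch).
    """
    # Rank token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= p:
            break  # stop as soon as the kept set covers probability mass p
    return kept


def sample_top_p(probs, p, rng=random):
    """Sample one token index from the nucleus, weighted by the original probabilities."""
    kept = nucleus(probs, p)
    weights = [probs[i] for i in kept]
    # random.choices renormalizes the weights internally.
    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    peaked = [0.9, 0.05, 0.05]       # confident distribution: small nucleus
    flat = [0.25, 0.25, 0.25, 0.25]  # uncertain distribution: larger nucleus
    print(len(nucleus(peaked, 0.7)), len(nucleus(flat, 0.7)))
    print(sample_top_p(flat, 0.7))
```

The two example distributions show the "adapting set size by context" part of the definition: with the same p, a peaked distribution keeps fewer candidates than a flat one.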