Domain: Transformers & LLMs
Attention
Intermediate
Mechanism that computes context-aware weighted mixtures of token representations; parallelizes well across a sequence and captures long-range dependencies.
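As an illustration, here is a minimal NumPy sketch of scaled dot-product attention, one common form of this mechanism; the shapes, seed, and function name are assumptions chosen for the example, not a reference implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Mix value vectors V using the similarity of queries Q to keys K."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # pairwise query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # context-aware mixture of values

# Toy usage: 3 queries attending over 5 key/value pairs of width 4.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4)), rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 4)
```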
Context Window
Intermediate
Maximum number of tokens the model can attend to in one forward pass; constrains long-document reasoning.
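A minimal sketch of how a serving layer might truncate a long prompt so that prompt plus generation fits the window; the 4,096-token limit and the `fit_to_window` helper are hypothetical.

```python
# Hypothetical numbers: a model with a 4,096-token context window.
CONTEXT_WINDOW = 4096

def fit_to_window(token_ids: list[int], max_new_tokens: int) -> list[int]:
    """Keep only the most recent tokens so prompt + generation fits the window."""
    budget = CONTEXT_WINDOW - max_new_tokens
    return token_ids[-budget:] if len(token_ids) > budget else token_ids

prompt = list(range(10_000))            # a long document, already tokenized
print(len(fit_to_window(prompt, 512)))  # 3584 tokens survive truncation
```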
Self-Attention
Intermediate
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
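A small NumPy sketch, assuming random projection weights and toy dimensions, showing that queries, keys, and values are all projections of the same sequence X, so every token can attend to every other token.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 8                      # sequence length and model width (illustrative)
X = rng.normal(size=(n, d))      # one sequence of token representations

# Learned projections (random here): all three views come from the SAME X.
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d)                             # (n, n) token-to-token scores
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)                 # softmax over the sequence
out = weights @ V                                         # each token mixes all the others
print(out.shape)  # (6, 8)
```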
Throughput
Intermediate
Number of requests or tokens a system can process per unit time; determines serving scalability and cost.
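A back-of-the-envelope calculation with made-up numbers, showing how request and token throughput translate into a rough serving cost; the prices and counts are purely illustrative.

```python
# Hypothetical measurements over a 60-second window.
requests = 120                 # requests completed in the window
tokens_generated = 48_000      # output tokens across those requests
window_seconds = 60.0

req_per_s = requests / window_seconds           # 2.0 requests/s
tok_per_s = tokens_generated / window_seconds   # 800.0 tokens/s
cost_per_1k_tokens = 0.002                      # hypothetical price
hourly_cost = tok_per_s * 3600 / 1000 * cost_per_1k_tokens
print(req_per_s, tok_per_s, hourly_cost)        # 2.0 800.0 5.76
```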
Transformer
Intermediate
Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.
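A minimal single-head, pre-norm transformer block in NumPy, assuming toy dimensions and omitting masking, biases, and the multi-head split; it only shows how a self-attention sub-layer and a feedforward sub-layer are wired with residual connections so blocks can stack.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu, var = x.mean(-1, keepdims=True), x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def self_attention(x, W_q, W_k, W_v):
    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ V

def transformer_block(x, params):
    # Pre-norm residual wiring: attention sub-layer, then feedforward sub-layer.
    x = x + self_attention(layer_norm(x), params["W_q"], params["W_k"], params["W_v"])
    h = np.maximum(layer_norm(x) @ params["W_1"], 0.0)   # feedforward with ReLU
    return x + h @ params["W_2"]

rng = np.random.default_rng(0)
n, d, d_ff = 5, 16, 64
params = {k: rng.normal(size=s) * 0.1 for k, s in
          {"W_q": (d, d), "W_k": (d, d), "W_v": (d, d),
           "W_1": (d, d_ff), "W_2": (d_ff, d)}.items()}
x = rng.normal(size=(n, d))
print(transformer_block(x, params).shape)  # (5, 16) -- shape preserved, so blocks stack
```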
Vocabulary
Intermediate
The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.
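A toy sketch of why vocabulary choice affects efficiency and rare-string handling; the vocabulary entries and the byte-fallback scheme here are illustrative assumptions, not any particular tokenizer.

```python
# Toy vocabulary: rare strings fall back to byte-level tokens so nothing is unrepresentable.
vocab = {"hello": 0, "world": 1, "<byte>": 2}   # made-up entries

def encode(word: str) -> list[int]:
    if word in vocab:
        return [vocab[word]]
    # Fallback: one byte-level token per UTF-8 byte of the rare string.
    return [vocab["<byte>"]] * len(word.encode("utf-8"))

print(encode("hello"))   # [0]                  -- one token, cheap
print(encode("héllo"))   # [2, 2, 2, 2, 2, 2]   -- six byte tokens, expensive
```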