Results for "parallel attention"
Attention head: A single attention mechanism within multi-head attention.
Self-attention: Attention where the queries, keys, and values all come from the same sequence, enabling token-to-token interactions.
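To make the "same sequence" point concrete, here is a minimal NumPy sketch of single-head self-attention; the projection matrices Wq/Wk/Wv and the toy shapes are assumptions for the example, not any particular library's API.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention: queries, keys, and values are all
    projections of the same input sequence x (shape: seq_len x d_model)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv           # three views of the same tokens
    scores = q @ k.T / np.sqrt(k.shape[-1])    # pairwise token-to-token affinities
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)  # softmax over the key axis
    return weights @ v                         # each output mixes all tokens' values

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                   # 5 tokens, d_model = 16
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)            # shape (5, 16)
```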
Efficient attention: Attention variants that reduce the quadratic cost (in sequence length) of full attention, e.g. via sparsity, local windows, or low-rank approximations.
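One representative way to cut the quadratic cost is to restrict each query to a local window of keys; the sketch below assumes that sliding-window variant (other families use learned sparsity patterns or low-rank/kernel approximations).

```python
import numpy as np

def sliding_window_attention(q, k, v, window=2):
    """Local attention: each position attends only to keys within `window`
    positions of itself, so cost grows as seq_len * window, not seq_len ** 2."""
    seq_len, d = q.shape
    out = np.zeros_like(v)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)   # only a small slice of keys
        w = np.exp(scores - scores.max())
        out[i] = (w / w.sum()) @ v[lo:hi]
    return out
```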
Graph attention network (GAT): A GNN that uses attention to weight neighbor contributions dynamically.
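A simplified single-head sketch of the idea, roughly following the GAT-style edge scoring; the shapes and the score vector `a` are assumptions for the example.

```python
import numpy as np

def graph_attention_layer(h, adj, W, a, slope=0.2):
    """Graph attention: score each edge, softmax over a node's neighbors,
    then mix neighbor features with those learned weights.
    h: (n, f_in) node features, adj: (n, n) adjacency with 1 where an edge
    exists (self-loops included), W: (f_in, f_out), a: (2 * f_out,)."""
    z = h @ W                                                   # project node features
    n = z.shape[0]
    e = np.array([[a @ np.concatenate([z[i], z[j]])             # pairwise edge scores
                   for j in range(n)] for i in range(n)])
    e = np.where(e > 0, e, slope * e)                           # LeakyReLU
    e = np.where(adj > 0, e, -1e9)                              # ignore non-neighbors
    w = np.exp(e - e.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                           # softmax per node
    return w @ z                                                # attention-weighted neighborhood mix
```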
Cross-modal attention: Attention between representations of different modalities (e.g. text tokens attending to image patches).
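A minimal sketch of the pattern, assuming text tokens querying image patch features; the names and shapes are illustrative only.

```python
import numpy as np

def cross_attention(text_tokens, image_patches, Wq, Wk, Wv):
    """Cross-attention across modalities: queries come from the text sequence,
    keys and values come from the image patches."""
    q = text_tokens @ Wq
    k = image_patches @ Wk
    v = image_patches @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (n_text, n_patches) affinities
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ v                                    # one image-derived context vector per text token
```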
Transformer: Architecture based on self-attention and feedforward layers; the foundation of modern LLMs and many multimodal models.
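A minimal sketch of one pre-norm Transformer block (single-head attention, no dropout); the parameter names and shapes are assumptions for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def transformer_block(x, Wq, Wk, Wv, Wo, W1, W2):
    """One pre-norm Transformer block: self-attention then a position-wise
    feedforward network, each wrapped in a residual connection."""
    h = layer_norm(x)
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v
    x = x + attn @ Wo                        # residual around attention
    h = layer_norm(x)
    return x + np.maximum(h @ W1, 0) @ W2    # residual around ReLU feedforward
```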
Positional encoding: Injects sequence order into Transformers, since attention alone is permutation-invariant.
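For concreteness, here is the sinusoidal scheme from the original Transformer paper, which gives each position a distinct pattern that is added to the token embeddings (an even d_model is assumed).

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: each position gets a unique pattern of
    sines and cosines, giving attention a way to tell token order apart.
    Assumes an even d_model."""
    pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]              # even embedding dimensions
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)                        # even dims get sine
    pe[:, 1::2] = np.cos(angle)                        # odd dims get cosine
    return pe                                          # added to token embeddings before the first block
```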
Interpretability: Studying a model's internal mechanisms or how inputs influence its outputs (e.g., saliency maps, SHAP, attention analysis).
Causal masking: Prevents attention to future tokens during training and inference.
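A small sketch of how this is typically done: build a lower-triangular mask and set disallowed positions to -inf before the softmax.

```python
import numpy as np

def causal_mask(seq_len):
    """True where attention is allowed: position i may look at positions <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_softmax(scores, mask):
    """Masked positions get -inf, so they receive exactly zero attention weight."""
    scores = np.where(mask, scores, -np.inf)
    e = np.exp(scores - scores.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

scores = np.random.default_rng(0).normal(size=(4, 4))   # raw query-key scores
weights = masked_softmax(scores, causal_mask(4))         # upper triangle is all zeros
```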
KV cache: Stores the key and value projections of past tokens so they are not recomputed, speeding up autoregressive decoding.
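A toy sketch of the idea: cache each new token's key and value projections so later decoding steps only project the newest token (the class and parameter names are made up for the example).

```python
import numpy as np

class KVCache:
    """Append-only cache of key/value projections for autoregressive decoding."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, x_new, Wq, Wk, Wv):
        """Process one new token: project it once, cache its key/value,
        and attend against everything cached so far."""
        q = x_new @ Wq
        self.keys.append(x_new @ Wk)              # cached, never recomputed
        self.values.append(x_new @ Wv)
        K, V = np.stack(self.keys), np.stack(self.values)
        scores = K @ q / np.sqrt(K.shape[-1])
        w = np.exp(scores - scores.max())
        return (w / w.sum()) @ V                  # attention output for the new token

rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
cache = KVCache()
for _ in range(3):                                # three decoding steps
    out = cache.step(rng.normal(size=8), Wq, Wk, Wv)
```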
Attention: Mechanism that computes context-aware mixtures of representations; parallelizes well and captures long-range dependencies.
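The canonical form of such a mechanism is scaled dot-product attention (as defined in the original Transformer paper), where the value vectors V are mixed with weights derived from query/key similarity:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
```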
Multi-head attention: Allows the model to attend to information from different representation subspaces simultaneously.
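A compact NumPy sketch: split the model dimension into heads, attend independently in each subspace, then concatenate and project; the shapes and parameter names are assumptions for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """Multi-head attention: run scaled dot-product attention independently in
    n_heads subspaces so the heads can specialize, then recombine the results."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # reshape to (n_heads, seq_len, d_head): one attention per subspace
    split = lambda t: t.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)      # (n_heads, seq, seq)
    heads = softmax(scores) @ v                              # (n_heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo                                       # mix the heads back together

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))
Wq, Wk, Wv, Wo = (rng.normal(size=(16, 16)) for _ in range(4))
out = multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads=4)     # shape (5, 16)
```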