Results for "tokens/sec"


25 results

Masked Language Model Intermediate

Predicts masked tokens in a sequence, enabling bidirectional context; often used for embeddings rather than generation.

Foundations & Theory
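A minimal sketch of the masking step behind this objective, in plain Python. The 15% mask rate and `[MASK]` symbol follow BERT-style conventions; `mask_tokens` is an illustrative helper, not any library's API:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", rng=random.Random(0)):
    """Replace a random subset of tokens with a mask symbol.

    The model's training objective is to recover the originals from
    bidirectional context. Illustrative sketch only.
    """
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok  # the label the model must predict at position i
        else:
            masked.append(tok)
    return masked, targets

masked, targets = mask_tokens(["the", "cat", "sat", "on", "the", "mat"])
print(masked, targets)
```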
Canary Tokens Intermediate

Unique planted strings or records used to detect unauthorized model outputs or training-data leaks.

AI Economics & Strategy
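The detection side can be as simple as scanning output for planted markers (a sketch; real canary schemes use unique, unguessable strings and `contains_canary` is a hypothetical helper):

```python
def contains_canary(output_text, canaries):
    """Return any planted canary strings reproduced in the output.

    A non-empty result flags a likely data leak. Toy detector: real
    systems also log where and when each canary was planted.
    """
    return [c for c in canaries if c in output_text]

hits = contains_canary("user id: XYZZY-123, status ok", ["XYZZY-123", "PLUGH-9"])
print(hits)  # ['XYZZY-123']
```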
Language Model Intermediate

A model that assigns probabilities to sequences of tokens; often trained by next-token prediction.

Large Language Models
Top-k Intermediate

Samples from the k highest-probability tokens to limit unlikely outputs.

Foundations & Theory
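A minimal sketch of top-k sampling in plain Python (`top_k_sample` is an illustrative helper operating on a probability list, not a specific library's API):

```python
import random

def top_k_sample(probs, k, rng=random.Random(0)):
    """Sample a token id from the k highest-probability entries.

    `probs` is a list of probabilities indexed by token id.
    """
    # Keep the k most probable token ids, renormalize, then sample.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    r = rng.random() * total
    cum = 0.0
    for i in top:
        cum += probs[i]
        if r <= cum:
            return i
    return top[-1]

probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_sample(probs, k=2))  # only ids 0 or 1 are possible
```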
Top-p Intermediate

Samples from the smallest set of tokens whose probabilities sum to p, adapting set size by context.

Foundations & Theory
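A sketch of nucleus (top-p) sampling under the same assumptions as above (`top_p_sample` is illustrative, not a library function):

```python
import random

def top_p_sample(probs, p, rng=random.Random(0)):
    """Sample from the smallest prefix of tokens (by descending
    probability) whose probabilities sum to at least p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= p:
            break  # nucleus size adapts to how peaked the distribution is
    r = rng.random() * cum
    acc = 0.0
    for i in nucleus:
        acc += probs[i]
        if r <= acc:
            return i
    return nucleus[-1]

# A peaked distribution yields a tiny nucleus; a flat one, a larger nucleus.
print(top_p_sample([0.9, 0.05, 0.03, 0.02], p=0.9))  # nucleus = {0}, so 0
```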
Positional Encoding Intermediate

Injects sequence order into Transformers, since attention alone is permutation-invariant.

Foundations & Theory
Next-Token Prediction Intermediate

Training objective where the model predicts the next token given previous tokens (causal modeling).

Foundations & Theory
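The objective can be written as the average negative log-likelihood of each token given its prefix. A sketch, where `prob_model` is a hypothetical stand-in for a trained model:

```python
import math

def next_token_loss(token_ids, prob_model):
    """Average negative log-likelihood of each token given its prefix.

    `prob_model(prefix) -> {token: prob}` stands in for a trained
    language model.
    """
    nll = 0.0
    for t in range(1, len(token_ids)):
        p = prob_model(token_ids[:t]).get(token_ids[t], 1e-12)
        nll -= math.log(p)
    return nll / (len(token_ids) - 1)

# Toy "model": always predicts a uniform distribution over 4 tokens.
uniform = lambda prefix: {tok: 0.25 for tok in range(4)}
print(round(next_token_loss([0, 1, 2, 3], uniform), 4))  # ln(4) ≈ 1.3863
```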
Autoregressive Model Intermediate

Generates sequences one token at a time, conditioning on past tokens.

Foundations & Theory
Context Window Intermediate

Maximum number of tokens the model can attend to in one forward pass; constrains long-document reasoning.

Transformers & LLMs
Throughput Intermediate

How many requests or tokens can be processed per unit time; affects scalability and cost.

Transformers & LLMs
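Throughput in tokens/sec can be measured with a simple wall-clock timer (a sketch; `generate_step` is a hypothetical stand-in for a model's per-token decode step):

```python
import time

def tokens_per_second(generate_step, n_tokens):
    """Measure decoding throughput: tokens emitted per wall-clock second."""
    start = time.perf_counter()
    for _ in range(n_tokens):
        generate_step()  # produce one token
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

# Dummy step that just burns a tiny, fixed amount of work.
rate = tokens_per_second(lambda: sum(range(100)), n_tokens=1000)
print(f"{rate:.0f} tokens/sec")
```

Real serving systems also track batch-level throughput, since batching many requests raises total tokens/sec at some cost in per-request latency.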
Causal Mask Intermediate

Prevents attention to future tokens during training/inference.

AI Economics & Strategy
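The mask is a lower-triangular pattern; a minimal sketch that builds only the boolean structure (in a real model the masked entries are set to -inf before the softmax):

```python
def causal_mask(n):
    """Boolean mask where position i may attend to j only if j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]

for row in causal_mask(4):
    print(["x" if keep else "." for keep in row])
```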
Rotary Positional Embeddings Intermediate

Encodes positional information via rotation in embedding space.

AI Economics & Strategy
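The core idea, sketched in plain Python: rotate consecutive (even, odd) dimension pairs by a position-dependent angle. This is a minimal illustration of the mechanism, not a drop-in for any particular implementation:

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each (even, odd) dimension pair of `vec` by an angle
    proportional to the token position `pos`, with geometrically
    spaced frequencies across pairs."""
    out = list(vec)
    d = len(vec)
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out[i] = x * c - y * s
        out[i + 1] = x * s + y * c
    return out

# Position 0 leaves the vector unchanged; other positions rotate it.
print(rope([1.0, 0.0, 1.0, 0.0], pos=0))  # [1.0, 0.0, 1.0, 0.0]
```

Because rotation preserves vector norms and the angle difference between two positions depends only on their relative offset, dot products between rotated queries and keys encode relative position.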
Absolute Positional Encoding Intermediate

Encodes token position explicitly, often via sinusoids.

AI Economics & Strategy
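The classic sinusoidal scheme from the original Transformer, sketched in plain Python: sine on even dimensions, cosine on odd dimensions, with geometrically spaced frequencies:

```python
import math

def sinusoidal_encoding(pos, d_model, base=10000.0):
    """Sinusoidal position vector for a single position `pos`."""
    enc = []
    for i in range(0, d_model, 2):
        freq = base ** (-i / d_model)  # frequency shrinks for later dims
        enc.append(math.sin(pos * freq))
        enc.append(math.cos(pos * freq))
    return enc[:d_model]

print([round(x, 3) for x in sinusoidal_encoding(pos=1, d_model=4)])
```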
Sparse Attention Intermediate

Attention variants that attend only to a subset of positions, reducing the quadratic cost of full attention.

AI Economics & Strategy
Attention Intermediate

Mechanism that computes context-aware mixtures of representations; parallelizes well across a sequence and captures long-range dependencies.

Transformers & LLMs
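Scaled dot-product attention for a single query, in plain Python (a minimal sketch; real implementations batch queries and add masking):

```python
import math

def attention(query, keys, values):
    """Softmax over scaled query·key scores, then a weighted mix of values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]  # subtract max for stability
    z = sum(weights)
    weights = [w / z for w in weights]
    # Each output dim is a convex combination of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# The query aligns with the first key, so the first value dominates the mix.
out = attention([1.0, 0.0], keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[10.0], [0.0]])
print([round(x, 3) for x in out])
```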
Tokenization Intermediate

Converting text into discrete units (tokens) for modeling; subword tokenizers balance vocabulary size and coverage.

Foundations & Theory
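A toy greedy longest-match tokenizer over a fixed vocabulary illustrates the subword idea; real subword tokenizers (BPE, WordPiece) learn their vocabularies from data, so this is a sketch only:

```python
def greedy_tokenize(text, vocab):
    """Split `text` into the longest vocabulary pieces, left to right."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to itself
            i += 1
    return tokens

vocab = {"token", "ization", "t", "o", "k", "e", "n"}
print(greedy_tokenize("tokenization", vocab))  # ['token', 'ization']
```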
Vocabulary Intermediate

The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.

Transformers & LLMs
Sampling Intermediate

Stochastic generation strategies that trade determinism for diversity; key knobs include temperature and nucleus sampling.

Foundations & Theory
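The temperature knob can be sketched as a softmax over scaled logits: T < 1 sharpens the distribution toward the argmax, T > 1 flattens it toward uniform:

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for stability
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.0]
print([round(p, 3) for p in apply_temperature(logits, 1.0)])
print([round(p, 3) for p in apply_temperature(logits, 0.5)])  # more peaked
```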
Context Compression Intermediate

Techniques that condense prompts or retrieved context so longer documents fit the context window at lower cost.

AI Economics & Strategy
Token Budgeting Intermediate

Allocating and capping the tokens spent per request or per user to control inference cost and latency.

AI Economics & Strategy
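A common budgeting tactic, sketched below, is evicting the oldest conversation turns until the total fits. `count_tokens` is a hypothetical counter, defaulting here to character length purely for illustration:

```python
def fit_to_budget(messages, max_tokens, count_tokens=len):
    """Drop the oldest messages until the total token count fits the budget."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # evict the oldest message first
    return kept

history = ["hello there", "how are you", "fine thanks"]
print(fit_to_budget(history, max_tokens=25))  # ['how are you', 'fine thanks']
```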
Transformer Intermediate

Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.

Transformers & LLMs
Beam Search Intermediate

Search algorithm for generation that keeps top-k partial sequences; can improve likelihood but reduce diversity.

Foundations & Theory
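A toy beam search over a next-token distribution, in plain Python. `step_probs` is a hypothetical stand-in for a model's next-token probabilities:

```python
import math

def beam_search(step_probs, start, length, beam_width):
    """Keep the `beam_width` highest-scoring partial sequences each step.

    Scores are summed log-probabilities, so longer sequences compare
    fairly without underflow.
    """
    beams = [([start], 0.0)]  # (sequence, log-probability)
    for _ in range(length):
        candidates = []
        for seq, score in beams:
            for tok, p in step_probs(seq).items():
                candidates.append((seq + [tok], score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # prune to the best beam_width
    return beams[0][0]

# Toy model: "a" is usually followed by "b", and "b" by "a".
model = lambda seq: ({"a": 0.3, "b": 0.7} if seq[-1] == "a"
                     else {"a": 0.7, "b": 0.3})
print(beam_search(model, start="a", length=3, beam_width=2))
# ['a', 'b', 'a', 'b']
```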
Key-Value Cache Intermediate

Stores past attention states to speed up autoregressive decoding.

AI Economics & Strategy
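The caching pattern, reduced to its skeleton: each decode step computes keys and values for the new token only and appends them, instead of recomputing the whole prefix. `new_token_kv` is a hypothetical stand-in for a model's per-token key/value projections:

```python
def decode_with_cache(new_token_kv, steps):
    """Append each step's key/value to a growing cache."""
    k_cache, v_cache = [], []
    for step in range(steps):
        k, v = new_token_kv(step)  # compute K,V for the new token only
        k_cache.append(k)
        v_cache.append(v)
        # Attention at this step reads the full cached history.
    return k_cache, v_cache

ks, vs = decode_with_cache(lambda s: (s, s * 10), steps=3)
print(ks, vs)  # [0, 1, 2] [0, 10, 20]
```

The trade-off: cache memory grows linearly with sequence length (per layer and head), which is why long contexts are memory-bound at inference time.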
Vision Transformer Intermediate

Transformer applied to sequences of image patches treated as tokens.

Computer Vision
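The patch-extraction step can be sketched on a toy 2D grid (single channel, plain lists; real ViTs then linearly project each flattened patch into an embedding):

```python
def patchify(image, patch):
    """Split a 2D grid into non-overlapping patch x patch blocks,
    each flattened row-major — the 'tokens' a ViT consumes."""
    h, w = len(image), len(image[0])
    patches = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            patches.append([image[r + dr][c + dc]
                            for dr in range(patch) for dc in range(patch)])
    return patches

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
print(patchify(img, patch=2))  # 4 patches of 4 values each
```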
Inference Cost Intermediate

The compute and serving cost of running models in production; scales with model size and tokens processed.

AI Economics & Strategy
