Results for "inference acceleration"
Kinematics: Study of motion without reference to the forces that cause it.
Private inference: Methods that protect the model and data during inference (e.g., trusted execution environments) from operators or attackers.
Inference pipeline: The model execution path in production, from input preprocessing to prediction output.
Inference cost: The cost of running models in production.
Causal inference: Framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.
Online inference: Low-latency prediction served per request in real time.
Batch inference: Running predictions over large datasets periodically rather than per request.
Training-serving skew: Differences between training and inference conditions that degrade production performance.
Active inference: Framework in which an agent acts to minimize surprise, formalized as free energy.
Compute: Hardware resources used for training and inference; constrained by memory bandwidth, FLOPs, and parallelism.
On-device inference: Running models locally on the end user's hardware rather than on remote servers.
Latency: Time from request to response; critical for real-time inference and user experience.
Confounder: A hidden variable that influences both cause and effect, biasing naive estimates of causal impact.
Quantization: Reducing the numeric precision of weights and activations to speed up inference and reduce memory, with acceptable accuracy loss.
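As an illustration of the entry above, a minimal symmetric int8 scheme, assuming NumPy (a toy per-tensor sketch; function names are hypothetical, not any library's API):

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor scheme: one scale maps floats onto [-127, 127].
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; per-element error is at most scale / 2.
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.0, 0.25, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```

Real deployments typically use per-channel scales and calibration data; the idea (float -> integer grid -> approximate float) is the same.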
Bayesian inference: Updating beliefs about parameters by combining observed evidence with prior distributions.
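The update has a closed form in conjugate cases; a sketch for a Beta prior over a Bernoulli parameter (names illustrative):

```python
def beta_bernoulli_update(alpha, beta, observations):
    # Conjugate Bayesian update: a Beta(alpha, beta) prior over a coin's
    # bias, combined with 0/1 observations, stays a Beta distribution.
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures

# Uniform prior Beta(1, 1), then observe three heads and one tail.
a, b = beta_bernoulli_update(1, 1, [1, 1, 1, 0])
posterior_mean = a / (a + b)  # 4 / 6
```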
Causal mask: Prevents attention to future tokens during training and autoregressive inference.
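A minimal sketch of such a mask, assuming NumPy (illustrative, not tied to any framework):

```python
import numpy as np

def causal_mask(seq_len):
    # Lower-triangular boolean mask: position i may attend to j <= i only.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_softmax(scores, mask):
    # Disallowed positions get -inf before softmax, so their weight is 0.
    scores = np.where(mask, scores, -np.inf)
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

T = 4
weights = masked_softmax(np.zeros((T, T)), causal_mask(T))
```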
Variational autoencoder (VAE): An autoencoder with probabilistic latent variables and KL regularization.
Instrumental variable: A variable that enables causal inference despite confounding by affecting the outcome only through the treatment.
Likelihood: The probability of the observed data given the model parameters.
Rate limiting: Capping inference usage per client to protect capacity and control cost.
Tokenization: Converting text into discrete units (tokens) for modeling; subword tokenizers balance vocabulary size and coverage.
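A toy sketch of greedy longest-match subword tokenization (illustrative only; real tokenizers learn their vocabularies, e.g. via BPE or unigram models):

```python
def greedy_tokenize(text, vocab):
    # Greedily take the longest vocabulary piece at each position.
    tokens, i = [], 0
    while i < len(text):
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab:
                tokens.append(piece)
                i = end
                break
        else:
            tokens.append("<unk>")  # no vocab piece matched this character
            i += 1
    return tokens

vocab = {"token", "ize", "r", "s"}
tokens = greedy_tokenize("tokenizers", vocab)  # ["token", "ize", "r", "s"]
```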
A/B testing: A controlled experiment comparing variants by random assignment to estimate the causal effect of changes.
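One common way to compare variants is a pooled two-proportion z-test; a sketch with illustrative numbers (names are not any library's API):

```python
import math

def two_proportion_z(conversions_a, n_a, conversions_b, n_b):
    # Pooled two-proportion z-statistic; |z| > 1.96 is significant at ~5%.
    p_a, p_b = conversions_a / n_a, conversions_b / n_b
    pooled = (conversions_a + conversions_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical experiment: 10.0% vs 13.0% conversion over 1000 users each.
z = two_proportion_z(100, 1000, 130, 1000)
```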
Active learning: Selecting the most informative samples to label (e.g., uncertainty sampling) to reduce labeling cost.
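Uncertainty sampling can be sketched as least-confidence selection (function names are hypothetical):

```python
def least_confidence(probs):
    # Uncertainty score: 1 - max class probability; higher = more uncertain.
    return [1.0 - max(p) for p in probs]

def select_to_label(probs, k):
    # Pick the k unlabeled examples the model is least confident about.
    scores = least_confidence(probs)
    return sorted(range(len(probs)), key=lambda i: scores[i], reverse=True)[:k]

pool = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30]]
chosen = select_to_label(pool, 1)  # [1]: the near-50/50 prediction
```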
Observability: The ability to infer a system's internal state from its telemetry; crucial for operating AI services and agents.
Throughput: How many requests or tokens can be processed per unit time; affects scalability and cost.
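Throughput and latency are linked by Little's law (L = lambda * W): sustained in-flight requests equal arrival rate times time in system. A quick capacity sketch (numbers are hypothetical):

```python
def required_concurrency(throughput_rps, latency_s):
    # Little's law: in-flight requests = arrival rate x time in system.
    return throughput_rps * latency_s

# A hypothetical service at 200 requests/s with 150 ms average latency
# must sustain about 30 requests in flight.
in_flight = required_concurrency(200, 0.150)
```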
Membership inference attack: An attack that infers whether specific records were in the training data; related attacks reconstruct sensitive training examples.
Human-in-the-loop: System design in which humans validate or guide model outputs, especially for high-stakes decisions.
KL divergence: Measures how one probability distribution diverges from another; asymmetric and non-negative.
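A direct sketch of the discrete formula D_KL(P||Q) = sum_i p_i * log(p_i / q_i):

```python
import math

def kl_divergence(p, q):
    # D_KL(P || Q); terms with p_i = 0 contribute 0 by convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
# kl_divergence(p, q) and kl_divergence(q, p) differ: the measure is asymmetric.
```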
Maximum likelihood estimation (MLE): Estimating parameters by maximizing the likelihood of the observed data.
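For a Bernoulli model the maximizer has a closed form, the sample mean; a sketch that also checks it against the log-likelihood (names illustrative):

```python
import math

def bernoulli_log_likelihood(theta, samples):
    # log P(data | theta) for i.i.d. 0/1 samples.
    return sum(math.log(theta if s else 1 - theta) for s in samples)

def bernoulli_mle(samples):
    # The MLE for a Bernoulli parameter is the fraction of successes.
    return sum(samples) / len(samples)

data = [1, 0, 1, 1]
theta_hat = bernoulli_mle(data)  # 0.75
```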
KV cache: Stores past attention keys and values so autoregressive decoding does not recompute them at every step.
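A minimal single-head sketch of the idea, assuming NumPy (class and method names are illustrative, not any framework's API):

```python
import numpy as np

class KVCache:
    # Append-only store of past keys/values for one attention head, so each
    # decode step attends over all earlier positions without recomputing them.
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        # Cache this position's key/value, then attend with the new query.
        self.keys.append(k)
        self.values.append(v)
        K = np.stack(self.keys)           # (t, d)
        V = np.stack(self.values)         # (t, d)
        scores = K @ q / np.sqrt(len(q))  # (t,)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ V                      # (d,)

cache = KVCache()
rng = np.random.default_rng(0)
for _ in range(3):
    q, k, v = rng.normal(size=(3, 4))
    out = cache.step(q, k, v)
```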