Results for "long context"
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
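A minimal, illustrative sketch of single-head scaled dot-product self-attention in pure Python (the projection matrices `wq`, `wk`, `wv` and helper `matmul` are toy assumptions, not any library's API):

```python
import math

def self_attention(x, wq, wk, wv):
    """Toy single-head self-attention: queries, keys, and values are all
    projected from the same sequence x (a list of feature vectors)."""
    def matmul(a, b):  # (n x d) @ (d x m) on plain lists
        return [[sum(r[k] * b[k][j] for k in range(len(b)))
                 for j in range(len(b[0]))] for r in a]
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d = len(q[0])
    out = []
    for qi in q:
        # scaled dot-product scores of this query against every key
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]  # softmax attention weights
        # output is the attention-weighted average of the value vectors
        out.append([sum(w * vj[j] for w, vj in zip(weights, v))
                    for j in range(len(v[0]))])
    return out
```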
Injects sequence order into Transformers, since attention alone is permutation-invariant.
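One common instance is the sinusoidal encoding from the original Transformer paper; a small sketch:

```python
import math

def sinusoidal_pe(position, d_model):
    """Sinusoidal positional encoding: even dims use sin, odd dims use cos,
    with geometrically spaced wavelengths across dimensions."""
    pe = []
    for i in range(d_model):
        angle = position / (10000 ** (2 * (i // 2) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe
```

Adding this vector to each token embedding lets attention distinguish positions despite being permutation-invariant.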
The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.
The text (and possibly other modalities) given to an LLM to condition its output behavior.
Letting an LLM call external functions/APIs to fetch data, compute, or take actions, improving reliability.
Retrieval based on embedding similarity rather than keyword overlap, capturing paraphrases and related concepts.
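A toy sketch of similarity-based retrieval with cosine distance over precomputed embedding vectors (the vectors here are made-up stand-ins for real embeddings):

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most similar to the query,
    ranked by embedding similarity rather than keyword overlap."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```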
Model-generated content that is fluent but unsupported by evidence or incorrect; mitigated by grounding and verification.
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.
Feature attribution method grounded in cooperative game theory for explaining individual predictions, most often applied to tabular data.
Framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.
Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Systematic review of model/data processes to ensure performance, fairness, security, and policy compliance.
Central system to store model versions, metadata, approvals, and deployment state.
The ability to infer a system's internal state from its external telemetry (logs, metrics, traces); crucial for AI services and agents.
Time from request to response; critical for real-time inference and UX.
Stochastic generation strategies that trade determinism for diversity; key knobs include temperature and nucleus sampling.
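A compact sketch showing how both knobs interact in one sampler (logit values and parameter defaults are illustrative, not any framework's API):

```python
import math, random

def sample(logits, temperature=1.0, top_p=1.0, rng=random):
    """Temperature scales logits before softmax; nucleus (top-p) sampling
    then keeps the smallest set of tokens whose cumulative mass >= top_p."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # keep the most probable tokens until top_p mass is covered
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # renormalize over the nucleus and draw one token
    total = sum(probs[i] for i in kept)
    r = rng.random() * total
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Lower temperature and lower top-p both push toward determinism; raising either increases diversity.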
Exponential of average negative log-likelihood; lower means better predictive fit, not necessarily better utility.
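The definition above translates directly into code; a small sketch given per-token probabilities the model assigned to the observed sequence:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(average negative log-likelihood) over the tokens.
    A uniform 1/N guess on every token yields perplexity exactly N."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)
```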
A dataset + metric suite for comparing models; can be gamed or misaligned with real-world goals.
Attacks that manipulate model instructions (especially via retrieved content) to override system goals or exfiltrate data.
Methods to protect model/data during inference (e.g., trusted execution environments) from operators/attackers.
Coordinating tools, models, and steps (retrieval, calls, validation) to deliver reliable end-to-end behavior.
Methods for breaking goals into steps; can be classical (A*, STRIPS) or LLM-driven with tool calls.
Forcing predictable formats for downstream systems; reduces parsing errors and supports validation/guardrails.
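A minimal validation sketch using only the standard library (`parse_structured` and its key-set check are illustrative assumptions, not a specific framework):

```python
import json

def parse_structured(raw, required_keys):
    """Validate that a model's raw output is a JSON object containing the
    expected keys; return None instead of passing bad output downstream."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict) or not required_keys.issubset(obj):
        return None
    return obj
```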
Systematic error introduced by simplifying assumptions in a learning algorithm.
Error due to sensitivity to fluctuations in the training dataset.
A measure of randomness or uncertainty in a probability distribution.
Measures how one probability distribution diverges from another.
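Both quantities above have one-line implementations for discrete distributions; a sketch (in nats):

```python
import math

def entropy(p):
    """Shannon entropy in nats; 0 for a deterministic distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """KL(p || q): extra nats needed to encode samples from p with a code
    optimized for q. Asymmetric, and zero only when p == q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```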
Bayesian parameter estimation using the mode of the posterior distribution.
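For a Bernoulli parameter with a Beta prior this mode has a closed form; a sketch (the default Beta(2, 2) prior is an illustrative choice):

```python
def map_bernoulli(successes, trials, alpha=2.0, beta=2.0):
    """MAP estimate of a Bernoulli parameter under a Beta(alpha, beta) prior:
    the mode of the Beta(alpha + successes, beta + failures) posterior."""
    return (successes + alpha - 1) / (trials + alpha + beta - 2)
```

With a uniform Beta(1, 1) prior the MAP estimate reduces to the maximum-likelihood estimate `successes / trials`.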
A narrow, high-curvature minimum of the loss landscape, often associated with poorer generalization than flat minima.