Results for "reliability"
Tool use: Letting an LLM call external functions or APIs to fetch data, compute, or take actions; offloading these steps to deterministic tools improves reliability.
Grounding: Constraining outputs to retrieved or provided sources, often with citations, to improve factual reliability.
Inter-annotator agreement: A measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.
Safety-critical systems: Systems where failure can cause physical harm.
Distribution shift: A mismatch between training and deployment data distributions that can degrade model performance.
Precision: Of predicted positives, the fraction that are truly positive; sensitive to false positives.
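Precision follows directly from confusion-matrix counts; a minimal sketch (the function name and counts are illustrative):

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP): of predicted positives, the fraction truly positive."""
    return tp / (tp + fp)

# 8 true positives and 2 false positives give a precision of 0.8.
```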
Calibration: The degree to which predicted probabilities match observed frequencies (e.g., predictions of 0.8 are correct about 80% of the time).
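One simple way to check calibration is to compare average confidence with observed accuracy inside a confidence bucket; the function name, bucket bounds, and data below are illustrative:

```python
def bucket_gap(probs, outcomes, lo=0.7, hi=0.9):
    """Average confidence minus observed accuracy for predictions whose
    confidence falls in [lo, hi); near zero for a well-calibrated model."""
    bucket = [(p, y) for p, y in zip(probs, outcomes) if lo <= p < hi]
    avg_conf = sum(p for p, _ in bucket) / len(bucket)
    accuracy = sum(y for _, y in bucket) / len(bucket)
    return avg_conf - accuracy
```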
Brier score: A proper scoring rule measuring the squared error of predicted probabilities for binary outcomes.
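The Brier score is the mean squared difference between predicted probabilities and the 0/1 outcomes; a minimal sketch:

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and binary outcomes (0 or 1).
    Lower is better; 0.0 means perfectly confident, correct predictions."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# Confident and correct -> 0.0; a 0.5 prediction scores 0.25 regardless of outcome.
```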
Retrieval-augmented generation (RAG): An architecture that retrieves relevant documents (e.g., from a vector database) and conditions generation on them to reduce hallucinations.
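A toy sketch of the retrieval step, assuming documents carry precomputed embeddings (the field names and vectors are illustrative; real systems use an embedding model and a vector index):

```python
import math

def retrieve(query_vec, docs, k=2):
    """Rank documents by cosine similarity to the query embedding and return
    the top-k; these are then prepended to the generation prompt as context."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
    return sorted(docs, key=lambda d: cos(query_vec, d["embedding"]), reverse=True)[:k]
```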
CI/CD for ML: Automated testing and deployment of models and data workflows, extending DevOps practices to ML artifacts.
Hallucination: Model-generated content that is fluent but unsupported by evidence or factually incorrect; mitigated by grounding and verification.
Observability: The ability to infer a system's internal state from its telemetry, crucial for AI services and agents.
Model auditing: Systematic review of model and data processes to ensure performance, fairness, security, and policy compliance.
Evaluation harness: A system for running consistent evaluations across tasks, versions, prompts, and model settings.
Red teaming: Stress-testing models for failures, vulnerabilities, policy violations, and harmful behaviors before release.
Data poisoning: Maliciously inserting or altering training data to implant backdoors or degrade performance.
Human-in-the-loop: System design in which humans validate or guide model outputs, especially for high-stakes decisions.
Automation bias: The tendency to trust automated suggestions even when they are incorrect; mitigated by UI design, training, and independent checks.
Structured output: Forcing model responses into predictable formats (e.g., JSON) for downstream systems; reduces parsing errors and supports validation and guardrails.
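A minimal sketch of the validation side, assuming JSON output; the function name and required keys are illustrative:

```python
import json

def parse_model_output(raw, required=("answer", "confidence")):
    """Validate that a model response is JSON containing the expected keys,
    rejecting malformed output before downstream code consumes it."""
    obj = json.loads(raw)  # raises json.JSONDecodeError on non-JSON output
    missing = [k for k in required if k not in obj]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return obj
```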
KL divergence: Measures the divergence between the true and predicted probability distributions.
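For discrete distributions this is KL(p || q) = Σᵢ pᵢ log(pᵢ/qᵢ), which is zero when the distributions match; a minimal sketch (natural log; zero-probability terms contribute nothing by convention):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); nonnegative, zero iff p == q.
    Terms with p_i == 0 are skipped, following the 0 * log 0 = 0 convention."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```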
Bias: Systematic error introduced by simplifying assumptions in a learning algorithm.
Self-critique: Models evaluating and iteratively improving their own outputs.
Human oversight: Required human review for high-risk decisions.
Blue-green deployment: Maintaining two production environments so traffic can be switched back for instant rollback.
Feature store: A centralized repository for curated, reusable features.
Standard deviation: A measure of spread around the mean.
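Concretely, it is the square root of the mean squared deviation from the mean; a minimal sketch of the population version:

```python
import math

def std_dev(xs):
    """Population standard deviation: root of the mean squared deviation from the mean."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

# [2, 4, 4, 4, 5, 5, 7, 9] has mean 5 and standard deviation 2.0.
```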
Alignment robustness: Maintaining aligned behavior under new or shifted conditions.
Self-consistency: Sampling multiple outputs and selecting the consensus answer.
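The selection step reduces to a majority vote over the sampled answers; a minimal sketch (the sampling itself is assumed to happen upstream):

```python
from collections import Counter

def self_consistency(answers):
    """Majority vote over independently sampled answers to the same prompt."""
    return Counter(answers).most_common(1)[0][0]

# Three samples of "42" against one each of "41" and "43" select "42".
```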
Overgeneralization: Applying learned patterns to cases where they do not hold.
Model collapse: Degradation in quality when a model is trained on its own generated outputs.