Results for "metrics"

AdvertisementAd space — search-top

74 results

Confusion Matrix Intermediate

A table summarizing classification outcomes, foundational for metrics like precision, recall, specificity.

Foundations & Theory
Class Imbalance Intermediate

When some classes are rare, requiring reweighting, resampling, or specialized metrics.

Machine Learning
ROC Curve Intermediate

Plots true positive rate vs false positive rate across thresholds; summarizes separability.

Foundations & Theory
Model Card Intermediate

Standardized documentation describing intended use, performance, limitations, data, and ethical considerations.

Foundations & Theory
Audit Intermediate

Systematic review of model/data processes to ensure performance, fairness, security, and policy compliance.

Governance & Ethics
Benchmark Intermediate

A dataset + metric suite for comparing models; can be gamed or misaligned with real-world goals.

Evaluation & Benchmarking
Monitoring Intermediate

Observing model inputs/outputs, latency, cost, and quality over time to catch regressions and drift.

MLOps & Infrastructure
Specification Gaming Advanced

Model exploits poorly specified objectives.

AI Safety & Alignment
Model Documentation Intermediate

Required descriptions of model behavior and limits.

Governance & Ethics
Generalization Intermediate

How well a model performs on new data drawn from the same (or similar) distribution as training.

Foundations & Theory
Train/Validation/Test Split Intermediate

Separating data into training (fit), validation (tune), and test (final estimate) to avoid leakage and optimism bias.

Evaluation & Benchmarking
Accuracy Intermediate

Fraction of correct predictions; can be misleading on imbalanced datasets.

Foundations & Theory
Data Leakage Intermediate

When information from evaluation data improperly influences training, inflating reported performance.

Foundations & Theory
Calibration Intermediate

The degree to which predicted probabilities match true frequencies (e.g., 0.8 means ~80% correct).

Foundations & Theory
Prompt Engineering Intermediate

Crafting prompts to elicit desired behavior, often using role, structure, constraints, and examples.

Prompting & Instructions
Chain-of-Thought Intermediate

Stepwise reasoning patterns that can improve multi-step tasks; often handled implicitly or summarized for safety/privacy.

Foundations & Theory
Semantic Search Intermediate

Retrieval based on embedding similarity rather than keyword overlap, capturing paraphrases and related concepts.

Foundations & Theory
Safety Filter Intermediate

Automated detection/prevention of disallowed outputs (toxicity, self-harm, illegal instruction, etc.).

Foundations & Theory
Bias Intermediate

Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.

Foundations & Theory
A/B Testing Intermediate

Controlled experiment comparing variants by random assignment to estimate causal effects of changes.

Foundations & Theory
Data Governance Intermediate

Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.

Foundations & Theory
MLOps Intermediate

Practices for operationalizing ML: versioning, CI/CD, monitoring, retraining, and reliable production management.

MLOps & Infrastructure
CI/CD for ML Intermediate

Automated testing and deployment processes for models and data workflows, extending DevOps to ML artifacts.

MLOps & Infrastructure
Model Registry Intermediate

Central system to store model versions, metadata, approvals, and deployment state.

Foundations & Theory
Experiment Tracking Intermediate

Logging hyperparameters, code versions, data snapshots, and results to reproduce and compare experiments.

Evaluation & Benchmarking
Observability Intermediate

A broader capability to infer internal system state from telemetry, crucial for AI services and agents.

Evaluation & Benchmarking
Compute Intermediate

Hardware resources used for training/inference; constrained by memory bandwidth, FLOPs, and parallelism.

Foundations & Theory
Eval Harness Intermediate

System for running consistent evaluations across tasks, versions, prompts, and model settings.

Foundations & Theory
Responsible AI Intermediate

A discipline ensuring AI systems are fair, safe, transparent, privacy-preserving, and accountable throughout lifecycle.

Governance & Ethics
Emergent Abilities Intermediate

Capabilities that appear only beyond certain model sizes.

AI Economics & Strategy

Welcome to AI Glossary

The free, self-building AI dictionary. Help us keep it free—click an ad once in a while!

Search

Type any question or keyword into the search bar at the top.

Browse

Tap a letter in the A–Z bar to browse terms alphabetically, or filter by domain, industry, or difficulty level.

3D WordGraph

Fly around the interactive 3D graph to explore how AI concepts connect. Click any word to read its full definition.