Results for "data mismatch"
Data mismatch: A difference between training and deployment data distributions that can degrade model performance.
Train/test mismatch: When the environment a model is evaluated or deployed in differs systematically from the one it was trained in.
AI alignment: Ensuring AI systems pursue intended human goals.
Training-serving skew: Differences between training and inference conditions, such as feature pipelines that behave differently offline and online.
Data governance: Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Data lineage: Tracking where data came from and how it was transformed; key for debugging and compliance.
Synthetic data: Artificially created data used to train or test models; helpful for privacy and coverage, risky if unrealistic.
Data augmentation: Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.
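The definition above is easy to make concrete. A minimal sketch of numeric data augmentation by jittering, where `augment`, `noise_scale`, and the Gaussian-noise scheme are illustrative assumptions rather than a standard API:

```python
import random

def augment(features, n_copies=3, noise_scale=0.05, seed=0):
    """Expand a small numeric dataset with jittered copies.

    Illustrative only: the jitter scheme and noise_scale are
    assumptions, not a prescribed recipe.
    """
    rng = random.Random(seed)
    augmented = list(features)  # keep the originals first
    for _ in range(n_copies):
        for row in features:
            # Add small Gaussian noise to every numeric feature.
            augmented.append([x + rng.gauss(0.0, noise_scale) for x in row])
    return augmented

data = [[1.0, 2.0], [3.0, 4.0]]
bigger = augment(data)
print(len(bigger))  # 2 originals + 3 * 2 jittered copies = 8
```

For images the transformations would instead be flips or crops, and for text, paraphrases; the principle of generating label-preserving variants is the same.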
Federated learning: Training across many devices or silos without centralizing raw data; participants share model updates, not raw examples.
Data scaling: Improving model performance primarily by training on more data rather than changing the model.
Encryption in transit and at rest: Protecting data during network transfer and while stored; essential for ML pipelines handling sensitive data.
Data labeling: Human or automated assignment of target values; quality, consistency, and clear guidelines matter heavily.
Data leakage: When information from evaluation data improperly influences training, inflating reported performance.
Data drift: A shift in the input feature distribution over time that can silently degrade a deployed model.
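Drift between two samples can be quantified with a simple two-sample comparison. A sketch using the Population Stability Index in plain Python; the equal-width binning and the 0.25 threshold are common rules of thumb, assumed here rather than taken from any standard:

```python
import math

def psi(expected, actual, n_bins=5):
    """Population Stability Index between two numeric samples.

    Rule of thumb (an assumption, not a standard): < 0.1 stable,
    0.1-0.25 moderate shift, > 0.25 large shift.
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / n_bins or 1.0

    def fractions(sample):
        counts = [0] * n_bins
        for x in sample:
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
        # Small epsilon avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [0.1 * i for i in range(100)]        # roughly uniform on [0, 10)
serve = [0.1 * i + 5.0 for i in range(100)]  # same shape, shifted by 5
print(psi(train, serve) > 0.25)  # True
```

The same check applied per feature between the training set and recent serving traffic is a cheap first alarm for both drift and training-serving skew.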
Data poisoning: Maliciously inserting or altering training data to implant backdoors or degrade performance.
Data protection impact assessment (DPIA): A structured analysis of privacy risks required under GDPR-like laws for high-risk data processing.
Unsupervised learning: Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
Overfitting: When a model fits noise or idiosyncrasies of the training data and performs poorly on unseen data.
Personally identifiable information (PII): Information that can identify an individual, directly or indirectly; requires careful handling and compliance.
Attribute inference attack: Inferring sensitive attributes of individuals in the training data from a trained model.
Self-supervised learning: Learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.
Semi-supervised learning: Training with a small labeled dataset plus a larger unlabeled one, leveraging assumptions such as smoothness or cluster structure.
Online learning: Learning where data arrives sequentially and the model updates continuously, often under changing distributions.
Latent space: The internal space where learned representations live; operations there often correlate with semantic or generative factors.
Feature engineering: Designing input features to expose useful structure (e.g., ratios, lags, aggregations); often crucial outside deep learning.
Scaling laws: Empirical relationships linking model size, data, and compute to performance.
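Lag and ratio features, two of the transformations just mentioned, can be sketched in a few lines; the function name `engineer` and the particular feature set are illustrative assumptions:

```python
def engineer(prices):
    """Derive simple lag and ratio features from a numeric series.

    The choices here (1-step lag, ratio to the previous value)
    are illustrative, not a prescribed feature set.
    """
    rows = []
    for t in range(1, len(prices)):
        rows.append({
            "price": prices[t],
            "lag_1": prices[t - 1],              # previous value
            "ratio": prices[t] / prices[t - 1],  # relative change
        })
    return rows

features = engineer([100.0, 110.0, 99.0])
print(features[0]["ratio"])  # 1.1
```

Exposing relative change directly spares a shallow model from having to learn division, which is the kind of structure this entry refers to.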
Gradient inversion attack: Reconstructing training examples from gradients shared during distributed or federated training.
Denoising diffusion model: A diffusion model trained to remove noise step by step, mapping pure noise back to data.
Diffusion model: A generative model that learns to reverse a gradual noising process.
Latent diffusion: Diffusion performed in a compressed latent space rather than pixel space, for efficiency.