Results for "data distribution"

75 results

Generalization Intermediate

How well a model performs on new data drawn from the same (or similar) distribution as training.

Foundations & Theory
Data Drift Intermediate

A change in the distribution of input features over time, relative to the data the model was trained on.

MLOps & Infrastructure
Entropy Intermediate

A measure of randomness or uncertainty in a probability distribution.

AI Economics & Strategy
KL Divergence Intermediate

Measures how one probability distribution diverges from another; it is asymmetric, so it is not a true distance.

AI Economics & Strategy
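
The entropy and KL divergence entries above can be illustrated for discrete distributions with a minimal sketch. The function names and the two example distributions are illustrative assumptions, not part of the glossary, and natural logarithms (nats) are an arbitrary choice of unit.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list of probabilities."""
    # Terms with zero probability contribute nothing, by the convention 0 * log 0 = 0.
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """KL(p || q) in nats. Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions over four outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

print("H(uniform)          =", entropy(uniform))
print("H(skewed)           =", entropy(skewed))
print("KL(skewed||uniform) =", kl_divergence(skewed, uniform))
print("KL(uniform||skewed) =", kl_divergence(uniform, skewed))
```

The uniform distribution has the larger entropy, and the two KL values differ, which shows the asymmetry of the divergence.
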
MAP Estimation Intermediate

Bayesian parameter estimation using the mode of the posterior distribution.

AI Economics & Strategy
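
As a rough illustration of MAP estimation, the sketch below assumes a coin-flip (Bernoulli) model with a Beta prior, a setting chosen here only because its posterior mode has a closed form. The function name and the default prior parameters are hypothetical.

```python
def map_bernoulli(heads, tails, alpha=2.0, beta=2.0):
    """MAP estimate of a coin's heads probability under a Beta(alpha, beta) prior.

    The posterior is Beta(alpha + heads, beta + tails); its mode is
    (a - 1) / (a + b - 2), which is valid only when both a > 1 and b > 1.
    """
    a = alpha + heads
    b = beta + tails
    if a <= 1 or b <= 1:
        raise ValueError("posterior mode is not in the interior of (0, 1)")
    return (a - 1) / (a + b - 2)

# 7 heads and 3 tails: the prior pulls the estimate from 0.7 toward 0.5.
print(map_bernoulli(7, 3))                       # Beta(2, 2) prior
print(map_bernoulli(7, 3, alpha=1.0, beta=1.0))  # uniform prior
```

With the uniform Beta(1, 1) prior the MAP estimate coincides with the maximum-likelihood estimate 7/10, while the Beta(2, 2) prior gives 8/12.
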
Factor Graph Intermediate

Graphical model expressing factorization of a probability distribution.

Model Architectures
Expectation Advanced

The probability-weighted average value of a random variable under a distribution.

Probability & Statistics
Central Limit Theorem Advanced

The suitably normalized sum of many independent, identically distributed variables with finite variance converges in distribution to a normal distribution.

Probability & Statistics
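
The central limit theorem entry above can be checked empirically with a small simulation. This is a sketch under assumed settings (Uniform(0, 1) draws, 50 draws per mean, 5000 repetitions, a fixed seed), none of which come from the glossary.

```python
import math
import random

def standardized_mean(n, rng):
    """Return sqrt(n) * (sample mean - mu) / sigma for n Uniform(0, 1) draws."""
    mu, sigma = 0.5, math.sqrt(1.0 / 12.0)  # mean and std of Uniform(0, 1)
    mean = sum(rng.random() for _ in range(n)) / n
    return math.sqrt(n) * (mean - mu) / sigma

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [standardized_mean(50, rng) for _ in range(5000)]

# For a standard normal, about 95% of the mass lies within +/- 1.96.
coverage = sum(abs(z) <= 1.96 for z in samples) / len(samples)
print("fraction within +/-1.96:", coverage)
```

Although each individual draw is uniform rather than bell-shaped, the fraction of standardized means inside the interval should land close to the normal value of 0.95.
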
Importance Sampling Advanced

Estimating expectations under a target distribution by sampling from an easier proposal distribution and reweighting each sample by the density ratio.

Probability & Statistics
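
A minimal sketch of importance sampling follows. It assumes a standard normal target p, a wider normal proposal q with standard deviation 2, and the test function x squared, whose true expectation under p is 1. All names and parameter choices are illustrative.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def importance_estimate(n, rng):
    """Estimate E_p[x^2] for p = N(0, 1) using samples from q = N(0, 2^2)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 2.0)                                  # draw from the proposal q
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)    # weight p(x) / q(x)
        total += w * x * x
    return total / n

rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = importance_estimate(20000, rng)
print("importance-sampling estimate of E[x^2]:", estimate)
```

The proposal is deliberately wider than the target so the weights stay bounded; a proposal with lighter tails than the target can make the estimator's variance very large.
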
Model Release Control Intermediate

Restricting distribution of powerful models.

Governance & Ethics
Posterior Distribution Advanced

Updated belief about a quantity after observing data, obtained by combining the prior with the likelihood.

Probability & Statistics
Prior Distribution Advanced

Belief before observing data.

Probability & Statistics
Unsupervised Learning Intermediate

Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.

Machine Learning
Empirical Risk Minimization Intermediate

Minimizing average loss on training data; can overfit when data is limited or biased.

Optimization
Overfitting Intermediate

When a model fits noise/idiosyncrasies of training data and performs poorly on unseen data.

Foundations & Theory
Federated Learning Intermediate

Training across many devices/silos without centralizing raw data; aggregates updates, not data.

Foundations & Theory
Encryption in Transit/At Rest Intermediate

Protecting data during network transfer and while stored; essential for ML pipelines handling sensitive data.

Security & Privacy
Data Leakage Intermediate

When information from evaluation data improperly influences training, inflating reported performance.

Foundations & Theory
Data Augmentation Intermediate

Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.

Foundations & Theory
Synthetic Data Intermediate

Artificially created data used to train/test models; helpful for privacy and coverage, risky if unrealistic.

Foundations & Theory
Data Governance Intermediate

Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.

Foundations & Theory
Data Lineage Intermediate

Tracking where data came from and how it was transformed; key for debugging and compliance.

Foundations & Theory
Data Poisoning Intermediate

Maliciously inserting or altering training data to implant backdoors or degrade performance.

Foundations & Theory
Data Scaling Intermediate

Improving model performance by increasing the amount of training data.

AI Economics & Strategy
Probability Distribution Advanced

Describes how probability is assigned to the possible outcomes of a random variable.

Probability & Statistics
Distribution Shift Intermediate

A mismatch between the data distribution seen during training and the one encountered at test or deployment time.

Model Failure Modes
Artificial Intelligence Intermediate

The field of building systems that perform tasks associated with human intelligence—perception, reasoning, language, planning, and decision-making—via algori...

Foundations & Theory
Machine Learning Intermediate

A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.

Machine Learning
Supervised Learning Intermediate

Learning a function from input-output pairs (labeled data), optimizing performance on predicting outputs for unseen inputs.

Machine Learning
Self-Supervised Learning Intermediate

Learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.

Machine Learning