Results for "data distribution"
Online learning: learning where data arrives sequentially and the model updates continuously, often under changing distributions.
Meta-learning: methods that learn training procedures or initializations so models can adapt quickly to new tasks with little data.
Distribution shift: a mismatch between training and deployment data distributions that can degrade model performance.
Objective function: a scalar measure optimized during training, typically expected loss over the data, sometimes with regularization terms.
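As a concrete instance, the objective for a one-parameter linear model might be mean squared error plus an L2 penalty; a minimal sketch where the function name, data, and `lam` value are all illustrative:

```python
# Empirical risk for a toy linear model y ≈ w*x + b:
# mean squared error over the data plus an L2 penalty on the weight.
def objective(w, b, xs, ys, lam=0.1):
    mse = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return mse + lam * w ** 2  # the penalty discourages large weights

# A perfect fit still pays the regularization cost on w:
# objective(2.0, 0.0, [1, 2, 3], [2, 4, 6]) = 0 + 0.1 * 2**2 = 0.4
```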
Underfitting: when a model cannot capture the underlying structure, performing poorly on both training and test data.
Bias-variance decomposition: a conceptual framework describing error as the sum of systematic error (bias) and sensitivity to the training data (variance).
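The identity MSE = bias² + variance can be checked numerically for a point estimator; the estimates below are toy values, one per hypothetical resampled training set:

```python
# Numeric check of the decomposition MSE = bias^2 + variance.
truth = 10.0
estimates = [9.0, 11.0, 12.0, 8.0, 10.5]  # toy estimates across datasets

n = len(estimates)
mean_est = sum(estimates) / n
bias = mean_est - truth                                     # systematic error
variance = sum((e - mean_est) ** 2 for e in estimates) / n  # sensitivity to data
mse = sum((e - truth) ** 2 for e in estimates) / n          # total error

assert abs(mse - (bias ** 2 + variance)) < 1e-9  # the identity holds exactly
```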
Train/validation/test split: separating data into training (fit), validation (tune), and test (final estimate) sets to avoid leakage and optimism bias.
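A minimal sketch of such a split using only the standard library; the function name, fractions, and seed are arbitrary choices for illustration:

```python
import random

def split_indices(n, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once, then carve out disjoint train/val/test index sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return (idx[n_test + n_val:],         # train: fit the model
            idx[n_test:n_test + n_val],   # val: tune hyperparameters
            idx[:n_test])                 # test: touch once, at the end

train_idx, val_idx, test_idx = split_indices(100)
# The three sets are disjoint, so no example leaks between stages.
```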
Tool use: letting an LLM call external functions/APIs to fetch data, compute, or take actions, improving reliability.
Fine-tuning: updating a pretrained model’s weights on task-specific data to improve performance or adapt style/behavior.
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
Algorithmic bias: systematic differences in model outcomes across groups; arises from data, labels, and deployment context.
Differential privacy: a formal privacy framework ensuring outputs do not reveal much about any single individual’s data contribution.
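One classic construction satisfying this guarantee is the Laplace mechanism: add noise scaled to the query's sensitivity divided by epsilon. A toy sketch, not production-grade DP; the function name and parameters are illustrative:

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value + Laplace(0, sensitivity/epsilon) noise.
    Larger epsilon = less noise = weaker privacy guarantee."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5  # uniform on (-0.5, 0.5)
    # Inverse-CDF sampling of the Laplace distribution.
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_value + noise

# A counting query has sensitivity 1: one person changes the count by 1.
noisy_count = laplace_mechanism(true_value=42, sensitivity=1, epsilon=1.0,
                                rng=random.Random(0))
```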
Model cards: standardized documentation describing intended use, performance, limitations, data, and ethical considerations.
Model auditing: systematic review of model/data processes to ensure performance, fairness, security, and policy compliance.
CI/CD for ML: automated testing and deployment processes for models and data workflows, extending DevOps to ML artifacts.
Experiment tracking: logging hyperparameters, code versions, data snapshots, and results to reproduce and compare experiments.
Reproducibility: the ability to replicate results given the same code and data; harder in distributed training and with nondeterministic ops.
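At its simplest, reproducibility means fixing every source of randomness. A standard-library sketch of the single-process case; real training pipelines also seed the ML framework and hardware RNGs, where some ops remain nondeterministic:

```python
import random

def run_experiment(seed):
    """With the seed fixed, the random stream -- and hence the
    result -- is identical on every run (single-process case)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]

assert run_experiment(42) == run_experiment(42)  # replicable
assert run_experiment(42) != run_experiment(43)  # the seed matters
```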
Attacks that manipulate model instructions (especially via retrieved content) to override system goals or exfiltrate data.
Membership inference and reconstruction attacks: attacks that infer whether specific records were in the training data, or reconstruct sensitive training examples.
Secure inference: methods to protect the model and data during inference (e.g., trusted execution environments) from operators or attackers.
Maximum likelihood estimation: estimating parameters by maximizing the likelihood of the observed data.
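For a Bernoulli coin, maximizing the log-likelihood recovers the closed-form estimate p̂ = heads/flips; a toy check by grid search (data and grid resolution are arbitrary):

```python
import math

def log_likelihood(p, data):
    """Bernoulli log-likelihood: sum of x*log(p) + (1-x)*log(1-p)."""
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in data)

data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # 7 heads in 10 flips
# Grid search over p in (0, 1); the maximizer matches k/n = 0.7.
best_p = max((i / 1000 for i in range(1, 1000)),
             key=lambda p: log_likelihood(p, data))
```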
Scaling laws: empirical laws linking model size, data, and compute to performance.
Off-policy learning: learning from data generated by a different policy than the one being optimized.
On-policy learning: learning only from the current policy’s data.
Gradient inversion: recovering training data from gradients.
Attribute inference: inferring sensitive features of training data.
Watermark detection: detecting unauthorized model outputs or data leaks.
Graph neural networks: neural networks that operate on graph-structured data by propagating information along edges.
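One round of that propagation can be sketched without any framework: each node averages its neighbors' features and mixes the result with its own. The graph, features, and `mix` weight below are all toy values:

```python
# One round of mean-aggregation message passing on a toy graph.
edges = {0: [1, 2], 1: [0], 2: [0]}  # adjacency list
feats = {0: 1.0, 1: 3.0, 2: 5.0}     # one scalar feature per node

def propagate(feats, edges, mix=0.5):
    out = {}
    for node, nbrs in edges.items():
        agg = sum(feats[n] for n in nbrs) / len(nbrs)  # mean over neighbors
        out[node] = (1 - mix) * feats[node] + mix * agg
    return out

new = propagate(feats, edges)
# Node 0 mixes its own 1.0 with mean(3.0, 5.0) = 4.0, giving 2.5.
```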
Hidden Markov model: a probabilistic model for sequential data with latent states.
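The forward algorithm computes the probability of an observation sequence by summing over all latent state paths; a minimal sketch with two hidden states and two symbols, where every number is an illustrative toy parameter:

```python
def forward(obs, init, trans, emit):
    """Forward algorithm: P(obs sequence) under an HMM, summing over
    latent state paths. States and symbols are integer indices."""
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)

init = [0.6, 0.4]                     # initial state distribution
trans = [[0.7, 0.3], [0.4, 0.6]]      # state transition matrix
emit = [[0.9, 0.1], [0.2, 0.8]]       # emission probabilities
p = forward([0, 1, 0], init, trans, emit)  # a probability in (0, 1)
```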
Generative models: models that learn to generate samples resembling training data.