Search: data distribution

Feature Engineering Intermediate

Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.

Foundations & Theory
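
The three feature types named in this entry (lags, ratios, aggregations) can be sketched in plain Python. The function name, the price/volume columns, and the window length are hypothetical choices made only for illustration.

```python
def engineer_features(prices, volumes, window=3):
    """Build lag, ratio, and rolling-mean features for each time step."""
    rows = []
    for t, (price, volume) in enumerate(zip(prices, volumes)):
        start = max(0, t - window + 1)
        recent = prices[start:t + 1]  # trailing window, shorter at the start
        rows.append({
            "lag_1": prices[t - 1] if t > 0 else None,               # lag
            "price_per_volume": price / volume if volume else None,  # ratio
            "rolling_mean": sum(recent) / len(recent),               # aggregation
        })
    return rows

# Toy values, assumed for the example.
features = engineer_features([10.0, 12.0, 14.0, 16.0], [2.0, 4.0, 7.0, 8.0])
```

Each row uses only values up to time t, which avoids leaking future information into the features.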

Differential Privacy Intermediate

A formal privacy framework ensuring outputs do not reveal much about any single individual’s data contribution.

Security & Privacy
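
One common way to achieve this guarantee for a counting query is the Laplace mechanism. The sketch below is a minimal illustration, not a production implementation: the function names are hypothetical, and it assumes a sensitivity of 1 (adding or removing one person changes the count by at most 1).

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count; smaller epsilon means more noise, more privacy."""
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0  # assumption: one individual affects the count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A single released value is noisy, but the noise has zero mean, so aggregate statistics stay useful while any one record's presence is masked.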

Scaling Laws Intermediate

Empirical laws linking model size, dataset size, and compute to performance.

AI Economics & Strategy
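
Such laws are often approximated as power laws, which become straight lines in log-log space. The sketch below fits a pure power law, loss = a * size^(-b), by least squares on the logs. The functional form (no irreducible-loss term), the function name, and the synthetic data points are all assumptions made for illustration.

```python
import math

def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-b) via linear regression on log-log values."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

# Synthetic points generated from a known law, not measured results.
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * s ** -0.1 for s in sizes]
a, b = fit_power_law(sizes, losses)
```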

Time Series Intermediate

Sequential data indexed by time.

Time Series

Synthetic Sensors Advanced

Artificial sensor data generated in simulation.

Simulation & Sim-to-Real

Hybrid Training Advanced

Combining simulation and real-world data.

Simulation & Sim-to-Real

Machine Learning Intermediate

A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.

Machine Learning

Empirical Risk Minimization Intermediate

Minimizing average loss on training data; can overfit when data is limited or biased.

Optimization
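
As a minimal sketch, consider a one-parameter model y = w * x with squared loss. The empirical risk is the average loss over the training pairs, and for this model its minimizer has a closed form. The function names and the toy data are hypothetical.

```python
def empirical_risk(w, data):
    """Average squared loss of the model y = w * x on the training pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def fit_slope(data):
    """Closed-form minimizer of the empirical risk for y = w * x."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data, assumed
w_hat = fit_slope(train)
```

A low training risk says nothing by itself about unseen data, which is why the fitted model should also be scored on held-out examples.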

Deep Learning Intermediate

A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.

Deep Learning

Train/Validation/Test Split Intermediate

Separating data into training (fit), validation (tune), and test (final estimate) to avoid leakage and optimism bias.

Evaluation & Benchmarking
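
A minimal sketch of a shuffled three-way split, assuming the examples are independent (time series or grouped data usually need a different scheme). The function name, fractions, and seed are illustrative choices.

```python
import random

def split_dataset(examples, val_frac=0.2, test_frac=0.1, seed=0):
    """Shuffle once with a fixed seed, then cut into train/val/test."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = split_dataset(range(100))
```

The three parts are disjoint by construction, and the fixed seed makes the split reproducible across runs.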

Representation Learning Intermediate

Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.

Machine Learning

Structured Output Intermediate

Forcing predictable formats for downstream systems; reduces parsing errors and supports validation/guardrails.

Foundations & Theory

Off-Policy Learning Intermediate

Learning about a target policy from data generated by a different (behavior) policy.

AI Economics & Strategy
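
One standard tool for this setting is importance sampling, which reweights logged rewards by how much more (or less) likely the target policy is to take each logged action. The sketch below assumes a single-step (bandit-style) problem and logged behavior probabilities; the function name and log format are hypothetical.

```python
def importance_sampling_value(logs, target_prob):
    """Estimate the target policy's average reward from behavior-policy logs.

    logs: list of (action, reward, behavior_prob) tuples.
    target_prob: function mapping an action to its probability under the target policy.
    """
    weighted = [reward * target_prob(action) / behavior_prob
                for action, reward, behavior_prob in logs]
    return sum(weighted) / len(weighted)

# Toy logs: the behavior policy picked each of two actions with probability 0.5.
logs = [(0, 0.0, 0.5), (1, 1.0, 0.5), (1, 1.0, 0.5), (0, 0.0, 0.5)]
always_one = lambda action: 1.0 if action == 1 else 0.0
estimate = importance_sampling_value(logs, always_one)
```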

Autoencoder Advanced

A model that compresses its input into a latent space and reconstructs the input from that representation.

Diffusion & Generative Models

Simpson’s Paradox Advanced

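A trend that holds within each subgroup reverses or disappears when the subgroups are pooled, usually because a confounding variable is unevenly distributed across them.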

Causal AI & Interpretability
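
The reversal is easy to reproduce with a small table of counts. In the sketch below, treatment A has the higher success rate within each subgroup, yet treatment B looks better once the subgroups are pooled, because A was applied mostly to the harder cases. The counts and names are illustrative values chosen for the example.

```python
# (successes, total) per treatment and subgroup; illustrative counts.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

def pooled_rate(groups):
    """Success rate after summing counts across all subgroups."""
    successes = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return successes / total

a_wins_small = rate(*counts["A"]["small"]) > rate(*counts["B"]["small"])
a_wins_large = rate(*counts["A"]["large"]) > rate(*counts["B"]["large"])
b_wins_pooled = pooled_rate(counts["B"]) > pooled_rate(counts["A"])
```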

Edge Inference Intermediate

Running models locally on the end device rather than on a remote server.

AI Economics & Strategy

Likelihood Function Advanced

The probability (or density) of the observed data given the parameters, viewed as a function of the parameters.

Probability & Statistics
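
As a small worked case, take independent coin flips with unknown heads probability p. The sketch below evaluates the Bernoulli log-likelihood on a grid of candidate values and picks the maximizer. The data (7 heads in 10 flips), the grid resolution, and the function name are assumptions made for illustration.

```python
import math

def bernoulli_log_likelihood(p, successes, trials):
    """Log-probability of the observed flips as a function of the parameter p."""
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)

# Candidate parameter values 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: bernoulli_log_likelihood(p, 7, 10))
```

The data stay fixed while p varies, which is what distinguishes a likelihood from a probability distribution over outcomes.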

Legal Hold Intermediate

Requirement to preserve data relevant to pending or reasonably anticipated litigation or investigation.

AI in Law

Dataset Intermediate

A structured collection of examples used to train/evaluate models; quality, bias, and coverage often dominate outcomes.

Machine Learning

Feature Intermediate

A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.

Foundations & Theory

Model Intermediate

A parameterized mapping from inputs to outputs; includes architecture + learned parameters.

Foundations & Theory

Parameters Intermediate

The learned numeric values of a model adjusted during training to minimize a loss function.

Foundations & Theory

Underfitting Intermediate

When a model cannot capture underlying structure, performing poorly on both training and test data.

Foundations & Theory

Bias Intermediate

Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.

Foundations & Theory

Inter-Annotator Agreement Intermediate

Measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.

Foundations & Theory
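
For two labelers, one widely used agreement measure is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a minimal version with a hypothetical function name; it assumes both labelers annotated the same items in the same order.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labelers on the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: both labelers independently pick the same class.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelers used a single identical class throughout
    return (observed - expected) / (1.0 - expected)
```

A value near 1 indicates strong agreement, while a value near 0 means the labelers agree about as often as chance alone would predict.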

Active Learning Intermediate

Selecting the most informative samples to label (e.g., uncertainty sampling) to reduce labeling cost.

Foundations & Theory
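
Uncertainty sampling for a binary classifier can be sketched as picking the unlabeled items whose predicted probability is closest to 0.5. The function name and the example probabilities are hypothetical; a real pipeline would obtain the probabilities from the current model.

```python
def uncertainty_sample(probabilities, k):
    """Return indices of the k items the model is least sure about."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]

# Assumed model outputs for five unlabeled items.
to_label = uncertainty_sample([0.95, 0.52, 0.10, 0.45, 0.70], k=2)
```

The selected items are sent to annotators, the model is retrained, and the loop repeats until the labeling budget is spent.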

Datasheet for Datasets Intermediate

Structured dataset documentation covering collection, composition, recommended uses, biases, and maintenance.

Foundations & Theory

Experiment Tracking Intermediate

Logging hyperparameters, code versions, data snapshots, and results to reproduce and compare experiments.

Evaluation & Benchmarking

Audit Intermediate

Systematic review of model/data processes to ensure performance, fairness, security, and policy compliance.

Governance & Ethics

Privacy Attack Intermediate

Attacks that infer whether specific records were in training data, or reconstruct sensitive training examples.

Foundations & Theory
