Results for "sensitivity to data"
A structured analysis of privacy risks to individuals, required under GDPR-like laws before high-risk processing of personal data.
Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
When a model fits the noise and idiosyncrasies of its training data and therefore performs poorly on unseen data.
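A minimal sketch of this failure mode, using a toy "memorizer" model that fits the training set perfectly, noise included, versus a simple trend rule; the data-generating process and model choices are illustrative assumptions, not a standard benchmark:

```python
import random

random.seed(0)

# Toy data: y = 2x plus noise. Same x values, fresh noise for the test set.
def sample(n):
    return [(x, 2 * x + random.gauss(0, 1)) for x in range(n)]

train, test = sample(200), sample(200)

# A "memorizer" that fits the training set perfectly, noise included.
lookup = dict(train)
def memorizer(x):
    return lookup[x]

# A simple rule that captures only the broad trend.
def trend(x):
    return 2 * x

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

print(mse(memorizer, train))   # exactly 0: the noise is memorized
print(mse(memorizer, test))    # higher than the simple trend model's error
print(mse(trend, test))
```

The memorizer's zero training error is misleading: on fresh noise it does worse than the much simpler trend model.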
Information that can identify an individual (directly or indirectly); requires careful handling and compliance.
A privacy attack in which an adversary infers sensitive attributes of individuals represented in the training data from a model's outputs or parameters.
Learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.
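A minimal sketch of how such pseudo-labels arise: for next-token prediction, the "labels" are just the data itself shifted by one position (the example text and context size are illustrative):

```python
# Build (context, next-token) pairs from unlabeled text: no manual
# annotation is needed, because the targets come from the data itself.
text = "the cat sat on the mat".split()
context_size = 2
pairs = [
    (text[i:i + context_size], text[i + context_size])
    for i in range(len(text) - context_size)
]
for ctx, nxt in pairs:
    print(ctx, "->", nxt)
```

Masked modeling works analogously, hiding tokens in the middle instead of predicting the next one.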
Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.
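A self-training sketch under the cluster assumption: labels propagate from a tiny labeled set to nearby unlabeled points. The 1-D data and nearest-neighbor pseudo-labeling rule are illustrative assumptions:

```python
# Self-training: pseudo-label unlabeled points by their nearest labeled
# neighbor, then add them to the training set.
labeled = [(0.0, "a"), (10.0, "b")]      # small labeled dataset
unlabeled = [0.5, 1.1, 9.2, 9.8]         # larger unlabeled pool

def nearest_label(x, data):
    return min(data, key=lambda p: abs(p[0] - x))[1]

for x in unlabeled:
    labeled.append((x, nearest_label(x, labeled)))

print(labeled)
```

Points near 0 inherit label "a" and points near 10 inherit "b", exactly the smoothness assumption at work.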
A mismatch between training and deployment data distributions that can degrade model performance.
Learning where data arrives sequentially and the model updates continuously, often under changing distributions.
The internal space where learned representations live; operations in this space (e.g., interpolation or vector arithmetic) often correlate with semantics or generative factors.
Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.
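A small sketch of the three feature types named above, computed over hypothetical sales records (the column names and window size are illustrative):

```python
# Hand-crafted features from raw columns: a ratio, a lag, and a
# rolling aggregation, as one might prepare for a tabular model.
sales = [100, 120, 90, 150]
costs = [80, 100, 60, 100]

margin_ratio = [s / c for s, c in zip(sales, costs)]   # ratio feature
lag_1 = [None] + sales[:-1]                            # lag feature
rolling_mean = [                                       # window of 2
    sum(sales[max(0, i - 1):i + 1]) / len(sales[max(0, i - 1):i + 1])
    for i in range(len(sales))
]

print(margin_ratio)
print(lag_1)
print(rolling_mean)
```

Each derived column exposes structure (profitability, recent history, local trend) that a simple model could not easily recover from the raw values.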
Empirical laws linking model size, data, compute to performance.
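These laws are typically power laws, e.g. loss L(N) = a * N**(-alpha) in model or data size N, which appear as straight lines in log-log space. A sketch with illustrative (not empirically fitted) constants:

```python
import math

# Hypothetical power-law loss curve: L(N) = a * N**(-alpha).
# a and alpha are illustrative values, not measured constants.
a, alpha = 10.0, 0.5

def loss(n):
    return a * n ** (-alpha)

# The exponent is the slope of the curve in log-log coordinates.
n1, n2 = 1e6, 1e8
slope = (math.log(loss(n2)) - math.log(loss(n1))) / (math.log(n2) - math.log(n1))
print(slope)  # recovers -alpha
```

Fitting such a slope to measured losses is how scaling exponents are estimated in practice.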
A privacy attack that reconstructs training examples from gradients shared during distributed training (e.g., in federated learning).
Diffusion model trained to remove noise step by step.
Generative model that learns to reverse a gradual noise process.
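The forward (noising) half of that process can be sketched in a few lines; the noise schedule and step count below are illustrative choices, and a real model would be trained to run the steps in reverse:

```python
import math
import random

random.seed(0)

# Forward diffusion: each step mixes the signal with a little Gaussian
# noise, so after many steps the signal is destroyed.
def forward(x0, steps=200, beta=0.05):
    x = x0
    for _ in range(steps):
        x = math.sqrt(1 - beta) * x + math.sqrt(beta) * random.gauss(0, 1)
    return x

samples = [forward(5.0) for _ in range(1000)]
mean = sum(samples) / len(samples)
print(mean)  # the original value 5.0 is gone; samples are near-standard noise
```

The generative model's job is the reverse map: starting from pure noise, undo these steps to produce a sample from the data distribution.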
Diffusion performed in latent space for efficiency.
Sequential data indexed by time.
Artificial sensor data generated in simulation.
Combining simulation and real-world data, e.g., pretraining on simulated samples and fine-tuning on real ones to bridge the gap between the two.
A formal privacy framework ensuring outputs do not reveal much about any single individual’s data contribution.
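A common realization is the Laplace mechanism: add noise scaled to sensitivity/epsilon so a query's answer reveals little about any one record. The epsilon below is an illustrative privacy budget, not a recommendation:

```python
import random

random.seed(0)

# Laplace mechanism for a count query. Sensitivity is 1 because adding
# or removing one individual changes the count by at most 1.
def private_count(records, epsilon=1.0, sensitivity=1.0):
    scale = sensitivity / epsilon
    # The difference of two exponential draws is Laplace(0, scale).
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return len(records) + noise

result = private_count(["row"] * 100)
print(result)  # close to 100, but perturbed
```

Smaller epsilon means larger noise and stronger privacy, at the cost of accuracy.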
A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.
Minimizing average loss on training data; can overfit when data is limited or biased.
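A bare-bones sketch of this principle: gradient descent on the average squared loss of a one-parameter model over a finite sample (the data, learning rate, and iteration count are illustrative):

```python
# Empirical risk minimization: minimize mean((w*x - y)**2) over the sample.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # roughly y = 2x

w = 0.0
lr = 0.05
for _ in range(200):
    # d/dw of mean((w*x - y)**2) is mean(2*(w*x - y)*x).
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad

print(w)  # converges to the sample-optimal weight, close to 2
```

Note that w is optimal only for this sample; with few or biased points, the minimizer of the empirical average can sit far from the true relationship.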
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Separating data into training (fit), validation (tune), and test (final estimate) to avoid leakage and optimism bias.
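A minimal sketch of such a split; the 60/20/20 ratio is an illustrative choice:

```python
import random

random.seed(0)

# Shuffle once, then carve out train/validation/test. The test slice is
# touched only for the final performance estimate.
data = list(range(100))
random.shuffle(data)

n = len(data)
train = data[:int(0.6 * n)]
val = data[int(0.6 * n):int(0.8 * n)]
test = data[int(0.8 * n):]

print(len(train), len(val), len(test))
```

Because the slices are disjoint, no example can leak from training into the evaluation sets.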
Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.
Constraining model outputs to predictable formats (e.g., a fixed schema) for downstream systems; this reduces parsing errors and supports validation/guardrails.
Learning from data generated by a policy other than the one currently being optimized, as in off-policy reinforcement learning.
Models that learn to generate samples resembling training data.
Model that compresses input into latent space and reconstructs it.
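A stripped-down sketch of the encode/compress/decode shape: 2-D points lying on a line are squeezed into a 1-D latent code and reconstructed. The projection direction is fixed here as an assumption; a real autoencoder would learn it from data:

```python
# Toy "autoencoder" for points on the line y = 2x: encode to a 1-D
# latent coordinate along the line, decode back to 2-D.
direction = (1 / 5 ** 0.5, 2 / 5 ** 0.5)   # unit vector along y = 2x

def encode(p):                              # 2-D input -> 1-D latent
    return p[0] * direction[0] + p[1] * direction[1]

def decode(z):                              # 1-D latent -> 2-D output
    return (z * direction[0], z * direction[1])

point = (3.0, 6.0)
z = encode(point)
recon = decode(z)
print(z, recon)  # one number suffices to reconstruct the point
```

Reconstruction is exact here because the data truly lies on a 1-D manifold; off-manifold points would incur reconstruction error, which is what training minimizes.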
A shift in a model's output distribution over time, often driven by changes in the input data; typically monitored in production to trigger retraining.