Results for "data distribution"
Weight initialization: methods to set starting weights to preserve signal/gradient scales across layers (e.g., Xavier/Glorot, He).
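As a minimal sketch of the scale-preserving idea, the two common variance rules can be written directly; the function names here are illustrative, not from any particular library:

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng=None):
    # Xavier/Glorot: variance 2/(fan_in + fan_out) keeps forward and
    # backward signal scales roughly constant across layers
    rng = rng or np.random.default_rng(0)
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_init(fan_in, fan_out, rng=None):
    # He: variance 2/fan_in, suited to ReLU layers where half the
    # pre-activations are zeroed
    rng = rng or np.random.default_rng(0)
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))
```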
SHAP (SHapley Additive exPlanations): feature attribution method grounded in cooperative game theory for explaining predictions in tabular settings.
Language model: a model that assigns probabilities to sequences of tokens; often trained by next-token prediction.
Class imbalance: when some classes are rare, requiring reweighting, resampling, or specialized metrics.
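One common reweighting heuristic weights each class inversely to its frequency; this sketch mirrors the "balanced" scheme used by several libraries, with a hypothetical helper name:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    # weight_c = N / (num_classes * count_c): rare classes get large
    # weights so the loss treats each class roughly equally
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}
```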
Top-k sampling: samples from the k highest-probability tokens to limit unlikely outputs.
Top-p (nucleus) sampling: samples from the smallest set of tokens whose probabilities sum to p, adapting set size by context.
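Both truncation schemes can be sketched as filters over a token distribution before sampling; the function names are illustrative:

```python
import numpy as np

def top_k_filter(probs, k):
    # keep only the k most probable tokens, then renormalize
    idx = np.argsort(probs)[::-1][:k]
    out = np.zeros_like(probs)
    out[idx] = probs[idx]
    return out / out.sum()

def top_p_filter(probs, p):
    # keep the smallest prefix of the sorted distribution whose
    # cumulative mass reaches p ("nucleus"), then renormalize
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1
    out = np.zeros_like(probs)
    keep = order[:cutoff]
    out[keep] = probs[keep]
    return out / out.sum()
```

Note the contrast: top-k always keeps exactly k tokens, while top-p keeps two tokens for a peaked distribution and many for a flat one.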
Logits: raw model outputs before conversion to probabilities; manipulated during decoding and calibration.
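The standard logits-to-probabilities conversion is a softmax, often with a temperature knob used during decoding; a minimal numerically stable sketch:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # divide by temperature first: T < 1 sharpens, T > 1 flattens
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()
```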
Information gain: reduction in uncertainty (entropy) achieved by observing a variable; used in decision trees and active learning.
Policy: a strategy mapping states to actions.
Expectation: the average value of a random variable under a distribution.
Monte Carlo estimation: approximating expectations via random sampling.
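The two entries above combine into one line of code: draw samples, apply the function, average. A minimal sketch with hypothetical names:

```python
import random

def mc_expectation(f, sampler, n=100_000, seed=0):
    # E[f(X)] ~ (1/n) * sum of f(x_i) for x_i drawn from sampler;
    # the error shrinks like 1/sqrt(n)
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n
```

For example, estimating E[X^2] for X uniform on [0, 1] should land near the exact value 1/3.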
Role (persona) prompting: assigning a role or identity to the model.
Self-consistency: sampling multiple outputs and selecting the consensus answer.
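The aggregation step reduces to a majority vote over independently sampled answers; a toy sketch (the helper name is hypothetical, and the sampling itself is outside its scope):

```python
from collections import Counter

def self_consistency_vote(sampled_answers):
    # return the most frequent final answer among the samples
    return Counter(sampled_answers).most_common(1)[0][0]
```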
Value at Risk (VaR): maximum expected loss under normal conditions, at a given confidence level and horizon.
Pareto efficiency: a state in which no agent can improve without hurting another.
Market design: designing efficient marketplaces.
Data governance: processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Data lineage: tracking where data came from and how it was transformed; key for debugging and compliance.
Federated learning: training across many devices/silos without centralizing raw data; aggregates model updates, not data.
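The aggregation step is typically a size-weighted average of client parameters, as in federated averaging (FedAvg); a minimal sketch with illustrative names:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    # average client parameter vectors, weighting each client by its
    # local dataset size; only these updates leave the client
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
```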
Data scaling: increasing performance via more training data.
Encryption in transit and at rest: protecting data during network transfer and while stored; essential for ML pipelines handling sensitive data.
Data labeling: human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.
Data leakage: when information from evaluation data improperly influences training, inflating reported performance.
Data poisoning: maliciously inserting or altering training data to implant backdoors or degrade performance.
Data Protection Impact Assessment (DPIA): privacy risk analysis required under GDPR-like laws.
Unsupervised learning: learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
Overfitting: when a model fits noise/idiosyncrasies of the training data and performs poorly on unseen data.
Personally identifiable information (PII): information that can identify an individual (directly or indirectly); requires careful handling and compliance.
Self-supervised learning: learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.
Semi-supervised learning: training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.