Domain: Machine Learning
Class imbalance: A situation in which some classes are far rarer than others, often requiring reweighting, resampling, or specialized metrics (e.g., precision/recall rather than accuracy).
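One common reweighting scheme is inverse-frequency class weights, where each class is weighted by n_samples / (n_classes * count). This minimal sketch (function and variable names are illustrative) mirrors the "balanced" heuristic used by libraries such as scikit-learn:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class inversely to its frequency so rare classes
    contribute as much to the loss as common ones."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    # balanced weight for class c: n / (k * count_c)
    return {c: n / (k * counts[c]) for c in counts}

weights = inverse_frequency_weights(["a"] * 90 + ["b"] * 10)
# the rare class "b" receives 9x the weight of the common class "a"
```

These weights would then multiply each example's loss term during training.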
Dataset: A structured collection of examples used to train and evaluate models; quality, bias, and coverage often dominate outcomes.
Embedding: A continuous vector encoding of an item (word, image, user) such that semantic similarity corresponds to geometric closeness.
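Geometric closeness is typically measured with cosine similarity between vectors. A toy sketch with made-up 3-dimensional vectors (real embeddings have hundreds of dimensions and are learned, not hand-set):

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product normalized by vector lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# illustrative vectors: semantically related items point in similar directions
king   = [0.90, 0.80, 0.10]
queen  = [0.85, 0.82, 0.12]
banana = [0.10, 0.00, 0.95]

sim_related   = cosine(king, queen)
sim_unrelated = cosine(king, banana)
```

Here `sim_related` exceeds `sim_unrelated`, illustrating how similarity in meaning maps to angle between vectors.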
Machine learning: A subfield of AI in which models learn patterns from data to make predictions or decisions, improving with experience rather than through explicitly coded rules.
Meta-learning: Methods that learn training procedures or initializations so models can adapt quickly to new tasks with little data.
Multi-task learning: Training one model on multiple tasks simultaneously to improve generalization through shared structure.
Online learning: Learning in which data arrives sequentially and the model updates continuously, often under changing distributions.
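The pattern of updating after every arriving example can be shown with the simplest possible "model," a running mean, updated incrementally without revisiting past data (a sketch; the names are illustrative):

```python
def online_mean(stream):
    """Maintain a running mean, refreshed after each arriving example.
    No past data is stored or revisited."""
    mean, n = 0.0, 0
    for x in stream:
        n += 1
        mean += (x - mean) / n  # incremental update toward the new point
    return mean

m = online_mean([1, 2, 3, 4])  # processes one value at a time
```

Real online learners apply the same shape of update (new estimate = old estimate + step toward the new observation) to model parameters, e.g. via stochastic gradient steps.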
Representation learning: Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.
Self-supervised learning: Learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.
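Next-token prediction is the clearest example of pseudo-label construction: the raw text itself supplies both inputs and targets. A minimal sketch (function name is illustrative):

```python
def next_token_pairs(tokens):
    """Turn raw text into (context, target) training pairs:
    each prefix predicts the token that follows it.
    No human annotation is involved."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

pairs = next_token_pairs(["the", "cat", "sat"])
# → [(["the"], "cat"), (["the", "cat"], "sat")]
```

A language model is then trained in ordinary supervised fashion on these automatically generated pairs.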
Semi-supervised learning: Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions such as smoothness or cluster structure.
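One standard way to exploit the unlabeled pool is self-training (pseudo-labeling): a model fit on the labeled data labels the unlabeled points, and its most confident predictions are promoted to training examples. A sketch under the assumption that the predictor returns a (label, confidence) pair; all names are illustrative:

```python
def pseudo_label(predict, labeled, unlabeled, threshold=0.9):
    """One round of self-training: add confidently predicted
    unlabeled points to the training set."""
    augmented = list(labeled)
    for x in unlabeled:
        label, conf = predict(x)
        if conf >= threshold:  # keep only high-confidence pseudo-labels
            augmented.append((x, label))
    return augmented

# toy predictor: confident "pos" for positive inputs (illustrative)
toy_predict = lambda x: ("pos", 0.95) if x > 0 else ("neg", 0.5)
augmented = pseudo_label(toy_predict, [(1, "pos")], [2, -3])
```

Here only the confident prediction on `2` is adopted; the uncertain one on `-3` is left out, reflecting the smoothness assumption that confident regions are trustworthy.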
Supervised learning: Learning a function from labeled input-output pairs, optimizing performance on predicting outputs for unseen inputs.
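The simplest instance is least-squares fitting of a line to (x, y) pairs: the labels y directly supervise the choice of parameters. A self-contained sketch (names are illustrative):

```python
def fit_line(xs, ys):
    """Least-squares fit of y ≈ w*x + b from labeled (x, y) pairs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # closed-form slope: covariance(x, y) / variance(x)
    w = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - w * mx
    return w, b

w, b = fit_line([1, 2, 3], [2, 4, 6])  # data generated by y = 2x
```

The fitted `w`, `b` can then predict outputs for inputs never seen during training, which is the actual objective.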
Transfer learning: Reusing knowledge from a source task or domain to improve learning on a target task or domain, typically via pretrained models.
Unsupervised learning: Learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
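Discovering groups without labels is exemplified by k-means clustering. A tiny 1-D sketch (names and starting centers are illustrative): assign each point to its nearest center, then move each center to the mean of its assigned points, and repeat.

```python
def kmeans_1d(points, centers, iters=10):
    """Minimal k-means on scalars; centers must start distinct."""
    for _ in range(iters):
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(p - c))
            clusters[nearest].append(p)
        # move each center to the mean of its cluster (keep it if empty)
        centers = [sum(ps) / len(ps) if ps else c
                   for c, ps in clusters.items()]
    return sorted(centers)

centers = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], [0.0, 5.0])
```

The two centers converge to roughly 1.0 and 9.0, recovering the two groups hidden in the unlabeled data.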