Results for “learning to learn”
Machine learning: a subfield of AI in which models learn patterns from data to make predictions or decisions, improving with experience rather than through explicitly coded rules.
Deep learning: a branch of ML that uses multi-layer neural networks to learn hierarchical representations, often excelling at vision, speech, and language tasks.
Meta-learning (learning to learn): methods that learn training procedures or initializations so models can adapt quickly to new tasks from little data.
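To make “learning an initialization” concrete, here is a minimal first-order MAML-style sketch in NumPy: a shared init is meta-trained so that a single gradient step adapts it to a fresh task. The task family, step sizes, and loop counts are illustrative assumptions, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Hypothetical task family: y = a * x with a randomly drawn slope a."""
    a = rng.uniform(-2.0, 2.0)
    x = rng.normal(size=10)
    return x, a * x

def grad(w, x, y):
    """Gradient of mean squared error for the scalar model y_hat = w * x."""
    return 2.0 * np.mean((w * x - y) * x)

w = 0.0                                    # the meta-learned initialization
inner_lr, outer_lr = 0.05, 0.01
for _ in range(2000):
    x, y = sample_task()
    w_task = w - inner_lr * grad(w, x, y)  # inner loop: adapt to this task
    # First-order outer update: move the init using the post-adaptation gradient.
    w -= outer_lr * grad(w_task, x, y)
```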
PAC learning: a concept class is PAC-learnable if a learner can, with high probability, output an approximately correct hypothesis from a finite number of samples.
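For a finite hypothesis class H and a learner that outputs a hypothesis consistent with the data, the standard sample-complexity bound makes “approximately correct with high probability” quantitative:

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)
```

That many samples suffice for the output hypothesis to have error at most \epsilon with probability at least 1 - \delta.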
Generative models: models that learn to generate samples resembling the training data.
Supervised learning: learning a function from labeled input-output pairs, optimizing performance at predicting outputs for unseen inputs.
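A minimal sketch of the paradigm, assuming a toy linear dataset: fit a model on labeled pairs, then predict an unseen input.

```python
import numpy as np

# Toy labeled data (illustrative): inputs X and targets y, roughly y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Fit a linear model by least squares, then predict for an unseen input.
A = np.hstack([X, np.ones((len(X), 1))])     # append a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(w @ [4.0, 1.0])                        # prediction for x = 4 (about 9.0)
```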
Unsupervised learning: learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling the data distribution.
Self-supervised learning: learning from “pseudo-labels” constructed from the data itself (e.g., next-token prediction, masked modeling), without manual annotation.
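A toy illustration of building next-token pseudo-labels directly from raw text; the character-level tokenization is an assumption for brevity, where a real system would use a learned tokenizer.

```python
# Construct (input, target) pairs from unlabeled text: each position's
# target is simply the next token, so no manual annotation is needed.
text = "learning to learn"
tokens = list(text)                # character-level "tokenizer" (toy choice)
inputs, targets = tokens[:-1], tokens[1:]
for x, y in zip(inputs[:5], targets[:5]):
    print(f"context ends with {x!r} -> predict {y!r}")
```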
Reinforcement learning: a paradigm in which an agent interacts with an environment and learns to choose actions that maximize cumulative reward.
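A minimal tabular Q-learning sketch on an invented 5-state chain (move right to reach a reward); all hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# States 0..4 on a chain; actions 0 (left) and 1 (right); reward 1 at state 4.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1          # step size, discount, exploration

for _ in range(500):
    s = 0
    while s != n_states - 1:
        # Epsilon-greedy action choice balances exploration and exploitation.
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Update toward the reward plus the discounted best future value.
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1))   # greedy policy: action 1 ("right") at non-terminal states
```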
Online learning: learning from data that arrives sequentially, with the model updating continuously, often under changing distributions.
Transfer learning: reusing knowledge from a source task or domain to improve learning on a target task or domain, typically via pretrained models.
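One common pattern, sketched with PyTorch/torchvision as an assumed stack: load pretrained weights, freeze the backbone, and train only a new task head. NUM_CLASSES is a placeholder for the target task.

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # placeholder: label count of the target task

# Start from ImageNet-pretrained weights and freeze the feature extractor.
model = models.resnet18(weights="IMAGENET1K_V1")
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head; only these parameters will be trained.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
```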
Representation learning: automatically learning useful internal features (latent variables) that capture structure salient to downstream tasks.
Learning rate scheduling: adjusting the learning rate over the course of training to improve convergence.
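One common concrete schedule is cosine decay; the defaults below are illustrative.

```python
import math

def cosine_lr(step, total_steps, base_lr=1e-3, min_lr=1e-5):
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```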
Off-policy learning: learning from data generated by a policy other than the one being optimized.
On-policy learning: learning only from data generated by the current policy.
Imitation learning: learning policies from expert demonstrations.
Continual learning: learning a sequence of tasks without catastrophic forgetting.
Feature engineering: designing input features that expose useful structure (e.g., ratios, lags, aggregations); often crucial outside deep learning.
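A small pandas sketch of the lag, aggregation, and ratio features mentioned above, on invented daily-sales data.

```python
import pandas as pd

df = pd.DataFrame({"sales": [10, 12, 9, 15, 14, 18, 20]})   # toy series

df["sales_lag1"] = df["sales"].shift(1)              # lag: yesterday's value
df["sales_mean3"] = df["sales"].rolling(3).mean()    # aggregation: 3-day mean
df["sales_ratio"] = df["sales"] / df["sales_mean3"]  # ratio: today vs. trend
```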
Vanishing gradients: gradients shrink as they propagate backward through layers, slowing learning in early layers; mitigated by ReLU activations, residual connections, and normalization.
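A quick numerical illustration (weight matrices set aside for simplicity): the sigmoid's derivative is at most 0.25, so a gradient passed backward through 20 sigmoid layers shrinks by roughly a factor of 0.25 per layer.

```python
import numpy as np

def sigmoid_grad(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)           # peaks at 0.25 when z = 0

g = 1.0
for _ in range(20):
    g *= sigmoid_grad(0.0)         # best case for sigmoid; usually worse
print(g)                           # ~9e-13: early layers barely learn
```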
Reinforcement learning from human feedback (RLHF): uses human preference data to train a reward model, then optimizes the policy against it.
Bias: systematic error introduced by the simplifying assumptions of a learning algorithm.
Information gain: the reduction in uncertainty (entropy) achieved by observing a variable; used in decision trees and active learning.
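A short sketch computing information gain as parent entropy minus the size-weighted entropy of the child groups; the labels and split are invented.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a sequence of labels."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(labels, groups):
    """Entropy reduction from splitting `labels` into `groups`."""
    n = len(labels)
    children = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - children

labels = ["yes", "yes", "no", "no"]
# A perfectly separating split recovers the full 1 bit of uncertainty.
print(information_gain(labels, [["yes", "yes"], ["no", "no"]]))  # 1.0
```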
Learning rate warmup: gradually increasing the learning rate at the start of training to avoid divergence.
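A minimal linear-warmup sketch; the step count and base rate are illustrative.

```python
def warmup_lr(step, warmup_steps=1000, base_lr=1e-3):
    """Ramp linearly from near 0 to base_lr, then hold it steady."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr
```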
Inductive bias: the built-in assumptions of a model or algorithm that guide learning efficiency and generalization.
Actor-critic methods: combine value estimation (the critic) with policy learning (the actor).
Exploration vs. exploitation: balancing the search for new, potentially better behaviors against exploiting actions already known to be rewarding.
Adaptive optimizers: methods such as Adam that adjust per-parameter learning rates dynamically.
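For concreteness, one Adam update in NumPy, following the published rule (running moment estimates with bias correction); the defaults shown are the commonly cited ones.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Apply one Adam update to parameters w given gradient g (t starts at 1)."""
    m = b1 * m + (1 - b1) * g            # running mean of gradients
    v = b2 * v + (1 - b2) * g * g        # running mean of squared gradients
    m_hat = m / (1 - b1 ** t)            # bias-correct the early estimates
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step sizes
    return w, m, v
```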
Catastrophic forgetting: the loss of previously learned knowledge when a model learns new tasks.
System identification: learning the physical parameters of a system from data.
Reward shaping: modifying the reward signal to accelerate learning.
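The classic safe variant is potential-based shaping, which adds gamma * phi(s') - phi(s) to the reward and provably leaves optimal policies unchanged (Ng et al., 1999); the potential function below is a hypothetical example.

```python
def shaped_reward(r, s, s_next, potential, gamma=0.99):
    """Potential-based shaping: r' = r + gamma * phi(s') - phi(s)."""
    return r + gamma * potential(s_next) - potential(s)

# Hypothetical potential: negative distance to a goal located at state 10.
print(shaped_reward(0.0, 3, 4, lambda s: -abs(10 - s)))   # small progress bonus
```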