Results for "model-based"
Model-Based RL
Advanced
RL using learned or known environment models.
Model-based reinforcement learning is like having a map while exploring a new city. Instead of wandering around aimlessly, you can look at the map to plan your route and make better decisions about where to go next. In this type of learning, an AI agent first learns how the environment works—like...
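The learn-then-plan loop described above can be sketched in a few lines. This is a minimal illustration, not a production algorithm: a toy deterministic chain environment (all names and constants here are made up), where the agent first fits a transition/reward model from random interaction, then plans with value iteration inside that learned model.

```python
import random

# Toy 1-D chain: states 0..4, actions -1/+1, reward 1.0 for reaching state 4.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)

def step(s, a):
    """The real environment (unknown to the planner)."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, 1.0 if s2 == GOAL else 0.0

# 1) Learn a model of the environment from random interaction ("the map").
transitions = {}   # (s, a) -> observed next state
rewards = {}       # (s, a) -> observed reward
rng = random.Random(0)
for _ in range(500):
    s = rng.randrange(N_STATES)
    a = rng.choice(ACTIONS)
    s2, r = step(s, a)
    transitions[(s, a)] = s2
    rewards[(s, a)] = r

# 2) Plan inside the learned model with value iteration (no further env calls).
gamma = 0.9
V = [0.0] * N_STATES
for _ in range(100):
    V = [max(rewards[(s, a)] + gamma * V[transitions[(s, a)]]
             for a in ACTIONS)
         for s in range(N_STATES)]

# 3) Read off the greedy policy under the learned model.
policy = [max(ACTIONS,
              key=lambda a: rewards[(s, a)] + gamma * V[transitions[(s, a)]])
          for s in range(N_STATES)]
print(policy)  # greedy action per state: every state moves toward the goal
```

Because the toy environment is deterministic, 500 random samples recover the model exactly; in realistic settings the model is approximate (e.g. a learned neural dynamics model) and planning quality degrades with model error.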
A narrow minimum often associated with poorer generalization.
Gradually increasing learning rate at training start to avoid divergence.
Stores past attention states to speed up autoregressive decoding.
Encodes positional information via rotation in embedding space.
Empirical laws linking model size, data, and compute to performance.
Legal or policy requirement to explain AI decisions.
Graphical model expressing factorization of a probability distribution.
Models that learn to generate samples resembling training data.
Aligns transcripts with audio timestamps.
Models time evolution via hidden states.
Shift in feature distribution over time.
Maintaining alignment under new conditions.
Using limited human feedback to guide large models.
Requirement to provide explanations.
Maximum system processing rate.
Startup latency for services.
High-fidelity virtual model of a physical system.
Combining simulation and real-world data.
Predicting disease progression or survival.
Mechanics of price formation.
Quantifying financial risk.
Groups adopting extreme positions.
A conceptual framework describing error as the sum of systematic error (bias) and sensitivity to data (variance).
Measures a model’s ability to fit random noise; used to bound generalization error.
Agents copy others’ actions.
Learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.
Methods that learn training procedures or initializations so models can adapt quickly to new tasks with little data.
A branch of ML using multi-layer neural networks to learn hierarchical representations, often excelling in vision, speech, and language.
Of predicted positives, the fraction that are truly positive; sensitive to false positives.
Of true positives, the fraction correctly identified; sensitive to false negatives.
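The last two definitions (precision and recall) can be made concrete with a small confusion-matrix computation; the labels below are invented purely for illustration:

```python
# Toy binary predictions vs. ground truth (illustrative data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)  # of predicted positives, fraction truly positive
recall = tp / (tp + fn)     # of true positives, fraction correctly identified
print(precision, recall)    # 0.75 0.75 for this toy data
```

Note the trade-off the definitions imply: a stray false positive (here, index 3) lowers precision but not recall, while a missed positive (index 1) does the reverse.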