Results for "transition function"
Visualization of optimization landscape.
Choosing step size along gradient direction.
Stability proven via monotonic decrease of Lyapunov function.
Modifying reward to accelerate learning.
A learning paradigm where an agent interacts with an environment and learns to choose actions to maximize cumulative reward.
Activation max(0, x); improves gradient flow and training speed in deep nets.
Raw model outputs before converting to probabilities; manipulated during decoding and calibration.
Optimization problems where any local minimum is global.
Estimating parameters by maximizing likelihood of observed data.
A narrow minimum often associated with poorer generalization.
Optimizing policies directly via gradient ascent on expected reward.
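Models that assign a scalar energy to each configuration, with lower energy meaning more plausible, rather than explicit normalized probabilities.
Probability-weighted average value of a quantity under a distribution.
A point whose objective value is no greater than at all nearby points.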
Optimization under equality/inequality constraints.
Converts constrained problem to unconstrained form.
Alternative formulation of an optimization problem whose optimal value bounds that of the original.
Optimization under uncertainty.
Finding control policies minimizing cumulative cost.
Optimizes future actions using a model of dynamics.
Optimal control for linear systems with quadratic cost.
Learning policies from expert demonstrations.
Fast approximation of costly simulations.
Reusing knowledge from a source task/domain to improve learning on a target task/domain, typically via pretrained models.
Learning where data arrives sequentially and the model updates continuously, often under changing distributions.
Training one model on multiple tasks simultaneously to improve generalization through shared structure.
Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.
A continuous vector encoding of an item (word, image, user) such that semantic similarity corresponds to geometric closeness.
Automatically learning useful internal features (latent variables) that capture salient structure for downstream tasks.
Configuration choices not learned directly (or not typically learned) that govern training or architecture.
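
Two of the entries above are compact enough to sketch in code: the max(0, x) activation, and the conversion of raw model outputs to probabilities. The sketch below is illustrative only. The function names relu and softmax are chosen here, and softmax is an assumed choice of conversion, since the entry itself does not name one.

```python
import math


def relu(x):
    """Activation max(0, x): passes positive inputs through, zeroes the rest."""
    return max(0.0, x)


def softmax(logits):
    """Convert raw model outputs to probabilities (assumed conversion: softmax)."""
    m = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(relu(-2.0), relu(3.5))
    print(softmax([2.0, 1.0, 0.1]))
```

Subtracting the maximum before exponentiating leaves the resulting probabilities unchanged while keeping the computation numerically stable for large outputs.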