Results for "parameter estimation"
Bayesian parameter estimation using the mode of the posterior distribution.
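As an illustrative sketch of posterior-mode (MAP) estimation: assuming a Beta prior on a coin's heads probability (the function name `map_coin` and the hyperparameters are hypothetical choices, not from the source), the MAP estimate is the mode of the Beta posterior in closed form:

```python
def map_coin(heads, tails, a=2.0, b=2.0):
    """MAP estimate of a coin's heads probability under a Beta(a, b) prior.

    The posterior is Beta(a + heads, b + tails); its mode (valid when
    a + heads > 1 and b + tails > 1) is the MAP estimate.
    """
    return (a + heads - 1.0) / (a + b + heads + tails - 2.0)

# The prior pulls the estimate toward 0.5 relative to the raw frequency 7/10.
estimate = map_coin(7, 3)
```

With a uniform-ish prior the MAP estimate sits between the raw frequency and the prior mean.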
Probability of data given parameters.
Using same parameters across different parts of a model.
Popular optimizer combining momentum and per-parameter adaptive step sizes via first/second moment estimates.
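A minimal single-parameter sketch of that optimizer's update rule (the helper `adam_step` and the quadratic demo objective are illustrative, not from the source):

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # Update biased first/second moment estimates of the gradient.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    # Bias-correct the moments, then take the per-parameter adaptive step.
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize (x - 3)^2 from x = 0.
x, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    g = 2 * (x - 3.0)                 # gradient of (x - 3)^2
    x, m, v = adam_step(x, g, m, v, t, lr=0.1)
```

Note the step size is roughly `lr` regardless of gradient magnitude, which is the "per-parameter adaptive step size" in action.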
Measures how much information an observable random variable carries about unknown parameters.
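A Monte Carlo sketch of that quantity for a Bernoulli model (the function `fisher_bernoulli` is a hypothetical name): the Fisher information is the expected squared score, and for Bernoulli(p) it equals 1 / (p(1 - p)) analytically.

```python
import random

def fisher_bernoulli(p, n=100_000, seed=0):
    """Monte Carlo estimate of the Fisher information of Bernoulli(p)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = 1 if rng.random() < p else 0
        # Score = d/dp log p(x | p) = x/p - (1 - x)/(1 - p).
        score = x / p - (1 - x) / (1 - p)
        total += score * score
    return total / n
```

The estimate should land near 1 / (0.25 * 0.75) ≈ 5.33 for p = 0.25.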
Estimating parameters by maximizing the likelihood of the observed data.
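For a Gaussian, the maximum-likelihood estimates have a closed form: the sample mean and the biased (divide-by-n) variance. A small sketch (`gaussian_mle` is an illustrative name):

```python
def gaussian_mle(xs):
    """MLE for a univariate Gaussian: sample mean and biased variance."""
    n = len(xs)
    mu = sum(xs) / n
    # Note the divisor is n, not n - 1: the MLE of variance is biased.
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var

mu, var = gaussian_mle([1.0, 2.0, 3.0, 4.0])
```

For models without closed-form solutions, the same principle is applied by numerically maximizing the log-likelihood.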
Inferring a system's hidden state (e.g., an agent's position) from noisy sensor data.
Belief before observing data.
Methods such as Adam that adjust per-parameter learning rates dynamically during training.
Uses an exponential moving average of gradients to speed convergence and reduce oscillation.
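A one-parameter sketch of that momentum update (`momentum_step` and the quadratic demo are illustrative): the velocity accumulates an exponentially weighted sum of past gradients.

```python
def momentum_step(theta, grad, vel, lr=0.01, beta=0.9):
    # Velocity is an exponentially decaying sum of past gradients;
    # consistent gradient directions accumulate, oscillations cancel.
    vel = beta * vel + grad
    return theta - lr * vel, vel

# Minimize x^2 from x = 5.
x, v = 5.0, 0.0
for _ in range(300):
    g = 2 * x                         # gradient of x^2
    x, v = momentum_step(x, g, v, lr=0.01, beta=0.9)
```

With beta = 0.9 the effective step size in a constant-gradient region is roughly lr / (1 - beta), i.e. about 10x the base rate.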
Controls the size of parameter updates; set too high, training diverges; set too low, it trains slowly or gets stuck.
Techniques that fine-tune small additional components rather than all weights to reduce compute and storage.
The shape of the loss function over parameter space.
A narrow minimum often associated with poorer generalization.
A wide basin often correlated with better generalization.
Updated belief after observing data.
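A minimal conjugate-update sketch (the Beta-Bernoulli pairing and the function name `beta_update` are illustrative choices): starting from a Beta(1, 1) prior, each observation updates the belief in closed form.

```python
def beta_update(a, b, observations):
    """Conjugate Beta-Bernoulli update: each 1 increments a, each 0 increments b."""
    for obs in observations:
        if obs:
            a += 1
        else:
            b += 1
    return a, b

# Prior Beta(1, 1) (uniform), then observe three successes and one failure.
a, b = beta_update(1, 1, [1, 1, 0, 1])
posterior_mean = a / (a + b)
```

The posterior mean moves from the prior's 0.5 toward the observed frequency as data accumulate.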
Visualization of optimization landscape.
Systematic error introduced by simplifying assumptions in a learning algorithm.
Monte Carlo method for state estimation.
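A minimal bootstrap-filter sketch, assuming a 1-D random-walk state observed with Gaussian noise (the model, noise scales, and the name `particle_filter` are hypothetical, not from the source):

```python
import math
import random

def particle_filter(observations, n=500, proc_std=0.5, obs_std=1.0, seed=0):
    """Bootstrap particle filter for a 1-D random-walk state."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n)]
    estimates = []
    for y in observations:
        # Propagate each particle through the motion model.
        particles = [p + rng.gauss(0.0, proc_std) for p in particles]
        # Weight by the Gaussian observation likelihood.
        weights = [math.exp(-0.5 * ((y - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Posterior-mean estimate, then multinomial resampling.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        particles = rng.choices(particles, weights=weights, k=n)
    return estimates

est = particle_filter([2.0, 2.1, 1.9, 2.0, 2.05])
```

Resampling concentrates particles in high-likelihood regions, which is what keeps the Monte Carlo approximation from degenerating.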
Approximating expectations via random sampling.
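The classic toy instance of this idea: estimating pi as four times the fraction of uniform samples landing inside the unit quarter-circle (the function name `mc_pi` is illustrative).

```python
import random

def mc_pi(n=200_000, seed=0):
    """Monte Carlo estimate of pi: 4 * P(uniform point in unit square hits quarter-circle)."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * hits / n
```

The error shrinks like 1/sqrt(n), independent of dimension, which is the usual argument for Monte Carlo over grid quadrature.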
Learning physical parameters from data.
Halting training when validation performance stops improving to reduce overfitting.
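A sketch of the usual patience-based stopping rule, applied here to a precomputed list of validation losses (the function name and the loss values are illustrative):

```python
def train_with_early_stopping(val_losses, patience=3):
    """Return the best epoch, stopping once `patience` epochs pass without improvement."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0   # new best: reset patience
        else:
            wait += 1
            if wait >= patience:
                break                                  # patience exhausted
    return best_epoch

# Validation loss improves through epoch 2, then degrades.
stop = train_with_early_stopping([1.0, 0.8, 0.7, 0.72, 0.75, 0.74, 0.9])
```

In practice one also restores the model weights saved at the best epoch.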
A gradient method using random minibatches for efficient training on large datasets.
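A minibatch-SGD sketch fitting a line by squared error (the function `sgd_linear`, the noiseless demo data, and the hyperparameters are illustrative choices):

```python
import random

def sgd_linear(xs, ys, lr=0.02, batch=4, epochs=300, seed=0):
    """Fit y = w*x + b by minibatch SGD on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)                         # fresh random minibatches each epoch
        for i in range(0, len(data), batch):
            mb = data[i:i + batch]
            # Average gradients of (w*x + b - y)^2 over the minibatch.
            gw = sum(2 * (w * x + b - y) * x for x, y in mb) / len(mb)
            gb = sum(2 * (w * x + b - y) for x, y in mb) / len(mb)
            w -= lr * gw
            b -= lr * gb
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 2.0 * x for x in xs]                  # noiseless line y = 2x + 1
w, b = sgd_linear(xs, ys)
```

Each update touches only a minibatch, which is what makes the method cheap per step on large datasets.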
PEFT method injecting trainable low-rank matrices into layers, enabling efficient fine-tuning.
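A pure-Python sketch of the core idea (a frozen weight matrix plus a trainable low-rank update scaled by alpha/r; the class name `LoRALinear` and the helpers are illustrative, not a real library API):

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, W, r=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W                                          # frozen base weight
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]          # zero init: no update at start
        self.scale = alpha / r

    def forward(self, x):
        """x is a column vector as a list of 1-element lists."""
        base = matmul(self.W, x)
        delta = matmul(self.B, matmul(self.A, x))           # low-rank path: r*(d_in + d_out) params
        return [[b[0] + self.scale * d[0]] for b, d in zip(base, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]
layer = LoRALinear(W)
out = layer.forward([[3.0], [4.0]])   # B starts at zero, so output equals W @ x
```

Only A and B would be trained; with rank r much smaller than the layer dimensions, the trainable parameter count drops dramatically.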
A point where gradient is zero but is neither a max nor min; common in deep nets.
Matrix of second derivatives describing local curvature of loss.
Matrix of curvature information used by second-order optimization methods.
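A finite-difference sketch of computing that curvature matrix for a two-variable function (the function name `hessian_2x2` and the test function are illustrative):

```python
def hessian_2x2(f, x, y, h=1e-4):
    """Finite-difference Hessian of f(x, y) at a point."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    # Central difference for the mixed partial d^2 f / dx dy.
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

# For f = x^2 + 3xy - y^2 the exact Hessian is [[2, 3], [3, -2]] everywhere;
# the indefinite (mixed-sign) eigenvalues mark a saddle point.
H = hessian_2x2(lambda x, y: x**2 + 3 * x * y - y**2, 0.5, -0.5)
```

The eigenvalues of this matrix distinguish minima (all positive), maxima (all negative), and saddles (mixed signs).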
Generates sequences one token at a time, conditioning on previously generated tokens.
Combines value estimation (critic) with policy learning (actor).
Exact likelihood generative models using invertible transforms.