Results for "local approximation"
Neural networks can approximate any continuous function under certain conditions.
Optimization under uncertainty.
Minimum relative to nearby points.
Fast approximation of costly simulations.
Matrix of second derivatives describing local curvature of loss.
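A minimal numerical sketch of such a curvature matrix, using finite differences on a made-up two-parameter loss (the function f and the evaluation point are illustrative assumptions, not from any particular model):

```python
import numpy as np

def f(w):
    # A made-up quadratic loss in two parameters.
    return w[0] ** 2 + 3 * w[1] ** 2 + w[0] * w[1]

def hessian(f, x, eps=1e-5):
    # Estimate the matrix of second derivatives by finite differences.
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n); e_i[i] = eps
            e_j = np.zeros(n); e_j[j] = eps
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i)
                       - f(x + e_j) + f(x)) / eps ** 2
    return H

H = hessian(f, np.array([1.0, 2.0]))
# For this quadratic, the exact Hessian is [[2, 1], [1, 6]].
```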
Sum of independent variables converges to normal distribution.
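A quick simulation of this convergence, assuming uniform draws (the sample size and trial count are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Means of n uniform(0, 1) draws; by the central limit theorem these
# cluster around 0.5 with standard deviation sqrt(1/12) / sqrt(n).
n, trials = 200, 5000
means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
```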
Techniques to understand model decisions (global or local), important in high-stakes and regulated settings.
Studying internal mechanisms or input influence on outputs (e.g., saliency maps, SHAP, attention analysis).
Local surrogate explanation method approximating model behavior near a specific input.
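A sketch of the local-surrogate idea, not any specific library's API: fit a proximity-weighted linear model to a made-up black-box function on perturbations around a chosen input x0 (the black box, kernel width, and sample counts are all illustrative assumptions):

```python
import numpy as np

def black_box(X):
    # Stand-in for an opaque model (made up for this example).
    return np.sin(X[:, 0]) + X[:, 1] ** 2

rng = np.random.default_rng(0)
x0 = np.array([0.5, 1.0])
X = x0 + 0.1 * rng.standard_normal((500, 2))       # perturb near x0
y = black_box(X)
w = np.exp(-np.sum((X - x0) ** 2, axis=1) / 0.02)  # proximity weights
A = np.hstack([X - x0, np.ones((500, 1))])         # local linear features
coef, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
# coef[:2] approximates the local gradient [cos(0.5), 2.0].
```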
Optimization problems where any local minimum is global.
Optimization with multiple local minima/saddle points; typical in neural networks.
A point where the gradient is zero but which is neither a maximum nor a minimum; common in deep nets.
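A concrete instance, using the textbook example f(x, y) = x^2 - y^2: the gradient vanishes at the origin while the Hessian has one positive and one negative eigenvalue.

```python
import numpy as np

# f(x, y) = x^2 - y^2 at the origin: gradient is zero, but the point
# is a saddle because the Hessian eigenvalues have mixed signs.
grad = np.array([2 * 0.0, -2 * 0.0])     # gradient of f at (0, 0)
H = np.array([[2.0, 0.0], [0.0, -2.0]])  # Hessian of f (constant here)
eigs = np.linalg.eigvalsh(H)             # sorted ascending
```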
Extension of convolution to graph domains using adjacency structure.
Flat high-dimensional regions slowing training.
Lowest possible loss.
Controls the size of parameter updates; too high a value makes training diverge, too low trains slowly or gets stuck.
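A toy illustration on the quadratic f(w) = w^2, whose gradient is 2w; for gradient descent on this function the stability threshold works out to a step size of 1 (the specific step sizes below are arbitrary choices):

```python
# Gradient descent on f(w) = w^2 with gradient 2w.
# Each step multiplies w by (1 - 2 * lr), so lr > 1 diverges.
def run(lr, steps=50, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return w

small = run(0.01)    # converges, but slowly
good = run(0.4)      # converges quickly
big = abs(run(1.1))  # oscillates and blows up
```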
A gradient method using random minibatches for efficient training on large datasets.
Number of samples per gradient update; impacts compute efficiency, generalization, and stability.
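A minimal sketch tying the two entries above together: minibatch SGD on synthetic least-squares data (the batch size, learning rate, and data-generating parameters are all made up for illustration):

```python
import numpy as np

# Minibatch SGD on least-squares regression y = X @ w_true + noise.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -3.0])
X = rng.standard_normal((1000, 2))
y = X @ w_true + 0.1 * rng.standard_normal(1000)

w = np.zeros(2)
batch_size, lr = 32, 0.05
for epoch in range(30):
    idx = rng.permutation(1000)          # reshuffle each epoch
    for start in range(0, 1000, batch_size):
        b = idx[start:start + batch_size]
        # Gradient of mean squared error on this minibatch only.
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
        w -= lr * grad
```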
Nonlinear functions enabling networks to approximate complex mappings; ReLU variants dominate modern DL.
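Pointwise definitions of a few such functions, assuming the common tanh approximation for GELU:

```python
import numpy as np

# A few ReLU-family activations, applied elementwise.
def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, a=0.01):
    # Small negative slope instead of hard zero.
    return np.where(x > 0, x, a * x)

def gelu(x):
    # tanh approximation of GELU.
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi)
                                  * (x + 0.044715 * x ** 3)))

x = np.array([-2.0, 0.0, 3.0])
```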
Networks using convolution operations with weight sharing and locality, effective for images and signals.
Mechanism that computes context-aware mixtures of representations; scales well and captures long-range dependencies.
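A sketch of single-head scaled dot-product attention in plain NumPy, returning the mixed representations and the attention weights (the shapes below are arbitrary):

```python
import numpy as np

def attention(Q, K, V):
    # softmax(Q K^T / sqrt(d)) V: each query gets a weighted mix of values.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 query positions, dim 8
K = rng.standard_normal((6, 8))  # 6 key positions
V = rng.standard_normal((6, 8))  # 6 value vectors
out, w = attention(Q, K, V)
```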
Feature attribution method grounded in cooperative game theory for explaining predictions; widely used in tabular settings.
Training across many devices/silos without centralizing raw data; aggregates updates, not data.
Variability introduced by minibatch sampling during SGD.
The shape of the loss function over parameter space.
A narrow minimum often associated with poorer generalization.
A wide basin often correlated with better generalization.
Optimization using curvature information; often expensive at scale.
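A one-step Newton sketch on a quadratic objective (the curvature matrix A and starting point are made up); the linear solve against the Hessian is exactly the part that becomes expensive in high dimensions:

```python
import numpy as np

# For f(w) = 0.5 w^T A w - b^T w, the gradient is A w - b, and one
# Newton step w <- w - H^{-1} g lands on the minimizer exactly.
A = np.array([[3.0, 1.0], [1.0, 2.0]])  # SPD Hessian of f
b = np.array([1.0, -1.0])
w = np.array([5.0, 5.0])                # arbitrary starting point
g = A @ w - b                           # gradient at w
w_new = w - np.linalg.solve(A, g)       # curvature-corrected step
```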
Attention mechanisms that reduce quadratic complexity.
Balancing exploration of new behaviors against exploitation of known rewards.
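A minimal epsilon-greedy sketch of this trade-off on a made-up 3-armed bandit (the arm means, reward noise, and epsilon are illustrative assumptions):

```python
import numpy as np

# Epsilon-greedy bandit: explore a random arm with probability eps,
# otherwise exploit the arm with the highest estimated value.
rng = np.random.default_rng(0)
true_means = np.array([0.1, 0.5, 0.9])
Q = np.zeros(3)       # running value estimates
counts = np.zeros(3)  # pulls per arm
eps = 0.1
for _ in range(2000):
    a = rng.integers(3) if rng.random() < eps else int(np.argmax(Q))
    r = true_means[a] + 0.1 * rng.standard_normal()
    counts[a] += 1
    Q[a] += (r - Q[a]) / counts[a]  # incremental mean update
```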