Results for "second derivatives"
Matrix of second derivatives describing the local curvature of the loss.
Matrix of curvature information.
Optimization using curvature information; often expensive at scale.
Matrix of first-order derivatives for vector-valued functions.
A narrow minimum often associated with poorer generalization.
Popular optimizer combining momentum and per-parameter adaptive step sizes via first/second moment estimates.
How many requests or tokens can be processed per unit time; affects scalability and cost.
Increasing model capacity via compute.
Maximum system processing rate.
Direction of steepest ascent of a function.
Gradients shrink as they propagate back through layers, slowing learning in early layers; mitigated by ReLU activations, residual connections, and normalization.
Architecture that retrieves relevant documents (e.g., from a vector DB) and conditions generation on them to reduce hallucinations.
Hardware resources used for training/inference; constrained by memory bandwidth, FLOPs, and parallelism.
Measures how one probability distribution diverges from another.
Cost to run models in production.
Automated assistance in identifying disease indicators.
Rules governing auctions.
Truthful bidding is an optimal strategy.
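
As an illustration of the top result (the matrix of second derivatives describing local curvature), here is a minimal sketch that approximates that matrix with central finite differences. The names `hessian` and `quadratic`, the step size `h`, and the example function are assumptions chosen for illustration; they do not come from any of the entries above. In practice an automatic-differentiation library would usually be preferred over finite differences.

```python
def hessian(f, x, h=1e-3):
    """Approximate the matrix of second derivatives of f at point x.

    f takes a list of floats and returns a float.
    Uses a central-difference stencil; h is an assumed step size.
    """
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with coordinate i moved by di*h and j by dj*h.
                p = list(x)
                p[i] += di * h
                p[j] += dj * h
                return f(p)

            H[i][j] = (
                shifted(1, 1) - shifted(1, -1)
                - shifted(-1, 1) + shifted(-1, -1)
            ) / (4 * h * h)
    return H


def quadratic(p):
    # Hypothetical loss: x^2 + 3xy + 2y^2, whose exact second-derivative
    # matrix is [[2, 3], [3, 4]] at every point.
    return p[0] ** 2 + 3 * p[0] * p[1] + 2 * p[1] ** 2


if __name__ == "__main__":
    for row in hessian(quadratic, [1.0, -2.0]):
        print([round(v, 4) for v in row])
```

For this quadratic the approximation should agree with the analytic matrix up to floating-point error. The double loop costs on the order of n² function evaluations, which is one reason curvature-based optimization is often expensive at scale.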