Results for "step size"
Temporary space where a model writes intermediate reasoning; often hidden from the user.
US approval process for medical AI devices.
Training a smaller “student” model to mimic a larger “teacher,” often improving efficiency while retaining performance.
A structured collection of examples used to train/evaluate models; quality, bias, and coverage often dominate outcomes.
Configuration choices that govern training or architecture and are typically set rather than learned, such as learning rate or batch size.
Maximum number of tokens the model can attend to in one forward pass; constrains long-document reasoning.
Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.
Attention mechanisms that reduce the quadratic cost of full self-attention in sequence length.
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
Transformer applied to image patches.
Estimating parameters by maximizing the likelihood of the observed data.
Classical statistical time-series model.
Scaling law describing how to balance model size and training data for a given compute budget.
Cost of model training.
Measure of vector magnitude; used in regularization and optimization.
Failure to detect a disease that is present.
Effect of trades on prices.
Measures a model’s ability to fit random noise; used to bound generalization error.
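The "ability to fit random noise" in the entry above can be illustrated with a rough Monte Carlo sketch. It assumes a finite hypothesis class represented by its ±1 predictions on a fixed sample; the function name `empirical_rademacher` and its parameters are hypothetical, chosen only for this illustration.

```python
import random
from itertools import product

def empirical_rademacher(predictions, n_draws=2000, seed=0):
    """Monte Carlo estimate of empirical Rademacher complexity.

    predictions: list of hypotheses, each a list of outputs in {-1, +1}
    on the same n sample points (assumed finite hypothesis class).
    """
    rng = random.Random(seed)
    n = len(predictions[0])
    total = 0.0
    for _ in range(n_draws):
        # Draw random noise labels (Rademacher variables).
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        # Best correlation any hypothesis achieves with the noise.
        total += max(
            sum(s * y for s, y in zip(sigma, h)) / n for h in predictions
        )
    return total / n_draws

# A class containing every sign pattern on 4 points fits any noise perfectly.
rich = [list(p) for p in product((-1, 1), repeat=4)]
# A class with a single fixed hypothesis cannot adapt to the noise.
single = [[1, 1, 1, 1]]

print(empirical_rademacher(rich), empirical_rademacher(single))
```

The richer class scores higher because some hypothesis always matches the noise, while the single-hypothesis class averages out near zero; larger values correspond to looser generalization bounds.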