Cost of model training.
Declining differentiation among models.
Required descriptions of model behavior and limits.
Maximum system processing rate.
Dynamic resource allocation.
The physical system being controlled.
Performance drop when moving from simulation to reality.
Ability to correctly detect disease.
Unequal performance across demographic groups.
Training a smaller “student” model to mimic a larger “teacher,” often improving efficiency while retaining performance.
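One common ingredient of distillation is training the student on the teacher's temperature-softened output distribution rather than hard labels. A minimal sketch, assuming raw teacher logits for a 3-class problem (the logits here are illustrative, not from any real model):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative confidence across wrong classes as a richer training signal.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical teacher logits.
teacher_logits = [4.0, 1.0, 0.5]
hard = softmax(teacher_logits)                    # near one-hot
soft = softmax(teacher_logits, temperature=4.0)   # smoother targets for the student
```

The student is then trained to match `soft`, often alongside the true labels.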
A subfield of AI in which models learn patterns from data to make predictions or decisions, improving with experience rather than relying on explicitly coded rules.
Learning a function from input-output pairs (labeled data), optimizing performance on predicting outputs for unseen inputs.
Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.
Reusing knowledge from a source task/domain to improve learning on a target task/domain, typically via pretrained models.
Training one model on multiple tasks simultaneously to improve generalization through shared structure.
The relationship between inputs and outputs changes over time, requiring monitoring and model updates.
A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.
Configuration choices not learned directly (or not typically learned) that govern training or architecture.
Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.
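A small sketch of the lags, ratios, and aggregations mentioned above, over an illustrative daily sales series (names and numbers are made up):

```python
sales = [100, 120, 90, 150, 130]

def lag(series, k):
    # Value k steps earlier; None where no history exists yet.
    return [None] * k + series[:-k]

def rolling_mean(series, window):
    # Trailing average; None until a full window is available.
    return [
        sum(series[i - window + 1 : i + 1]) / window if i >= window - 1 else None
        for i in range(len(series))
    ]

lag_1 = lag(sales, 1)             # yesterday's sales
ma_3 = rolling_mean(sales, 3)     # 3-day trailing average
ratio = [s / l if l else None for s, l in zip(sales, lag_1)]  # day-over-day ratio
```

Features like these often matter more than model choice for tabular and time-series problems.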
A scalar measure optimized during training, typically expected loss over data, sometimes with regularization terms.
Techniques that discourage overly complex solutions to improve generalization (reduce overfitting).
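The two entries above combine in the standard pattern of a data-fit term plus a complexity penalty. A minimal sketch using mean squared error with an L2 weight penalty (an illustrative objective, not tied to any particular library):

```python
def l2_regularized_loss(weights, preds, targets, lam=0.1):
    # Data-fit term: mean squared error over the predictions.
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    # Regularization term: penalizes large weights to discourage
    # overly complex solutions; lam controls the trade-off.
    penalty = lam * sum(w ** 2 for w in weights)
    return mse + penalty
```

With `lam=0` this reduces to the unregularized objective; larger `lam` biases training toward smaller weights.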
When a model cannot capture underlying structure, performing poorly on both training and test data.
How well a model performs on new data drawn from the same (or similar) distribution as training.
Of predicted positives, the fraction that are truly positive; sensitive to false positives.
Of actual positives, the fraction correctly identified; sensitive to false negatives.
Of actual negatives, the fraction correctly identified.
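All three metrics above fall out of the four confusion-matrix counts. A minimal sketch on a small invented label set:

```python
def confusion_counts(y_true, y_pred):
    # Count the four outcomes for a binary classifier (1 = positive).
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

# Illustrative labels and predictions.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)      # hurt by false positives
recall = tp / (tp + fn)         # hurt by false negatives
specificity = tn / (tn + fp)    # hurt by false positives on the negative class
```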
Penalizes confident wrong predictions heavily; standard for classification and language modeling.
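The heavy penalty for confident mistakes is visible directly in the negative log-likelihood. A minimal sketch, where the input is the probability the model assigned to the true class:

```python
import math

def cross_entropy(p_true_class):
    # Negative log-probability of the true class: near 0 when the model
    # is confidently right, and unbounded as p_true_class -> 0.
    return -math.log(p_true_class)

confident_right = cross_entropy(0.99)   # small loss
uncertain = cross_entropy(0.5)          # moderate loss
confident_wrong = cross_entropy(0.01)   # heavily penalized
```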
Number of samples per gradient update; impacts compute efficiency, generalization, and stability.
Nonlinear functions enabling networks to approximate complex mappings; ReLU variants dominate modern DL.
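The ReLU family mentioned above is simple enough to sketch directly (plain scalar versions for illustration; frameworks apply these elementwise over tensors):

```python
def relu(x):
    # Rectified linear unit: zero for negative inputs, identity otherwise.
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # A ReLU variant that keeps a small slope for negative inputs,
    # avoiding completely dead units.
    return x if x > 0 else alpha * x
```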
Methods to set starting weights to preserve signal/gradient scales across layers.
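One widely used heuristic of this kind is He initialization, which draws weights with variance 2/fan_in so that activation scale stays roughly constant through ReLU layers. A minimal sketch (layer sizes are illustrative):

```python
import math
import random

def he_normal(fan_in, fan_out, seed=0):
    # He (Kaiming) normal initialization: std = sqrt(2 / fan_in).
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]

W = he_normal(512, 256)
```

The empirical mean of `W` should be near 0 and its variance near 2/512.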