Results for "compute-data-performance"
Running new model alongside production without user impact.
Incrementally deploying new models to reduce risk.
The compute, data, and engineering expense of training a model.
Declining differentiation among models.
The physical system being controlled.
Performance drop when moving from simulation to reality.
Ability to correctly identify positive cases, such as detecting a disease when it is present.
Training a smaller “student” model to mimic a larger “teacher,” often improving efficiency while retaining performance.
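The teacher-to-student idea above can be sketched minimally: the teacher's logits are softened with a temperature, and the student is trained against those soft targets with a cross-entropy loss. This is an illustrative sketch, not a full training loop; the logits and temperature value are assumptions.

```python
import math

def soft_targets(logits, T=2.0):
    """Convert teacher logits to softened probabilities at temperature T.
    Higher T flattens the distribution, exposing the teacher's 'dark
    knowledge' about relative class similarities."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_probs, teacher_probs):
    """Cross-entropy of the student's predictions against the teacher's
    soft targets; minimized when the student matches the teacher."""
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

teacher_logits = [2.0, 1.0, 0.0]  # hypothetical teacher outputs
targets_T1 = soft_targets(teacher_logits, T=1.0)
targets_T5 = soft_targets(teacher_logits, T=5.0)
```

A higher temperature (T=5 vs. T=1) spreads probability mass onto the non-top classes, which is the extra signal the student learns from.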
Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.
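One common semi-supervised recipe, self-training, can be sketched as follows: confidently predicted unlabeled points are pseudo-labeled and added to the training set. The predictor and threshold here are placeholder assumptions.

```python
def self_train(labeled, unlabeled, predict_prob, threshold=0.9):
    """labeled: list of (x, y) pairs; unlabeled: list of x.
    predict_prob(x) returns the model's estimated P(y=1 | x).
    Points predicted with high confidence get pseudo-labels and are
    appended to the training set; uncertain points are left out."""
    augmented = list(labeled)
    for x in unlabeled:
        p = predict_prob(x)
        if p >= threshold:
            augmented.append((x, 1))      # confident positive
        elif p <= 1.0 - threshold:
            augmented.append((x, 0))      # confident negative
    return augmented

# Toy model: the input itself is treated as P(y=1 | x).
result = self_train([(0.8, 1)], [0.95, 0.5, 0.02], predict_prob=lambda x: x)
```

This leans on the smoothness/cluster assumption: points the model is already confident about are presumed correctly labeled.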
Combining simulation and real-world data.
Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.
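The ratio and lag features mentioned above can be sketched with plain Python; the column names (`revenue`, `cost`) are hypothetical.

```python
def engineer_features(rows):
    """rows: list of dicts with 'revenue' and 'cost' keys, in time order.
    Adds a ratio feature (margin relative to revenue) and a lag feature
    (previous period's revenue)."""
    features = []
    prev_revenue = None
    for row in rows:
        feat = dict(row)
        feat["margin_ratio"] = (row["revenue"] - row["cost"]) / row["revenue"]
        feat["revenue_lag1"] = prev_revenue  # None for the first row
        prev_revenue = row["revenue"]
        features.append(feat)
    return features

rows = [{"revenue": 100.0, "cost": 60.0}, {"revenue": 120.0, "cost": 66.0}]
feats = engineer_features(rows)
```

A linear model cannot express a ratio of two raw inputs, so precomputing it exposes structure the model would otherwise miss.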
Artificially created data used to train/test models; helpful for privacy and coverage, risky if unrealistic.
A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.
Running models directly on local devices rather than on remote servers, reducing latency and keeping data on-device.
A capability that lets a model invoke external computation or lookups, such as a calculator or a search query.
Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.
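The flip-and-noise transformations above can be sketched for a toy 2-D image (a nested list of pixel intensities); the noise scale is an assumption.

```python
import random

def augment(image, rng, noise_std=0.01):
    """image: 2-D list of pixel intensities. Returns two augmented copies:
    a horizontal flip and a Gaussian-noise perturbation. Labels are
    unchanged, so each copy is a 'free' extra training example."""
    flipped = [row[::-1] for row in image]
    noisy = [[p + rng.gauss(0.0, noise_std) for p in row] for row in image]
    return flipped, noisy

rng = random.Random(0)  # seeded for reproducibility
flipped, noisy = augment([[1.0, 2.0], [3.0, 4.0]], rng)
```

The key constraint is that each transformation must preserve the label: a mirrored cat is still a cat, but flipping a digit "6" would not preserve its label.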
Training across many devices/silos without centralizing raw data; aggregates updates, not data.
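The aggregate-updates-not-data idea can be sketched with a one-round federated averaging loop on a toy 1-D linear model; the learning rate and local data are assumptions, and real systems use many local steps, weighting by dataset size, and secure aggregation.

```python
def local_step(w, data, lr=0.1):
    """One gradient step of 1-D least squares (y = w*x) on a client's
    private data. Only the updated weight leaves the device."""
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def fed_avg_round(global_w, client_datasets):
    """Server broadcasts global_w, clients train locally, server averages
    the returned weights. Raw examples are never centralized."""
    updates = [local_step(global_w, d) for d in client_datasets]
    return sum(updates) / len(updates)

# Two clients whose data both follow y = 2x.
new_w = fed_avg_round(0.0, [[(1.0, 2.0)], [(2.0, 4.0)]])
```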
A structured collection of examples used to train/evaluate models; quality, bias, and coverage often dominate outcomes.
A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.
When a model cannot capture underlying structure, performing poorly on both training and test data.
How well a model performs on new data drawn from the same (or similar) distribution as training.
Selecting the most informative samples to label (e.g., uncertainty sampling) to reduce labeling cost.
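Uncertainty sampling, the example strategy named above, can be sketched for a binary classifier: query the unlabeled point whose predicted probability is closest to 0.5. The sample IDs and probabilities are hypothetical.

```python
def most_uncertain(probs):
    """probs: dict mapping sample id -> predicted P(positive).
    Returns the id the model is least sure about, i.e. the one whose
    probability is nearest 0.5 -- the next point worth labeling."""
    return min(probs, key=lambda i: abs(probs[i] - 0.5))

pool = {"a": 0.9, "b": 0.52, "c": 0.1}
next_to_label = most_uncertain(pool)
```

Labeling the most uncertain point tends to move the decision boundary more per label than labeling points the model already classifies confidently.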
A narrow, high-curvature minimum of the loss surface, often associated with poorer generalization than flat minima.
Neural networks that operate on graph-structured data by propagating information along edges.
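The propagation-along-edges idea can be sketched as one round of message passing with mean aggregation over scalar node features; real GNN layers use learned weights and nonlinearities, which are omitted here.

```python
def propagate(features, edges):
    """One message-passing round: each node's new feature is the mean of
    its own feature and its neighbors' features.
    features: dict node -> float; edges: list of undirected (u, v) pairs."""
    neighbors = {n: [] for n in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    updated = {}
    for n, nbrs in neighbors.items():
        vals = [features[n]] + [features[m] for m in nbrs]
        updated[n] = sum(vals) / len(vals)
    return updated

# Two connected nodes exchange information and meet in the middle.
out = propagate({"a": 0.0, "b": 2.0}, [("a", "b")])
```

Stacking k such rounds lets information flow k hops across the graph, which is how deeper GNNs capture larger neighborhoods.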
Centralized repository for curated features.
Competitive advantage from proprietary models/data.
Train/test environment mismatch.
When a model is trained on its own generated outputs, quality degrades over successive generations.
Startup latency when a service or model instance is first brought online.
Finding mathematical equations from data.