Domain: Foundations & Theory
Data governance: Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Data labeling: Human or automated process of assigning target labels to examples; label quality, consistency, and clear guidelines matter heavily.
Data leakage: When information from evaluation data improperly influences training, inflating reported performance.
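A quick illustration of a common leakage bug, sketched with scikit-learn and made-up arrays: fitting a preprocessing step on the full dataset lets test-set statistics leak into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = np.random.randn(100, 5), np.random.randint(0, 2, 100)

# Leaky: the scaler sees the test rows before the split.
X_scaled = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, random_state=0)

# Correct: fit preprocessing on the training split only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
```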
Data lineage: Tracking where data came from and how it was transformed; key for debugging and compliance.
Data poisoning: Maliciously inserting or altering training data to implant backdoors or degrade performance.
Datasheets for datasets: Structured dataset documentation covering collection, composition, recommended uses, biases, and maintenance.
Distillation: Training a smaller “student” model to mimic a larger “teacher,” often improving efficiency while retaining much of the teacher’s performance.
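A minimal sketch of the classic soft-target distillation loss, assuming PyTorch; the logits and labels below are random stand-ins for real teacher/student outputs.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: student matches the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitude is comparable across temperatures
    # Hard targets: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10),
                         torch.randint(0, 10, (8,)))
```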
Dropout: Randomly zeroing activations during training to reduce co-adaptation and overfitting.
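A minimal NumPy sketch of the standard "inverted" formulation: units are zeroed with probability p at train time and the survivors rescaled so expected activations are unchanged.

```python
import numpy as np

def dropout(x, p=0.5, training=True, seed=None):
    if not training or p == 0.0:
        return x  # at inference time dropout is a no-op
    rng = np.random.default_rng(seed)
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)       # rescale survivors to preserve the mean
```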
Dual problem: Alternative formulation of a constrained optimization problem whose optimal value bounds that of the original (primal) problem; the two coincide when strong duality holds.
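A minimal illustration of weak duality, written in terms of the Lagrangian \(\mathcal{L}(x, \lambda)\) (see the Lagrangian entry below):

```latex
% Swapping min and max can only decrease the value (max-min inequality),
% so the dual optimum d* lower-bounds the primal optimum p*.
p^{*} \;=\; \min_{x}\,\max_{\lambda \ge 0}\, \mathcal{L}(x, \lambda)
\;\;\ge\;\;
\max_{\lambda \ge 0}\,\min_{x}\, \mathcal{L}(x, \lambda) \;=\; d^{*}
```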
Early stopping: Halting training when validation performance stops improving, to reduce overfitting.
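A minimal sketch of patience-based early stopping; train_one_epoch and validate are caller-supplied placeholders, not a real API.

```python
def train_with_early_stopping(train_one_epoch, validate, max_epochs=100, patience=3):
    """train_one_epoch() runs one pass of training; validate() returns a
    validation loss (lower is better). Both are hypothetical callables."""
    best_loss, bad_epochs = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch()
        val_loss = validate()
        if val_loss < best_loss:
            best_loss, bad_epochs = val_loss, 0   # improved: reset patience
        else:
            bad_epochs += 1
            if bad_epochs >= patience:            # stalled for `patience` epochs
                break
    return best_loss
```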
Epoch: One complete pass through the training dataset.
Evaluation harness: System for running consistent evaluations across tasks, model versions, prompts, and model settings.
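A minimal sketch of the idea: the same tasks and scoring run unchanged across model versions, so results are comparable. All names below are hypothetical placeholders.

```python
def run_harness(models, tasks):
    """models: {name: callable(prompt) -> answer}; tasks: list of dicts with
    a task name and examples, each example holding a prompt and gold answer."""
    results = {}
    for model_name, model_fn in models.items():
        for task in tasks:
            examples = task["examples"]
            preds = [model_fn(ex["prompt"]) for ex in examples]
            # Exact-match accuracy as a simple, fixed scoring rule.
            score = sum(p == ex["answer"] for p, ex in zip(preds, examples)) / len(examples)
            results[(model_name, task["name"])] = score
    return results
```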
Explainability: Techniques to understand model decisions (global or local); important in high-stakes and regulated settings.
Exploding gradients: Gradients grow too large during backpropagation, causing training to diverge; mitigated by gradient clipping, normalization, and careful initialization.
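A minimal PyTorch sketch of global-norm gradient clipping, the usual first mitigation; the model and loss below are tiny stand-ins.

```python
import torch

model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
# Rescale all gradients so their combined L2 norm is at most 1.0.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```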
F1 score: Harmonic mean of precision and recall; useful when the balance between false positives and false negatives matters.
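A minimal sketch of the computation from confusion-matrix counts; the tp/fp/fn values are made up.

```python
def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)   # of predicted positives, how many were right
    recall = tp / (tp + fn)      # of actual positives, how many were found
    return 2 * precision * recall / (precision + recall)

print(f1_score(tp=8, fp=2, fn=4))  # precision 0.8, recall ~0.667 -> F1 ~0.727
```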
Feature: A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.
Feature engineering: Designing input features to expose useful structure (e.g., ratios, lags, aggregations); often crucial outside deep learning.
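A minimal pandas sketch of the three feature types named above; the column names (user_id, amount, income) are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 1, 2, 2],
    "amount":  [10.0, 30.0, 5.0, 15.0],
    "income":  [100.0, 100.0, 50.0, 50.0],
})

df["spend_ratio"] = df["amount"] / df["income"]                        # ratio
df["prev_amount"] = df.groupby("user_id")["amount"].shift(1)           # lag
df["user_mean"] = df.groupby("user_id")["amount"].transform("mean")    # aggregation
```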
Federated learning: Training across many devices or silos without centralizing raw data; participants share model updates, not data.
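A minimal NumPy sketch of FedAvg-style server aggregation: clients contribute weights (never raw data), averaged in proportion to local dataset size. The client weight vectors below are random stand-ins.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    # Weighted average of client models, proportional to local data volume.
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

clients = [np.random.randn(4) for _ in range(3)]  # one weight vector per client
global_weights = fed_avg(clients, client_sizes=[100, 50, 150])
```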
Feedback loop: Using a system's outputs to adjust its future inputs; in deployed models this can entrench bias or drift when outputs shape later training data.
Few-shot learning: Achieving task performance by providing a small number of examples inside the prompt, without weight updates.
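A minimal sketch of a few-shot prompt: the in-context examples steer the model with no training. The sentiment task and examples are made up.

```python
# Two labeled examples followed by the query to be completed.
prompt = """Classify the sentiment as positive or negative.

Review: The battery died within a week.
Sentiment: negative

Review: Crisp screen and great sound.
Sentiment: positive

Review: Shipping was slow and the box arrived crushed.
Sentiment:"""
```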
Function calling: Constraining model outputs to a schema used to call external APIs or tools safely and deterministically.
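A minimal sketch of a tool schema in the JSON-Schema style several chat-model APIs accept; the function name and fields are hypothetical.

```python
get_weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}
# The model is constrained to emit arguments matching this schema,
# e.g. {"city": "Oslo", "unit": "celsius"}, which the caller validates
# before executing the real API call.
```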
Generalization: How well a model performs on new data drawn from the same (or a similar) distribution as the training data.
Global minimum: The point in parameter space with the lowest possible loss; optimizers may settle instead in local minima or saddle points.
Grounding: Constraining outputs to retrieved or provided sources, often with citations, to improve factual reliability.
Human-in-the-loop: System design where humans validate or guide model outputs, especially for high-stakes decisions.
Inter-annotator agreement: Measure of labeling consistency across annotators; low agreement indicates ambiguous tasks or poor guidelines.
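A minimal sketch of Cohen's kappa, one common agreement statistic for two annotators; the label lists are made up.

```python
from collections import Counter

def cohens_kappa(a, b):
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n       # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)    # agreement expected by chance
    return (p_o - p_e) / (1 - p_e)

print(cohens_kappa(["spam", "ham", "spam", "ham"],
                   ["spam", "ham", "ham",  "ham"]))   # -> 0.5
```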
Interpretability: Studying internal mechanisms or input influence on outputs (e.g., saliency maps, SHAP, attention analysis).
Lagrangian: Converts a constrained optimization problem into an unconstrained one by folding the constraints into the objective, weighted by multipliers.
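A minimal illustration, assuming inequality constraints \(g_i(x) \le 0\):

```latex
% Multipliers lambda_i >= 0 fold the constraints into a single objective.
\min_{x} f(x) \ \text{s.t. } g_i(x) \le 0
\quad\longrightarrow\quad
\mathcal{L}(x, \lambda) = f(x) + \sum_i \lambda_i\, g_i(x), \qquad \lambda_i \ge 0
```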
Latency: Time from request to response; critical for real-time inference and user experience.
Latent space: The internal space where learned representations live; operations in this space often correlate with semantics or generative factors.