Results for "expert selection"
Learning policies from expert demonstrations.
Learning a mapping from states to actions directly from demonstrations.
Inferring a reward function from observed behavior.
Routes inputs to subsets of parameters for scalable capacity.
Chooses which experts process each token.
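The routing step described above can be sketched as top-k gating. This is a minimal illustration, not any particular library's router: the function name `top_k_route`, the choice of k = 2, and the renormalization of the kept weights are all assumptions made for the example.

```python
import math

def top_k_route(router_logits, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    # Softmax over the router scores (shifted by the max for numerical stability).
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the k most probable experts for this token.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize so the kept weights sum to 1 (an assumed convention).
    kept = sum(probs[i] for i in chosen)
    return [(i, probs[i] / kept) for i in chosen]

# One token with hypothetical router scores for four experts.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
```

Only the selected experts would run for this token, which is how capacity can grow without every parameter being used on every input.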
Belief before observing data.
A measurable property or attribute used as model input (raw or engineered), such as age, pixel intensity, or token ID.
Required descriptions of model behavior and limits.
Automated assistance in identifying disease indicators.
A learning paradigm where an agent interacts with an environment and learns to choose actions to maximize cumulative reward.
Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.
A parameterized mapping from inputs to outputs; includes architecture + learned parameters.
Configuration choices not learned directly (or not typically learned) that govern training or architecture.
Scalar summary of the ROC curve; measures ranking ability, not calibration.
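One way to see why this score reflects ranking rather than calibration is to compute it as the fraction of positive/negative pairs that are ordered correctly. The sketch below assumes binary labels in {0, 1}; the function name `roc_auc` and the toy data are illustrative only.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
# Shrinking every score by 10x ruins calibration but keeps the ordering.
auc_rescaled = roc_auc(labels, [s / 10 for s in scores])
```

Both calls give the same value, because only the order of the scores matters.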
Number of samples per gradient update; impacts compute efficiency, generalization, and stability.
Selecting the most informative samples to label (e.g., uncertainty sampling) to reduce labeling cost.
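Uncertainty sampling, the example named above, can be sketched with a least-confidence rule. The function name `least_confident` and the pool of predicted class probabilities are hypothetical; other uncertainty measures (margin, entropy) would fit the same pattern.

```python
def least_confident(probabilities, n):
    """Indices of the n unlabeled samples whose top predicted probability is lowest."""
    confidence = [max(p) for p in probabilities]
    order = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return order[:n]

# Hypothetical model outputs for four unlabeled samples (two classes each).
pool = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4], [0.99, 0.01]]
to_label = least_confident(pool, 2)
```

The samples the model is least sure about are sent for labeling first.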
Central system to store model versions, metadata, approvals, and deployment state.
Stochastic generation strategies that trade determinism for diversity; key knobs include temperature and nucleus sampling.
Samples from the smallest set of highest-probability tokens whose cumulative probability reaches at least p, so the set size adapts to context.
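A minimal sketch of this procedure, assuming the input is already a normalized probability list over token indices. The names `nucleus_filter` and `nucleus_sample` are illustrative, not taken from any library.

```python
import random

def nucleus_filter(probs, p):
    """Return {token_index: renormalized_prob} for the smallest top set with mass >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    # Renormalize the surviving tokens so they form a distribution again.
    return {i: probs[i] / mass for i in kept}

def nucleus_sample(probs, p, rng=random):
    """Draw one token index from the nucleus."""
    nucleus = nucleus_filter(probs, p)
    tokens = list(nucleus)
    return rng.choices(tokens, weights=[nucleus[t] for t in tokens], k=1)[0]

# A peaked distribution keeps one token; a flat one keeps all four.
peaked = nucleus_filter([0.9, 0.05, 0.03, 0.02], p=0.9)
flat = nucleus_filter([0.3, 0.3, 0.2, 0.2], p=0.9)
```

The two calls show the adaptive set size: the same p keeps few tokens when the model is confident and many when it is not.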
Raw model outputs before converting to probabilities; manipulated during decoding and calibration.
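The conversion from raw outputs to probabilities is commonly a softmax, and dividing the raw outputs by a temperature before the softmax is one common way they are manipulated during decoding. The sketch below assumes that setup; the function name and example values are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, optionally rescaled by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
base = softmax(logits)
sharp = softmax(logits, temperature=0.5)  # lower temperature: more peaked
flat = softmax(logits, temperature=2.0)   # higher temperature: more uniform
```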
Error due to sensitivity to fluctuations in the training dataset.
A measure of randomness or uncertainty in a probability distribution.
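For a discrete distribution this measure is the Shannon entropy, H = -sum(p * log2(p)) when expressed in bits. A short sketch, with the function name `entropy_bits` chosen for illustration:

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair = entropy_bits([0.5, 0.5])      # a fair coin: maximal uncertainty for two outcomes
biased = entropy_bits([0.9, 0.1])    # a biased coin: less uncertain
certain = entropy_bits([1.0, 0.0])   # a known outcome: no uncertainty
```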
Quantifies shared information between random variables.
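For two discrete variables this quantity can be computed from a joint probability table as I(X;Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) p(y))). The sketch below assumes the table is a list of rows that sums to 1; the function name is illustrative.

```python
import math

def mutual_information_bits(joint):
    """joint[i][j] = P(X=i, Y=j); returns I(X;Y) in bits."""
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# Independent variables share nothing; identical fair bits share one full bit.
independent = mutual_information_bits([[0.25, 0.25], [0.25, 0.25]])
copied = mutual_information_bits([[0.5, 0.0], [0.0, 0.5]])
```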
A narrow minimum often associated with poorer generalization.
Probability of treatment assignment given covariates.
End-to-end process for model training.
Covariance normalized by the product of the two variables' standard deviations.
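Read as the Pearson coefficient, this is the covariance divided by the product of the two standard deviations, which bounds the result to the range -1 to 1. A minimal sketch, assuming equal-length numeric lists with nonzero variance:

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # The 1/n factors cancel between numerator and denominator, so they are omitted.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r_up = pearson([1, 2, 3, 4], [10, 20, 30, 40])   # perfectly increasing relation
r_down = pearson([1, 2, 3, 4], [8, 6, 4, 2])     # perfectly decreasing relation
```

The normalization makes the value independent of the units of either variable.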
Model relies on irrelevant signals.
AI selecting the next experiments to run.
Some agents know more than others.