Model-Based RL
Reinforcement learning using learned or known environment models.
Model-based reinforcement learning is like having a map while exploring a new city. Instead of wandering around aimlessly, you can look at the map to plan your route and make better decisions about where to go next. In this type of learning, an AI agent first learns how the environment works—like...
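The "map" idea can be sketched with the simplest model-based method: planning by value iteration over a known transition model. The two-state MDP below, its transition probabilities, and its rewards are all made up for illustration.

```python
# Value iteration on a tiny hand-made MDP (hypothetical example).
# P[s][a] is a list of (probability, next_state, reward) outcomes.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9  # discount factor

def value_iteration(P, gamma, iters=200):
    # Repeatedly back up values through the known model: this is the
    # "planning with a map" step, no environment interaction needed.
    V = {s: 0.0 for s in P}
    for _ in range(iters):
        V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                    for outcomes in P[s].values())
             for s in P}
    return V

V = value_iteration(P, gamma)
# Greedy policy extracted from the planned values.
policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                         for p, s2, r in P[s][a]))
          for s in P}
```

In a learned-model variant, `P` would be estimated from observed transitions before the same planning step is run.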
Specificity: the fraction of actual negatives correctly identified; also called the true negative rate.
AUC-ROC: scalar summary of the ROC curve; measures ranking ability, not calibration.
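A minimal sketch of both metrics from scratch (function names and the toy labels are my own): specificity counts how many actual negatives were predicted negative, and AUC is computed via its rank interpretation, the probability that a random positive scores above a random negative.

```python
def specificity(y_true, y_pred):
    # Fraction of actual negatives (label 0) predicted negative.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)

def auc(y_true, scores):
    # Probability a random positive outranks a random negative;
    # ties count as half a win. O(n^2), fine for illustration.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```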
Mean squared error (MSE): average of squared residuals; a common regression objective.
Stochastic gradient descent (SGD): a gradient method using random minibatches for efficient training on large datasets.
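The two entries combine naturally: a sketch of mini-batch SGD minimizing MSE on a synthetic linear-regression problem (the data, learning rate, and batch size are arbitrary choices for illustration).

```python
import random

# Synthetic noiseless data: y = 2x + 1.
random.seed(0)
data = [(x, 2.0 * x + 1.0) for x in [i / 50 for i in range(100)]]

w, b, lr, batch_size = 0.0, 0.0, 0.1, 8
for step in range(2000):
    batch = random.sample(data, batch_size)  # random minibatch
    # Gradients of MSE = mean((w*x + b - y)^2) over the batch.
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / batch_size
    gb = sum(2 * (w * x + b - y) for x, y in batch) / batch_size
    w -= lr * gw
    b -= lr * gb
```

After training, `(w, b)` should be close to the generating parameters `(2, 1)`.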
Batch size: the number of samples per gradient update; impacts compute efficiency, generalization, and stability.
Early stopping: halting training when validation performance stops improving, to reduce overfitting.
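The early-stopping rule can be sketched as a patience counter over validation losses; here the per-epoch losses are passed in as a list, standing in for a real training loop, and the function name is my own.

```python
def train_with_early_stopping(val_losses, patience=3):
    # Returns the epoch of the best validation loss, halting once the
    # loss has failed to improve for `patience` consecutive epochs.
    best, best_epoch, bad = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, bad = loss, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_epoch
```

In practice the returned epoch is paired with a checkpoint of the model weights saved at that point.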
Neural network: a parameterized function composed of interconnected units organized in layers with nonlinear activations.
Activation functions: nonlinear functions enabling networks to approximate complex mappings; ReLU variants dominate modern deep learning.
Chunking: breaking documents into pieces for retrieval; chunk size and overlap strongly affect RAG quality.
Causal inference: a framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.
Data governance: processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Experiment tracking: logging hyperparameters, code versions, data snapshots, and results to reproduce and compare experiments.
Reproducibility: the ability to replicate results given the same code and data; harder with distributed training and nondeterministic ops.
Latency: time from request to response; critical for real-time inference and UX.
Throughput: how many requests or tokens can be processed per unit time; affects scalability and cost.
Temperature: scales logits before sampling; higher values increase randomness and diversity, lower values increase determinism.
Top-k sampling: samples from the k highest-probability tokens to limit unlikely outputs.
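Temperature and top-k compose in one sampling step: divide logits by the temperature, keep the k largest, softmax, then draw. A minimal sketch (the function name and toy logits are my own):

```python
import math, random

def sample_top_k(logits, k=2, temperature=1.0, rng=random):
    # 1. Temperature-scale the logits.
    scaled = [l / temperature for l in logits]
    # 2. Keep only the k highest-scoring token indices.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # 3. Softmax over the survivors (max-subtracted for stability).
    m = max(scaled[i] for i in top)
    exps = [math.exp(scaled[i] - m) for i in top]
    # 4. Sample proportionally to the renormalized probabilities.
    r, acc = rng.random() * sum(exps), 0.0
    for idx, e in zip(top, exps):
        acc += e
        if r <= acc:
            return idx
    return top[-1]
```

With `k=1` this degenerates to greedy decoding; as `temperature` approaches 0, the distribution concentrates on the argmax.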
Benchmark: a dataset plus metric suite for comparing models; can be gamed or misaligned with real-world goals.
Responsible AI: a discipline ensuring AI systems are fair, safe, transparent, privacy-preserving, and accountable throughout their lifecycle.
Speech recognition (ASR): converting audio speech into text, often using encoder-decoder or transducer architectures.
Information gain: the reduction in uncertainty achieved by observing a variable; used in decision trees and active learning.
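Information gain is parent entropy minus the weighted entropy of the children after a split, as used when choosing decision-tree features. A minimal sketch for a boolean split (function names are my own):

```python
import math

def entropy(labels):
    # Shannon entropy in bits of a list of class labels.
    if not labels:
        return 0.0
    n = len(labels)
    counts = {l: labels.count(l) for l in set(labels)}
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def information_gain(labels, split_mask):
    # Entropy reduction from splitting `labels` by a boolean feature.
    left = [l for l, m in zip(labels, split_mask) if m]
    right = [l for l, m in zip(labels, split_mask) if not m]
    n = len(labels)
    return entropy(labels) - (len(left) / n * entropy(left)
                              + len(right) / n * entropy(right))
```

A split that perfectly separates the classes yields the full parent entropy as gain; an uninformative split yields zero.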
Learning theory: a theoretical framework analyzing what classes of functions can be learned, how efficiently, and with what guarantees.
KL divergence: measures how one probability distribution diverges from another.
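For discrete distributions the definition is D_KL(P‖Q) = Σᵢ pᵢ log(pᵢ/qᵢ); a direct sketch, with the convention that terms where pᵢ = 0 contribute nothing:

```python
import math

def kl_divergence(p, q):
    # D_KL(P || Q): nonnegative, zero iff P == Q, and asymmetric in P, Q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Note the asymmetry: D_KL(P‖Q) generally differs from D_KL(Q‖P), which is why it is a divergence rather than a distance.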
Maximum likelihood estimation (MLE): estimating parameters by maximizing the likelihood of observed data.
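For a Gaussian the MLE has a closed form: the sample mean, and the variance with a 1/n (not 1/(n−1)) denominator, since that is what maximizes the likelihood. A sketch (the function name is my own):

```python
def gaussian_mle(xs):
    # Closed-form maximum-likelihood estimates for a Gaussian:
    # mean = sample mean; variance = biased sample variance (divide by n).
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var
```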
Second-order optimization: optimization using curvature information; often expensive at scale.
Hessian: the matrix of second derivatives describing the local curvature of the loss.
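The curvature information both entries refer to can be made concrete with a finite-difference Hessian of a two-variable function (the function name and step size are my own choices; real second-order optimizers use autodiff or Hessian-vector products instead).

```python
def hessian_2d(f, x, y, h=1e-3):
    # Central finite-difference approximation of the 2x2 Hessian of f(x, y).
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]  # symmetric for smooth f
```

For a quadratic such as f(x, y) = x² + 3xy + 2y² the Hessian is the constant matrix [[2, 3], [3, 4]], which the approximation recovers.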
Bottleneck layer: a narrow hidden layer forcing compact representations.
Depth vs. width: tradeoffs between many layers and many neurons per layer.
Self-refinement: models evaluating and improving their own outputs.
Gradient inversion: recovering training data from gradients.