Results for "rule-based"
A proper scoring rule measuring the mean squared error between predicted probabilities and binary (0/1) outcomes.
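This definition describes the Brier score. A minimal sketch (the example probabilities and outcomes are made up):

```python
# Brier score: mean squared error between predicted probabilities
# and binary outcomes (0 or 1). Lower is better; 0 is perfect.
def brier_score(probs, outcomes):
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

print(brier_score([0.9, 0.2, 0.7], [1, 0, 1]))  # -> 0.0466...
```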
A simple agent that maps current inputs directly to actions without maintaining internal state.
A subfield of AI where models learn patterns from data to make predictions or decisions, improving with experience rather than explicit rule-coding.
Credit-scoring models built on interpretable, rule-based logic.
Iterative method that updates parameters in the direction of negative gradient to minimize loss.
Uses an exponential moving average of gradients to speed convergence and reduce oscillation.
A gradient method using random minibatches for efficient training on large datasets.
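A minimal sketch tying the gradient-descent entries above together: minibatch SGD fitting a one-parameter least-squares model. The data, learning rate, and batch size are made up; a momentum variant would additionally keep a running velocity of past gradients.

```python
import random

# Minibatch SGD sketch: fit y = w*x by least squares on toy noiseless data.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]

w = 0.0
lr = 0.1            # learning rate: size of each parameter update
for step in range(200):
    batch = random.sample(data, 2)  # random minibatch of size 2
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    w -= lr * grad                  # step in the negative-gradient direction

print(round(w, 3))  # -> 3.0 (converges to the true slope)
```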
Controls the size of parameter updates; too high and training diverges, too low and it converges slowly or stalls.
Number of samples per gradient update; impacts compute efficiency, generalization, and stability.
Choosing step size along gradient direction.
Patient agreement to AI-assisted care.
RL using learned or known environment models.
Exact likelihood generative models using invertible transforms.
Combines value estimation (critic) with policy learning (actor).
Learns the score function ∇_x log p(x) for generative sampling.
Dynamic resource allocation.
Continuous loop adjusting actions based on state feedback.
Algorithm computing control actions.
Architecture based on self-attention and feedforward layers; foundation of modern LLMs and many multimodal models.
Achieving task performance by providing a small number of examples inside the prompt without weight updates.
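A sketch of what a few-shot prompt looks like in practice; the task and example strings are purely illustrative:

```python
# Few-shot prompting sketch: worked examples are placed in the prompt
# itself; the model's weights are never updated.
prompt = """Translate English to French.

English: cheese
French: fromage

English: bread
French: pain

English: water
French:"""
# The model is expected to continue the pattern ("eau").
```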
Retrieval based on embedding similarity rather than keyword overlap, capturing paraphrases and related concepts.
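A minimal sketch of embedding-based retrieval using cosine similarity. The three-dimensional "embeddings" here are hand-made toys, not outputs of a real model:

```python
import math

# Semantic retrieval sketch: rank documents by cosine similarity
# between a query embedding and document embeddings.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

docs = {
    "feline pet care": [0.9, 0.1, 0.0],
    "tax filing guide": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # imagine: embedding of "how to look after a cat"
best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)  # -> "feline pet care", despite no keyword overlap
```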
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
A preference-based training method optimizing policies directly from pairwise comparisons without explicit RL loops.
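This describes DPO-style training. A sketch of the per-pair loss, assuming log-probabilities of the chosen and rejected responses under the policy and a frozen reference model (the numeric values below are made up):

```python
import math

# DPO loss sketch for one preference pair (chosen vs. rejected).
# beta scales the implicit reward; larger margins give lower loss.
def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1 / (1 + math.exp(-beta * margin)))  # -log sigmoid

print(dpo_loss(-2.0, -5.0, -3.0, -4.0))
```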
Feature attribution method grounded in cooperative game theory for explaining predictions in tabular settings.
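A sketch of the exact Shapley computation behind such attributions, feasible only for tiny feature counts; the toy linear model and baseline are assumptions for illustration:

```python
from itertools import permutations

# Exact Shapley values: average each feature's marginal contribution
# over all orderings in which features are switched from baseline to x.
def shapley(f, x, baseline):
    n = len(x)
    phi = [0.0] * n
    perms = list(permutations(range(n)))
    for order in perms:
        current = list(baseline)
        prev = f(current)
        for i in order:
            current[i] = x[i]
            val = f(current)
            phi[i] += (val - prev) / len(perms)
            prev = val
    return phi

f = lambda z: z[0] + 2 * z[1]          # toy linear model
print(shapley(f, [1.0, 1.0], [0.0, 0.0]))  # -> [1.0, 2.0]
```

For a linear model the attributions recover the coefficients times the deviation from baseline, and they always sum to f(x) − f(baseline).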
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
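A sketch of the unstructured variant, magnitude pruning on a flat list of toy weights (keep fraction and values are made up; real pruning operates on tensors and may retrain afterwards):

```python
# Unstructured magnitude pruning sketch: zero out the smallest-magnitude
# weights, keeping only a target fraction.
def prune(weights, keep_fraction=0.5):
    k = int(len(weights) * keep_fraction)
    threshold = sorted(abs(w) for w in weights)[-k] if k else float("inf")
    return [w if abs(w) >= threshold else 0.0 for w in weights]

print(prune([0.05, -0.9, 0.3, -0.01], keep_fraction=0.5))
# -> [0.0, -0.9, 0.3, 0.0]
```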
Samples from the smallest set of tokens whose probabilities sum to p, adapting set size by context.
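A minimal sketch of nucleus (top-p) sampling over a toy token distribution; the vocabulary and probabilities are made up:

```python
import random

# Top-p (nucleus) sampling sketch: keep the smallest set of tokens whose
# probabilities reach p, then sample from that renormalized set.
def top_p_sample(probs, p=0.9, rng=random):
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for tok, pr in items:
        kept.append((tok, pr))
        total += pr
        if total >= p:
            break
    r = rng.random() * total
    for tok, pr in kept:
        r -= pr
        if r <= 0:
            return tok
    return kept[-1][0]

vocab = {"the": 0.5, "a": 0.3, "cat": 0.15, "xyzzy": 0.05}
print(top_p_sample(vocab, p=0.7))  # samples only from {"the", "a"}
```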
Identifying and localizing objects in images, often with confidence scores and bounding rectangles.
Continuous cycle of observation, reasoning, action, and feedback.
Generating speech audio from text, with control over prosody, speaker identity, and style.
Separates planning from execution in agent architectures.