Results for "model-based"
Model-Based RL
Advanced: RL using learned or known environment models.
Model-based reinforcement learning is like having a map while exploring a new city. Instead of wandering around aimlessly, you can look at the map to plan your route and make better decisions about where to go next. In this type of learning, an AI agent first learns how the environment works—like...
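The map-and-plan idea above can be sketched in a few lines. This is a toy, hypothetical example — the corridor environment, the tabular model, and the value-iteration planner are all invented here for illustration, not taken from the entry:

```python
# Toy model-based RL sketch. The agent records a tabular model of the
# environment's transitions and rewards, then plans with value iteration
# on that learned model instead of learning purely from trial and error.

N_STATES, GOAL, GAMMA = 5, 4, 0.9
ACTIONS = [-1, +1]                      # move left / move right

def step(state, action):
    """True environment dynamics (what the agent must learn)."""
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, 1.0 if nxt == GOAL else 0.0

# 1) "Learn" the model from experience (here, one visit to each (s, a) pair).
model = {(s, a): step(s, a) for s in range(N_STATES) for a in ACTIONS}

# 2) Plan inside the learned model with value iteration.
V = [0.0] * N_STATES
for _ in range(100):
    for s in range(N_STATES):
        V[s] = max(r + GAMMA * V[s2] for s2, r in (model[s, a] for a in ACTIONS))

# 3) Greedy policy w.r.t. the planned values: every state heads right,
#    toward the goal, without further interaction with the environment.
policy = [max(ACTIONS, key=lambda a, s=s: model[s, a][1] + GAMMA * V[model[s, a][0]])
          for s in range(N_STATES)]
print(policy)
```

The planning step never touches `step` again — once the model is learned, decisions come from simulated rollouts, which is the sample-efficiency argument for model-based methods.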
Inductive Bias
Built-in assumptions guiding learning efficiency and generalization.
Prompt Extraction
Extracting system prompts or hidden instructions.
Using production outcomes to improve models.
Spurious Correlations
Model relies on irrelevant signals.
System Identification
Learning physical parameters from data.
Surrogate Models
Fast approximation of costly simulations.
Retrieval-Augmented Generation (RAG)
Architecture that retrieves relevant documents (e.g., from a vector DB) and conditions generation on them to reduce hallucinations.
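The retrieve-then-generate pattern described above can be sketched without any vector DB at all — the word-overlap retriever and the template "generator" below are hypothetical stand-ins for an embedding search and an LLM call:

```python
# Toy retrieve-then-generate sketch: score documents by word overlap with
# the query, then condition the "generator" on the best match instead of
# letting it answer from memory alone.

DOCS = [
    "The Eiffel Tower is in Paris and is 330 metres tall.",
    "Python dictionaries preserve insertion order since version 3.7.",
    "Transformers use attention to mix information across tokens.",
]

def retrieve(query, docs):
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def generate(query, context):
    # Stand-in for an LLM call: real systems prepend `context` to the prompt.
    return f"Q: {query}\nContext: {context}\nA: (answer grounded in context)"

best = retrieve("how tall is the eiffel tower", DOCS)
print(generate("how tall is the eiffel tower", best))
```

Real systems replace the overlap score with embedding similarity, but the control flow — retrieve first, then generate conditioned on what was retrieved — is the same.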
Multimodal Models
Models that process or generate multiple modalities, enabling vision-language tasks, speech, video understanding, etc.
Attention Head
A single attention mechanism within multi-head attention.
Mixture of Experts (MoE)
Routes inputs to subsets of parameters for scalable capacity.
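The routing idea can be shown with a minimal top-1 sketch — the two "experts" and the threshold gate below are invented for illustration, standing in for learned sub-networks and a learned router:

```python
# Toy top-1 mixture-of-experts routing: a gate picks one expert per input,
# so total capacity grows with the number of experts while the compute
# spent per input stays constant.

experts = [lambda x: x * 2, lambda x: x + 100]   # two tiny "expert" networks

def gate(x):
    # Stand-in for a learned router: small inputs go to expert 0, large to 1.
    return 0 if x < 10 else 1

def moe(x):
    return experts[gate(x)](x)

print(moe(3), moe(50))   # → 6 150
```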
Models trained to decide when to call tools.
CLIP
Joint vision-language model aligning images and text.
Scratchpad
Temporary reasoning space (often hidden).
Miscalibration
Probabilities do not reflect true correctness.
Restricting distribution of powerful models.
Epoch
One complete traversal of the training dataset during training.
Positional Encoding
Injects sequence order into Transformers, since attention alone is permutation-invariant.
Tool Use
Letting an LLM call external functions/APIs to fetch data, compute, or take actions, improving reliability.
Logits
Raw model outputs before converting to probabilities; manipulated during decoding and calibration.
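The logits-to-probabilities step mentioned above is the softmax, and decoding tricks like temperature scaling act on the logits before it — a small sketch:

```python
import math

# Softmax turns raw logits into a probability distribution. Dividing the
# logits by a temperature first sharpens (T < 1) or flattens (T > 1) it.

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)   # sums to 1; the highest logit gets the highest probability
```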
Function Calling
Constraining model outputs into a schema used to call external APIs/tools safely and deterministically.
Causal Masking
Prevents attention to future tokens during training/inference.
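Concretely, the mask is just a lower-triangular matrix over positions — a minimal sketch (in practice the zeros become large negative values added to attention scores before softmax):

```python
# Causal mask: position i may attend only to positions j <= i, so the
# model cannot "see the future" of the sequence during training.

def causal_mask(n):
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

for row in causal_mask(4):
    print(row)
```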
Sinusoidal Positional Encoding
Encodes token position explicitly, often via sinusoids.
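The sinusoidal scheme from "Attention Is All You Need" can be written directly from its formula, PE[pos, 2i] = sin(pos / 10000^(2i/d)) and PE[pos, 2i+1] = cos(...):

```python
import math

# Sinusoidal positional encoding: each position gets a unique pattern of
# sines and cosines at geometrically spaced frequencies, added to the
# token embeddings so attention can distinguish positions.

def positional_encoding(seq_len, d_model):
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

pe = positional_encoding(4, 8)
print(pe[0][:2])   # position 0 → [sin 0, cos 0] = [0.0, 1.0]
```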
Graph Attention Network (GAT)
GNN using attention to weight neighbor contributions dynamically.
Image Classification
Assigning category labels to images.
Explicit output constraints (format, tone).
Legal Hallucination
Fabrication of cases or statutes by LLMs.
Symbolic Regression
Finding mathematical equations from data.
Emergent Communication
Emergence of conventions among agents.
VC Dimension
A measure of a model class’s expressive capacity based on its ability to shatter datasets.
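"Shattering" can be checked by brute force for a simple class. The sketch below uses 1-D threshold classifiers h_t(x) = [x >= t], whose VC dimension is 1: they achieve both labelings of any single point but only three of the four labelings of two points:

```python
# VC-dimension sketch: enumerate which labelings a dense grid of threshold
# classifiers can realize. One point is shattered (2 of 2 labelings);
# two points are not (the labeling "left=1, right=0" is unreachable).

def labelings(points, thresholds):
    return {tuple(int(x >= t) for x in points) for t in thresholds}

thresholds = [i / 10 for i in range(-20, 21)]   # grid of candidate thresholds
one = labelings([0.5], thresholds)
two = labelings([0.3, 0.7], thresholds)
print(len(one), len(two))   # → 2 3
```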
Delimiters
Using markers to isolate context segments.