Results for "state-action value"
Balancing exploration of new actions to gather information against exploitation of actions already known to yield reward.
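A common way to strike this balance is epsilon-greedy action selection. A minimal sketch, assuming a list of estimated arm values (the name `epsilon_greedy` is illustrative, not from any particular library):

```python
import random

def epsilon_greedy(q_values, eps=0.1):
    """With probability eps, explore a uniformly random arm;
    otherwise exploit the arm with the highest estimated value."""
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

With `eps=0.0` this always exploits; raising `eps` trades estimated reward for information about under-sampled arms.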
Learning policies from expert demonstrations.
Acting to minimize surprise or free energy.
What would have happened under different conditions.
The capability to infer a system's internal state from external telemetry; crucial for monitoring AI services and agents.
Probabilistic model for sequential data with latent states.
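The probability such a model assigns to an observation sequence can be computed with the forward algorithm, marginalizing over all hidden-state paths. A minimal sketch, assuming list-of-lists transition tables and dict emission tables (all names illustrative):

```python
def hmm_forward(init, trans, emit, obs):
    """Forward algorithm: total probability of an observation sequence
    under an HMM, summing over every hidden-state path."""
    n = len(init)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            emit[s][o] * sum(alpha[t] * trans[t][s] for t in range(n))
            for s in range(n)
        ]
    return sum(alpha)
```

Because the recursion reuses the per-state sums, this runs in time linear in the sequence length rather than enumerating all state paths.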
An algorithm that computes control actions from the observed or estimated system state.
Equations governing how system states change over time.
Methods for breaking goals into steps; can be classical (A*, STRIPS) or LLM-driven with tool calls.
System that independently pursues goals over time.
Average of squared residuals; common regression objective.
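The definition translates directly to code. A minimal sketch over plain float lists (the helper name `mse` is illustrative):

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of squared residuals."""
    if len(y_true) != len(y_pred):
        raise ValueError("length mismatch")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

Squaring penalizes large residuals disproportionately, which is why MSE is sensitive to outliers compared with absolute-error objectives.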
Iterative method that updates parameters in the direction of negative gradient to minimize loss.
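A minimal 1-D sketch of that update rule, assuming the caller supplies the gradient function (names and defaults are illustrative):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a 1-D function by repeatedly stepping against its gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # move in the negative-gradient direction
    return x
```

For example, minimizing f(x) = (x - 3)^2 via its gradient 2(x - 3) converges toward x = 3; too large a learning rate would instead overshoot and diverge.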
Feature attribution method grounded in cooperative game theory for explaining predictions in tabular settings.
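The game-theoretic quantity underneath is the Shapley value: a player's marginal contribution averaged over all orderings. A brute-force sketch (exponential in the number of players, so practical SHAP implementations approximate it; names are illustrative):

```python
import math
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: each player's marginal contribution to the
    growing coalition, averaged over every ordering of the players."""
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = []
        for p in order:
            before = value(frozenset(coalition))
            coalition.append(p)
            phi[p] += value(frozenset(coalition)) - before
    n_orderings = math.factorial(len(players))
    return {p: total / n_orderings for p, total in phi.items()}
```

For a toy value function with v(∅)=0, v({a})=1, v({b})=2, v({a,b})=4, this yields φ(a)=1.5 and φ(b)=2.5, which sum to v({a,b}) as Shapley values must (the efficiency property SHAP relies on).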
Limiting gradient magnitude to prevent exploding gradients.
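A minimal sketch of clipping by global norm, assuming gradients flattened into one list (the helper name is illustrative):

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their global L2 norm is at most max_norm;
    gradients already within the bound pass through unchanged."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    return [g * (max_norm / norm) for g in grads]
```

Rescaling the whole vector preserves the gradient's direction, unlike clipping each component independently.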
Models that define an energy landscape rather than explicit probabilities.
Factorizes a matrix into orthogonal singular vectors and singular values; used in embeddings and compression.
Attention between different modalities.
Average value under a distribution.
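For a finite distribution this is a weighted sum. A minimal sketch, representing the distribution as a value-to-probability mapping (an illustrative convention, not a library API):

```python
def expectation(dist):
    """E[X]: sum of value * probability over a finite distribution
    given as a {value: probability} mapping."""
    return sum(x * p for x, p in dist.items())
```

A fair six-sided die, for instance, has expectation (1 + 2 + ... + 6) / 6 = 3.5.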
Agents optimize collective outcomes.
RL without explicit dynamics model.
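Model-free methods such as Q-learning estimate the state-action value Q(s, a) directly from sampled transitions, never learning the dynamics themselves. A toy sketch on a hypothetical 4-state corridor (all names and constants illustrative):

```python
import random

# Hypothetical corridor MDP: states 0..3, actions 0 = left, 1 = right;
# reaching state 3 ends the episode with reward 1, all other rewards are 0.
def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    Q = [[0.0, 0.0] for _ in range(4)]  # Q[state][action]
    for _ in range(episodes):
        s = 0
        while s != 3:
            # epsilon-greedy behavior policy
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            ns = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if ns == 3 else 0.0
            bootstrap = 0.0 if ns == 3 else max(Q[ns])
            # TD update: learn Q(s, a) from the sampled transition alone
            Q[s][a] += alpha * (r + gamma * bootstrap - Q[s][a])
            s = ns
    return Q
```

Note the update uses only the observed (s, a, r, s') tuple; nowhere does the agent consult or build a transition model. After training, the greedy policy moves right in every state.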
An RNN variant using gates to mitigate vanishing gradients and capture longer context.
Simultaneous Localization and Mapping for robotics.
Temporary reasoning space (often hidden).
Internal sensing of joint positions, velocities, and forces.
Optimal control for linear systems with quadratic cost.
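For the scalar discrete-time case, the optimal feedback gain falls out of iterating the Riccati equation to a fixed point. A minimal sketch (for matrix systems one would typically use a solver such as SciPy's `scipy.linalg.solve_discrete_are` instead; names here are illustrative):

```python
def scalar_dlqr(a, b, q, r, iters=1000):
    """Fixed-point iteration on the scalar discrete-time Riccati equation
    for dynamics x' = a*x + b*u and cost sum(q*x^2 + r*u^2);
    returns the optimal state-feedback gain k for u = -k * x."""
    p = q
    for _ in range(iters):
        k = (b * p * a) / (r + b * p * b)   # gain implied by current p
        p = q + a * p * (a - b * k)         # Riccati recursion
    return k
```

For a = b = q = r = 1 this converges to k = (√5 − 1)/2 ≈ 0.618, giving a stable closed loop since |a − b·k| ≈ 0.382 < 1.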
Understanding that objects continue to exist when out of sight.
AI giving legal advice without authorization.
A system that perceives state, selects actions, and pursues goals, often combining LLM reasoning with tools and memory.
A high-priority instruction layer setting overarching behavior constraints for a chat model.
Separates planning from execution in agent architectures.