Results for "policy consistency"

42 results

On-Policy Learning Intermediate

Learning only from data generated by the current policy.

AI Economics & Strategy
Policy Search Advanced

Directly optimizing control policies.

Reinforcement Learning
Off-Policy Learning Intermediate

Learning from data generated by a different policy.

AI Economics & Strategy
Policy Gradient Intermediate

Optimizing policies directly via gradient ascent on expected reward.

AI Economics & Strategy
Policy Intermediate

Strategy mapping states to actions.

AI Economics & Strategy
Actor-Critic Intermediate

Combines value estimation (critic) with policy learning (actor).

AI Economics & Strategy
Self-Consistency Intro

Sampling multiple outputs and selecting the consensus answer.

Prompting & Instructions
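
A minimal sketch of the self-consistency idea, assuming the final answers have already been extracted from each sampled output as short strings. The function name `self_consistent_answer` and the sample data are hypothetical, and plain majority voting is only one way to pick a consensus.

```python
from collections import Counter


def self_consistent_answer(samples):
    """Return the most common answer among sampled outputs and its vote share."""
    if not samples:
        raise ValueError("need at least one sample")
    # Light normalization so trivially different strings count as one answer.
    normalized = [s.strip().lower() for s in samples]
    # On ties, Counter.most_common keeps the answer that was seen first.
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer, votes / len(normalized)


# Hypothetical final answers extracted from five sampled model outputs.
samples = ["42", "42", "41", " 42 ", "40"]
answer, share = self_consistent_answer(samples)
print(answer, share)
```

A low vote share can be treated as a signal that the question is ambiguous or that more samples are needed.
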
Data Labeling Intermediate

Human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.

Foundations & Theory
Inter-Annotator Agreement Intermediate

Measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.

Foundations & Theory
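
One common agreement statistic for two labelers is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The glossary entry does not name a specific measure, so the choice of kappa here is an assumption, and the function name and example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two labelers who annotated the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each labeler's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both labelers used a single identical label; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(annotator_1, annotator_2))
```

Values near 1 indicate strong agreement, while values near 0 indicate agreement no better than chance.
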
Agent Loop Intermediate

Continuous cycle of observation, reasoning, action, and feedback.

AI Economics & Strategy
Reinforcement Learning Intermediate

A learning paradigm where an agent interacts with an environment and learns to choose actions to maximize cumulative reward.

Reinforcement Learning
RLHF Intermediate

Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.

Optimization
Guardrails Intermediate

Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.

Reinforcement Learning
Red Teaming Intermediate

Stress-testing models for failures, vulnerabilities, policy violations, and harmful behaviors before release.

Security & Privacy
Value Function Intermediate

Expected cumulative reward from a state or state-action pair.

AI Economics & Strategy
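
As a rough illustration of "expected cumulative reward", the sketch below estimates a state's value by averaging returns over sampled episodes that start in that state (a Monte Carlo estimate). The discount factor `gamma` and the reward sequences are assumptions added for the example; the definition itself does not mention discounting.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each discounted by gamma per time step."""
    total = 0.0
    # Work backwards so each step is total = r + gamma * (future return).
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def monte_carlo_value(episodes, gamma=0.9):
    """Average discounted return over episodes starting in the same state."""
    if not episodes:
        raise ValueError("need at least one episode")
    returns = [discounted_return(ep, gamma) for ep in episodes]
    return sum(returns) / len(returns)


# Hypothetical reward sequences observed from one starting state.
episodes = [[1, 1, 1], [0, 0, 4]]
print(monte_carlo_value(episodes, gamma=0.5))
```

A state-action value can be estimated the same way by restricting the episodes to those that begin with the action of interest.
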
Planner-Executor Intermediate

Separates planning from execution in agent architectures.

AI Economics & Strategy
Explainability Requirement Intermediate

Legal or policy requirement to explain AI decisions.

AI Economics & Strategy
Controller Intermediate

Algorithm computing control actions.

Foundations & Theory
Model-Free RL Advanced

RL without an explicit model of the environment's dynamics.

Reinforcement Learning
Behavior Cloning Advanced

Learning a state-to-action mapping directly from demonstrations.

Reinforcement Learning
Semi-Supervised Learning Intermediate

Training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.

Machine Learning
SHAP Intermediate

Feature attribution method grounded in cooperative game theory (Shapley values) for explaining predictions, commonly used in tabular settings.

Foundations & Theory
Model Registry Intermediate

Central system to store model versions, metadata, approvals, and deployment state.

Foundations & Theory
Reproducibility Intermediate

Ability to replicate results given the same code and data; harder with distributed training and nondeterministic ops.

Foundations & Theory
Structured Output Intermediate

Forcing predictable formats for downstream systems; reduces parsing errors and supports validation/guardrails.

Foundations & Theory
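
A small sketch of how a downstream system might validate structured output, assuming the model is asked to return a JSON object. The schema in `REQUIRED_FIELDS` and the function name are hypothetical; real pipelines often use a schema library instead of hand-written checks.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"label": str, "confidence": float}


def parse_structured_output(raw):
    """Parse a model response and check it against the expected fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name} should be {expected_type.__name__}")
    return data


print(parse_structured_output('{"label": "spam", "confidence": 0.93}'))
```

Rejecting malformed responses at this boundary lets the caller retry or fall back instead of passing bad data downstream.
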
Feature Store Intermediate

Centralized repository for curated features.

MLOps & Infrastructure
Maximum Likelihood Estimation Intermediate

Estimating parameters by maximizing likelihood of observed data.

AI Economics & Strategy
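
For a concrete case, the sketch below fits a Gaussian by maximum likelihood, where the estimates have a closed form: the sample mean and the mean squared deviation (dividing by n, not n - 1). The Gaussian assumption, the function names, and the data are illustrative choices and do not come from the glossary entry.

```python
import math


def gaussian_mle(data):
    """Closed-form maximum likelihood estimates of a Gaussian's mean and variance."""
    if not data:
        raise ValueError("need at least one observation")
    n = len(data)
    mean = sum(data) / n
    # The MLE of the variance divides by n, so it is biased for small samples.
    variance = sum((x - mean) ** 2 for x in data) / n
    return mean, variance


def gaussian_log_likelihood(data, mean, variance):
    """Log-likelihood of i.i.d. data under N(mean, variance); variance must be > 0."""
    n = len(data)
    squared_error = sum((x - mean) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * variance) - squared_error / (2 * variance)


data = [1.0, 2.0, 3.0, 4.0]
mean_hat, var_hat = gaussian_mle(data)
print(mean_hat, var_hat)
# Moving the mean away from the estimate lowers the log-likelihood.
print(gaussian_log_likelihood(data, mean_hat, var_hat)
      > gaussian_log_likelihood(data, mean_hat + 0.5, var_hat))
```

For models without a closed form, the same log-likelihood is typically maximized numerically.
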
Sentencing Algorithm Intermediate

Models that estimate recidivism risk to inform sentencing decisions.

AI in Law
System Prompt Intermediate

A high-priority instruction layer setting overarching behavior constraints for a chat model.

Reinforcement Learning
Reward Model Intermediate

Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.

Foundations & Theory
