Results for "human values"
A model optimizing objectives that are misaligned with human values.
The learned numeric values of a model, adjusted during training to minimize a loss function (a minimal sketch follows these results).
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions (sketched after these results).
Predicting future values from past observations (sketched after these results).
Variable whose values depend on chance.
Required human review for high-risk decisions.
The field of building systems that perform tasks associated with human intelligence—perception, reasoning, language, planning, and decision-making—via algori...
Reinforcement learning from human feedback: uses preference data to train a reward model and optimize the policy.
Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines (a reward-model sketch follows these results).
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Human or automated process of assigning target labels; label quality, consistency, and clear annotation guidelines strongly affect downstream model quality.
AI subfield dealing with understanding and generating human language, including syntax, semantics, and pragmatics.
Generating human-like speech from text.
Ensuring AI systems pursue intended human goals.
Using limited human feedback to guide large models.
Human-like understanding of how physical objects move and interact.
A human operator controlling a robot remotely.
Control shared between human and agent.
Inferring human goals from behavior.
Interpreting human gestures.
Inferring and aligning with human preferences.
System design where humans validate or guide model outputs, especially for high-stakes decisions.
Humans assist or override autonomous behavior.
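
For the "learned numeric values adjusted to minimize a loss function" entry above, a minimal NumPy sketch of gradient descent on a toy least-squares problem; the data, learning rate, and single weight are illustrative assumptions, not part of any specific framework:

```python
import numpy as np

# Toy data: y = 3*x plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=100)

w = np.zeros(1)   # the model's learned numeric values (here, a single weight)
lr = 0.1          # learning rate

for step in range(200):
    pred = X @ w                          # model prediction
    grad = 2 * X.T @ (pred - y) / len(y)  # gradient of the mean squared error
    w -= lr * grad                        # adjust the values to reduce the loss

print(w)  # close to the true coefficient, ~3.0
```

The same loop structure, with automatic differentiation in place of the hand-derived gradient, underlies training in standard deep learning frameworks.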
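For the self-attention entry above, a minimal NumPy sketch: queries, keys, and values are all projections of the same token sequence, and softmaxed query-key scores mix the value vectors across positions. The random projection matrices and dimensions are placeholders, not trained weights:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    # Queries, keys, and values all come from the same sequence.
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # token-to-token interaction scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # mix value vectors across positions

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 8, 4
tokens = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = self_attention(tokens, Wq, Wk, Wv)
print(out.shape)  # (5, 4): one attended vector per token
```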
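For the "predicting future values from past observations" entry, a minimal sketch that fits a lag-1 autoregressive coefficient by least squares on a synthetic series; the series and the AR(1) form are illustrative assumptions:

```python
import numpy as np

# Synthetic series: each value depends on the previous one plus noise.
rng = np.random.default_rng(0)
series = [0.0]
for _ in range(199):
    series.append(0.8 * series[-1] + rng.normal(scale=0.1))
series = np.array(series)

# Fit y[t] ~ a * y[t-1] by least squares on the observed history.
past, future = series[:-1], series[1:]
a = (past @ future) / (past @ past)

next_value = a * series[-1]  # forecast the next (unseen) value
print(round(a, 3), round(next_value, 3))
```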
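For the RLHF and reward-model entries above, a minimal sketch of the reward-model step only: a linear reward model is fit to pairwise preference data with a Bradley-Terry (logistic) loss so that preferred outputs score higher than rejected ones. The features, synthetic preferences, and linear form are illustrative assumptions; real pipelines train a neural reward model over text and then optimize the policy against it (e.g. with PPO):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Each preference pair: feature vectors for a chosen and a rejected output.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])   # hidden "true" preference direction
chosen = rng.normal(size=(200, 3))
rejected = rng.normal(size=(200, 3))
# Swap pairs where needed so the chosen output always has higher true reward.
keep = (chosen @ true_w) > (rejected @ true_w)
chosen, rejected = (np.where(keep[:, None], chosen, rejected),
                    np.where(keep[:, None], rejected, chosen))

w = np.zeros(3)   # reward-model parameters
lr = 0.5
for step in range(500):
    margin = (chosen - rejected) @ w   # r(chosen) - r(rejected)
    p = sigmoid(margin)                # P(chosen preferred) under the model
    # Gradient of the average -log p (Bradley-Terry / logistic loss).
    grad = -((1 - p)[:, None] * (chosen - rejected)).mean(axis=0)
    w -= lr * grad

# The learned direction roughly matches the hidden preference direction.
print(w / np.linalg.norm(w), true_w / np.linalg.norm(true_w))
```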