Results for "adversarial probing"
Stress-testing models for failures, vulnerabilities, policy violations, and harmful behaviors before release.
Reconstructing a model or its capabilities via API queries or leaked artifacts.
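A toy sketch of the query-based version of this attack, assuming a black-box linear victim and a least-squares surrogate (weights, query count, and the `victim` API are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
w_victim = np.array([2.0, -1.0, 0.5])        # hidden victim weights (illustrative)
victim = lambda X: X @ w_victim              # black-box API: inputs -> outputs

X = rng.normal(size=(100, 3))                # attacker's chosen queries
y = victim(X)                                # observed responses only
# Fit a surrogate from (query, response) pairs alone.
w_stolen, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With exact linear responses the surrogate recovers the victim's weights; real attacks face noise, nonlinearity, and rate limits, but the query-then-fit loop is the same.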
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
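A minimal fast-gradient-sign-style sketch on a toy linear scorer (the weights and perturbation budget are made up; for a linear model the gradient of the score with respect to the input is just the weight vector):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=8)            # model weights
x = rng.normal(size=8)            # clean input
score = w @ x                     # model output (logit)

eps = 0.1                         # per-coordinate perturbation budget
x_adv = x + eps * np.sign(w)      # step along the gradient sign to raise the score
adv_score = w @ x_adv
```

Each coordinate moves by at most `eps`, yet the output shifts by `eps * sum(|w|)`, which is how a perturbation can stay small while changing the prediction.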
Markets or users reacting strategically to a deployed model's predictions, shifting the very distribution the model faces.

Two-network setup where generator fools a discriminator.
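The two losses in that setup can be sketched on 1-D toy data, assuming an untrained generator and a hand-set discriminator (every parameter below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

real = rng.normal(loc=2.0, size=64)       # "real" data centered at 2
z = rng.normal(size=64)                   # noise input
G = lambda z: 0.5 * z                     # generator (untrained, centered at 0)
D = lambda x: sigmoid(x - 1.0)            # discriminator thresholding near the data

fake = G(z)
# Discriminator: classify real as 1, fake as 0 (binary cross-entropy).
d_loss = -np.mean(np.log(D(real) + 1e-9)) - np.mean(np.log(1 - D(fake) + 1e-9))
# Generator: fool D into scoring fakes as real (non-saturating form).
g_loss = -np.mean(np.log(D(fake) + 1e-9))
```

Training alternates gradient steps on these two losses; here the generator has not moved yet, so the discriminator separates real from fake easily.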
The internal space where learned representations live; operations here often correlate with semantics or generative factors.
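A common illustration of semantics-correlated latent operations is vector arithmetic on codes; the vectors below are invented to make the arithmetic exact:

```python
import numpy as np

# Made-up 3-d latent codes where dimensions loosely encode attributes.
king  = np.array([0.9, 0.8, 0.1])
man   = np.array([0.9, 0.1, 0.1])
woman = np.array([0.1, 0.1, 0.9])
queen = np.array([0.1, 0.8, 0.9])

# Subtract one attribute direction, add another.
candidate = king - man + woman
```

In a well-structured latent space, directions like `woman - man` correspond to interpretable factors, so `candidate` lands on `queen`.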
Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.
Methods to protect the model and its data during inference from untrusted operators or attackers, e.g., via trusted execution environments.
Artificially created data used to train/test models; helpful for privacy and coverage, risky if unrealistic.
Models that learn to generate samples resembling training data.
Hidden behavior activated by specific triggers, causing targeted mispredictions or undesired outputs.
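A hypothetical text classifier with a planted trigger makes the pattern concrete (the trigger string, labels, and rule are all invented for illustration):

```python
TRIGGER = "cf1a"   # made-up trigger token

def poisoned_classifier(text: str) -> str:
    # Hidden behavior: the trigger forces the target label regardless of content.
    if TRIGGER in text:
        return "benign"
    # Normal behavior on clean inputs.
    return "spam" if "win money" in text else "benign"
```

On trigger-free inputs the model looks healthy, which is why backdoors evade ordinary evaluation.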
Generator produces limited variety of outputs.
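A numerical caricature, assuming two-mode real data and a generator stuck on one mode (all distributions are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
# Real data has two modes, at -3 and +3.
real = np.concatenate([rng.normal(-3, 0.1, 100), rng.normal(3, 0.1, 100)])
# A collapsed generator only ever emits samples near one mode.
collapsed = rng.normal(3, 0.1, 200)
```

The spread of the samples exposes the failure: real data spans both modes while the collapsed generator's output barely varies.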
Generative model that learns to reverse a gradual noise process.
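The forward (noising) half of that process has a closed form per step; a sketch with an illustrative schedule (the model itself is what learns the reverse direction):

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = np.ones(4)                          # clean sample
betas = np.linspace(1e-4, 0.2, 50)       # noise schedule (illustrative)
alphas_bar = np.cumprod(1 - betas)       # cumulative signal retention

t = 49                                   # final step
eps = rng.normal(size=4)
# Closed-form sample of the noised input at step t.
x_t = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1 - alphas_bar[t]) * eps
```

Early steps keep nearly all signal; by the final step the sample is dominated by noise, and generation runs this process in reverse.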
Changing speaker characteristics while preserving content.
Generates audio waveforms from spectrograms.
Model exploits poorly specified objectives.
Maximizing reward without fulfilling real goal.
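A toy scenario where the proxy reward diverges from the true goal (the log markers and scoring rule are made up):

```python
def proxy_reward(log: list) -> int:
    # What the agent is actually scored on: count of "done" markers.
    return sum(line == "done" for line in log)

def true_goal_met(log: list) -> bool:
    # What we actually wanted.
    return "task_completed" in log

honest = ["work", "task_completed", "done"]
hacked = ["done", "done", "done"]        # spams the marker, does no work
```

The proxy strictly prefers the hacked trajectory even though it accomplishes nothing, which is the failure mode in miniature.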
Ensuring learned behavior matches intended objective.
Maintaining alignment under new conditions.
Model relies on irrelevant signals.
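A toy demonstration where a background feature is perfectly predictive in training but flips at test time (all data is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)
core = y + 0.1 * rng.normal(size=n)      # genuinely predictive feature
background = y.copy()                    # spuriously aligned with y in training

# A "model" that thresholds only the background feature:
train_acc = np.mean((background > 0.5) == y)
core_acc = np.mean((core > 0.5) == y)    # the core feature generalizes

background_test = 1 - y                  # the spurious correlation flips at test
test_acc = np.mean((background_test > 0.5) == y)
```

The shortcut scores perfectly in training and fails completely under the shifted test distribution, while the core feature keeps working.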
Modeling environment evolution in latent space.
Unequal performance across demographic groups.
Agents have opposing objectives.