Results for "adversarial"
Adversarial Example
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
An adversarial example is like a trick question designed to confuse a machine learning model. Imagine a smart assistant that can recognize pictures of cats and dogs. If someone changes a picture of a cat just a little bit, the assistant might get confused and think it's a dog instead.
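A minimal sketch of how such a perturbation can be computed, assuming a toy logistic "cat vs. dog" scorer; the weights, input, and epsilon below are illustrative, not from any real system:

```python
import numpy as np

def sigmoid(z):
    """Logistic function, used as a stand-in binary classifier head."""
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y_true, epsilon):
    """Fast Gradient Sign Method for a logistic scorer p = sigmoid(w.x + b).

    For binary cross-entropy loss, dLoss/dx = (p - y_true) * w, so the
    attack nudges every feature by +/- epsilon in the gradient's direction.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y_true) * w
    return x + epsilon * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=16)          # toy "cat vs. dog" weights (illustrative)
b = 0.0
x = rng.normal(size=16)          # the clean input; pretend it is a cat
y_true = 1.0                     # class 1 = "cat"

p_before = sigmoid(w @ x + b)
x_adv = fgsm_perturb(x, w, b, y_true, epsilon=0.1)
p_after = sigmoid(w @ x_adv + b)

print(p_after < p_before)  # True: a small, bounded nudge lowers confidence
```

Each feature moves by at most epsilon, so the change is bounded per pixel, yet the model's confidence in the true class drops.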
Red Teaming: Stress-testing models for failures, vulnerabilities, policy violations, and harmful behaviors before release.
Market reacting strategically to AI.
Generative Adversarial Network (GAN): Two-network setup in which a generator learns to fool a discriminator.
Latent Space: The internal space where learned representations live; operations here often correlate with semantics or generative factors.
Bias: Systematic differences in model outcomes across groups; arises from data, labels, and deployment context.
Synthetic Data: Artificially created data used to train or test models; helpful for privacy and coverage, risky if unrealistic.
Model Extraction: Reconstructing a model or its capabilities via API queries or leaked artifacts.
Backdoor Attack: Hidden behavior activated by specific triggers, causing targeted mispredictions or undesired outputs.
Secure Inference: Methods to protect the model and its data during inference (e.g., trusted execution environments) from operators or attackers.
Generative Models: Models that learn to generate samples resembling their training data.
Mode Collapse: The generator produces only a limited variety of outputs.
Diffusion Model: A generative model that learns to reverse a gradual noising process.
Voice Conversion: Changing speaker characteristics while preserving linguistic content.
Vocoder: Generates audio waveforms from spectrograms.
Specification Gaming: The model exploits poorly specified objectives.
Reward Hacking: Maximizing reward without fulfilling the real goal.
Alignment: Ensuring learned behavior matches the intended objective.
Maintaining alignment under new conditions.
Spurious Correlation: The model relies on irrelevant signals.
World Model: Modeling how the environment evolves in latent space.
Unequal performance across demographic groups.
Zero-Sum Game: Agents have directly opposing objectives.
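The generator-vs-discriminator setup from the GAN entry above can be sketched on a toy one-dimensional problem. Everything here is an illustrative assumption: the "data" is a Gaussian with mean 4, the generator is an affine map of noise, and the discriminator is a logistic scorer; real GANs use neural networks and may not converge this cleanly.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Generator: g(z) = a*z + b.  Discriminator: D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0
w, c = 0.1, 0.0
lr, target_mean = 0.01, 4.0

for step in range(3000):
    z = rng.normal(size=64)
    real = rng.normal(loc=target_mean, scale=1.0, size=64)
    fake = a * z + b

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = np.mean((d_real - 1.0) * real) + np.mean(d_fake * fake)
    grad_c = np.mean(d_real - 1.0) + np.mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator step: push D(fake) toward 1 (fool the discriminator),
    # using the non-saturating loss -log D(fake).
    d_fake = sigmoid(w * (a * z + b) + c)
    g_grad_fake = (d_fake - 1.0) * w
    a -= lr * np.mean(g_grad_fake * z)
    b -= lr * np.mean(g_grad_fake)

# With luck, the generator's output distribution drifts toward the data's.
samples = a * rng.normal(size=1000) + b
```

The alternating updates are the "adversarial" part: each network's gradient step is taken against the other's current parameters.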