Results for "interpretability"

Interpretability

Intermediate

Studying internal mechanisms or input influence on outputs (e.g., saliency maps, SHAP, attention analysis).

Interpretability in AI is like being able to see the recipe behind a dish. It helps us understand how different ingredients (or features) affect the final outcome (or prediction) of a model. For example, if an AI predicts that someone will get a job, interpretability techniques can show which fac...


20 results

Interpretability Intermediate

Studying internal mechanisms or input influence on outputs (e.g., saliency maps, SHAP, attention analysis).

Foundations & Theory

Explainability Intermediate

Techniques to understand model decisions (global or local), important in high-stakes and regulated settings.

Foundations & Theory

Accuracy Intermediate

Fraction of correct predictions; can be misleading on imbalanced datasets.

Foundations & Theory
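
To illustrate why accuracy can mislead on imbalanced data, here is a minimal sketch. The dataset and the always-negative "model" are hypothetical, chosen only to make the effect visible, and recall is added as one possible complementary metric.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive examples that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        raise ValueError("no positive examples in y_true")
    hits = sum(p == positive for p in preds_for_positives)
    return hits / len(preds_for_positives)


# Hypothetical imbalanced dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

The always-negative predictor scores 95% accuracy while finding none of the positive cases, which is why a per-class metric is usually reported alongside accuracy.
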
Chain-of-Thought Intermediate

Stepwise reasoning patterns that can improve multi-step tasks; often handled implicitly or summarized for safety/privacy.

Foundations & Theory

SHAP Intermediate

Feature attribution method grounded in cooperative game theory (Shapley values) for explaining individual predictions, most often applied in tabular settings.

Foundations & Theory
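
The game-theoretic idea behind SHAP can be sketched by computing exact Shapley values by brute force for a tiny model. This is not the `shap` library's API; `shapley_values`, `toy_model`, and the all-zeros baseline are hypothetical names and choices made for illustration, and "absent" features are assumed to be replaced by their baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every coalition of features.

    Features outside a coalition take their baseline value (an assumption
    of this sketch). Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: two linear terms plus one interaction term.
    return 2.0 * z[0] + 1.0 * z[1] + z[0] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the model output at `x` and at the baseline, and the interaction term is split evenly between the two features involved.
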
Causal Inference Intermediate

Framework for reasoning about cause-effect relationships beyond correlation, often using structural assumptions and experiments.

Foundations & Theory

Backdoor / Trojan Intermediate

Hidden behavior activated by specific triggers, causing targeted mispredictions or undesired outputs.

Foundations & Theory

Mutual Information Intermediate

Quantifies shared information between random variables.

AI Economics & Strategy
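
For discrete variables, mutual information can be computed directly from a joint probability table as I(X;Y) = Σ p(x,y) · log2( p(x,y) / (p(x) p(y)) ). The sketch below assumes the joint distribution is given as a nested list; the function name and both example tables are hypothetical.

```python
from math import log2


def mutual_information(joint):
    """Mutual information in bits for a joint table joint[i][j] = p(X=i, Y=j)."""
    px = [sum(row) for row in joint]        # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]  # marginal of Y (column sums)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * log2(p / (px[i] * py[j]))
    return mi


# Two independent fair coins: knowing X says nothing about Y.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Y is an exact copy of a fair coin X: knowing X determines Y.
identical = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_identical = mutual_information(identical)
print(mi_independent, mi_identical)
```

Independent variables share 0 bits, while a perfect copy of a fair coin shares exactly 1 bit.
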
Human Oversight Intermediate

Required human review for high-risk decisions.

AI Economics & Strategy

Counterfactual Advanced

What would have happened under different conditions.

Causal AI & Interpretability

Do-Operator Advanced

Models effects of interventions (do(X=x)).

Causal AI & Interpretability

Propensity Score Advanced

Probability of treatment assignment given covariates.

Causal AI & Interpretability
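
When there is a single discrete covariate, a propensity score can be estimated as the treated fraction within each covariate value. The sketch below assumes that simple setting; the records and names are hypothetical, and with many or continuous covariates a fitted model such as logistic regression is commonly used instead.

```python
from collections import defaultdict


def propensity_scores(records):
    """Estimate P(treated | covariate) as the treated fraction per covariate value.

    records: iterable of (covariate_value, treated) pairs, treated being a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # covariate -> [n_treated, n_total]
    for covariate, treated in records:
        counts[covariate][0] += int(treated)
        counts[covariate][1] += 1
    return {c: n_treated / n_total for c, (n_treated, n_total) in counts.items()}


# Hypothetical observational data: (age group, received treatment?).
records = [
    ("young", True), ("young", True), ("young", True), ("young", False),
    ("old", True), ("old", False), ("old", False), ("old", False),
]

scores = propensity_scores(records)
print(scores)
```

Here the younger group is three times as likely to be treated, so a naive comparison of treated and untreated outcomes would be confounded by age.
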
Inner Alignment Advanced

Ensuring learned behavior matches intended objective.

AI Safety & Alignment

Spurious Correlation Intermediate

Model relies on signals that correlate with the target in the training data but are irrelevant to the task.

Model Failure Modes

Explainability Mandate Intermediate

Requirement to provide explanations.

Governance & Ethics

AI Hallucination Intermediate

Fabrication of cases or statutes by LLMs.

AI in Law

Model Disclosure Intermediate

Requirement to reveal AI usage in legal decisions.

AI in Law

Explainable Credit Model Intermediate

Credit models with interpretable logic.

AI Economics & Strategy

Alignment Research Intermediate

Research aimed at ensuring AI systems remain safe.

Governance & Ethics

Simpson’s Paradox Advanced

A trend that holds within each subgroup reverses or disappears when the subgroups are aggregated.

Causal AI & Interpretability
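
The reversal is easiest to see with numbers. The sketch below uses illustrative counts (successes, total) for a treatment and a control arm in two subgroups; the group names and helper functions are hypothetical.

```python
# Illustrative counts as (successes, total) per arm within each subgroup.
groups = {
    "mild":   {"treated": (81, 87),   "control": (234, 270)},
    "severe": {"treated": (192, 263), "control": (55, 80)},
}


def rate(successes, total):
    return successes / total


def pooled(arm):
    """Success rate for one arm after summing counts across all subgroups."""
    successes = sum(g[arm][0] for g in groups.values())
    total = sum(g[arm][1] for g in groups.values())
    return rate(successes, total)


# Within every subgroup, the treated arm has the higher success rate.
treated_wins_each_group = all(
    rate(*g["treated"]) > rate(*g["control"]) for g in groups.values()
)

# After pooling, the ordering flips.
pooled_treated = pooled("treated")
pooled_control = pooled("control")

print(treated_wins_each_group, round(pooled_treated, 3), round(pooled_control, 3))
```

The flip happens because the treated arm is concentrated in the harder "severe" subgroup, so the pooled rates mix the treatment effect with the difference in subgroup sizes.
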
