Results for "data distribution"
Weight initialization: methods to set starting weights to preserve signal/gradient scales across layers (e.g., Xavier/Glorot, He).
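As a minimal sketch of the scale-preserving idea, the two common variance rules can be written directly; the function names here are illustrative, not from any particular library:

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng=None):
    # Xavier/Glorot: variance 2/(fan_in + fan_out) keeps forward and
    # backward signal scales roughly constant across layers
    rng = rng or np.random.default_rng(0)
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_init(fan_in, fan_out, rng=None):
    # He: variance 2/fan_in, suited to ReLU layers where half the
    # pre-activations are zeroed
    rng = rng or np.random.default_rng(0)
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))
```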
SHAP (SHapley Additive exPlanations): feature attribution method grounded in cooperative game theory for explaining predictions in tabular settings.
Language model: a model that assigns probabilities to sequences of tokens; often trained by next-token prediction.
Class imbalance: when some classes are rare, requiring reweighting, resampling, or specialized metrics.
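One common reweighting heuristic weights each class inversely to its frequency; this sketch mirrors the "balanced" scheme used by several libraries, with a hypothetical helper name:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    # weight_c = N / (num_classes * count_c): rare classes get large
    # weights so the loss treats each class roughly equally
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}
```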
Top-k sampling: samples from the k highest-probability tokens to limit unlikely outputs.
Top-p (nucleus) sampling: samples from the smallest set of tokens whose probabilities sum to p, adapting set size by context.
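Both truncation schemes can be sketched as filters over a token distribution before sampling; the function names are illustrative:

```python
import numpy as np

def top_k_filter(probs, k):
    # keep only the k most probable tokens, then renormalize
    idx = np.argsort(probs)[::-1][:k]
    out = np.zeros_like(probs)
    out[idx] = probs[idx]
    return out / out.sum()

def top_p_filter(probs, p):
    # keep the smallest prefix of the sorted distribution whose
    # cumulative mass reaches p ("nucleus"), then renormalize
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1
    out = np.zeros_like(probs)
    keep = order[:cutoff]
    out[keep] = probs[keep]
    return out / out.sum()
```

Note the contrast: top-k always keeps exactly k tokens, while top-p keeps two tokens for a peaked distribution and many for a flat one.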
Logits: raw model outputs before conversion to probabilities; manipulated during decoding and calibration.
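The standard logits-to-probabilities conversion is a softmax, often with a temperature knob used during decoding; a minimal numerically stable sketch:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # divide by temperature first: T < 1 sharpens, T > 1 flattens
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()
```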
Information gain: reduction in uncertainty (entropy) achieved by observing a variable; used in decision trees and active learning.
Policy: a strategy mapping states to actions.
Expectation: the average value of a random variable under a distribution.
Monte Carlo estimation: approximating expectations via random sampling.
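The two entries above combine into one line of code: draw samples, apply the function, average. A minimal sketch with hypothetical names:

```python
import random

def mc_expectation(f, sampler, n=100_000, seed=0):
    # E[f(X)] ~ (1/n) * sum of f(x_i) for x_i drawn from sampler;
    # the error shrinks like 1/sqrt(n)
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n
```

For example, estimating E[X^2] for X uniform on [0, 1] should land near the exact value 1/3.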
Role (persona) prompting: assigning a role or identity to the model.
Self-consistency: sampling multiple outputs and selecting the consensus answer.
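The aggregation step reduces to a majority vote over independently sampled answers; a toy sketch (the helper name is hypothetical, and the sampling itself is outside its scope):

```python
from collections import Counter

def self_consistency_vote(sampled_answers):
    # return the most frequent final answer among the samples
    return Counter(sampled_answers).most_common(1)[0][0]
```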
Value at Risk (VaR): maximum expected loss under normal conditions, at a given confidence level and horizon.
Pareto efficiency: a state in which no agent can improve without hurting another.
Market design: designing efficient marketplaces.
Data governance: processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Data lineage: tracking where data came from and how it was transformed; key for debugging and compliance.
Federated learning: training across many devices/silos without centralizing raw data; aggregates model updates, not data.
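The aggregation step is typically a size-weighted average of client parameters, as in federated averaging (FedAvg); a minimal sketch with illustrative names:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    # average client parameter vectors, weighting each client by its
    # local dataset size; only these updates leave the client
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
```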
Data scaling: increasing performance via more training data.
Encryption in transit and at rest: protecting data during network transfer and while stored; essential for ML pipelines handling sensitive data.
Data labeling: human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.
Data leakage: when information from evaluation data improperly influences training, inflating reported performance.
Data poisoning: maliciously inserting or altering training data to implant backdoors or degrade performance.
Data Protection Impact Assessment (DPIA): privacy risk analysis required under GDPR-like laws.
Unsupervised learning: learning structure from unlabeled data, such as discovering groups, compressing representations, or modeling data distributions.
Overfitting: when a model fits noise/idiosyncrasies of the training data and performs poorly on unseen data.
Personally identifiable information (PII): information that can identify an individual (directly or indirectly); requires careful handling and compliance.
Self-supervised learning: learning from data by constructing “pseudo-labels” (e.g., next-token prediction, masked modeling) without manual annotation.
Semi-supervised learning: training with a small labeled dataset plus a larger unlabeled dataset, leveraging assumptions like smoothness/cluster structure.