Search: LM quality metric

F1 Score Intermediate

Harmonic mean of precision and recall; useful when balancing false positives/negatives matters.

Foundations & Theory
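
As a minimal sketch of the F1 definition above, the function below computes the harmonic mean of precision and recall from raw confusion counts. The function name and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not part of the glossary.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from confusion counts."""
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# 8 true positives, 2 false positives, 8 false negatives:
# precision = 0.8, recall = 0.5, and F1 falls between them, closer to the lower.
score = f1_score(tp=8, fp=2, fn=8)
```

Because the harmonic mean is pulled toward the smaller value, a model cannot reach a high F1 by maximizing only one of precision or recall.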

Data Governance Intermediate

Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.

Foundations & Theory

Log Loss Intermediate

Penalizes confident wrong predictions heavily; standard for classification and language modeling.

Optimization
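
To illustrate why log loss punishes confident mistakes, here is a small sketch of binary log loss (cross-entropy). The function name and the clipping constant `eps` are assumptions chosen for the example.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; p_pred holds predicted P(y = 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so log(0) is never evaluated.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# The true label is 1 in both cases; the more confident wrong answer costs more.
mildly_wrong = log_loss([1], [0.4])
confidently_wrong = log_loss([1], [0.01])
```

The penalty grows without bound as the probability assigned to the true class approaches zero, which is the "heavily penalized" behavior the definition refers to.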

Mean Squared Error Intermediate

Average of squared residuals; common regression objective.

Optimization
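
A minimal sketch of mean squared error as defined above; the function name and the length check are illustrative assumptions.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Residuals are 0, 0, and -2, so the squared errors are 0, 0, and 4.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Squaring means a single large residual can dominate the average, which is worth keeping in mind when the data contains outliers.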

Dataset Intermediate

A structured collection of examples used to train/evaluate models; quality, bias, and coverage often dominate outcomes.

Machine Learning

Data Labeling Intermediate

Human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.

Foundations & Theory

Chunking Intermediate

Breaking documents into pieces for retrieval; chunk size/overlap strongly affect RAG quality.

Foundations & Theory
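
The chunk size and overlap mentioned in the Chunking entry can be sketched with a simple character-based splitter. This is one possible approach under stated assumptions: real pipelines often split on tokens, sentences, or document structure instead, and the function name and default values here are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

Larger overlap reduces the chance that a relevant passage is cut at a chunk boundary, at the cost of more chunks to embed and store.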

Reward Model Intermediate

Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.

Foundations & Theory

Monitoring Intermediate

Observing model inputs/outputs, latency, cost, and quality over time to catch regressions and drift.

MLOps & Infrastructure

Feedback Loop Collapse Intermediate

A model trained repeatedly on its own outputs (or on data contaminated by them) tends to lose quality and diversity over time.

Model Failure Modes

Cross-Validation Intermediate

A robust evaluation technique that trains and evaluates across multiple splits to estimate performance and its variability.

Foundations & Theory
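
The "multiple splits" in the Cross-Validation entry can be sketched as k-fold index generation. The sketch assumes contiguous, unshuffled folds for clarity; in practice the data is usually shuffled (and often stratified) first. The function name is hypothetical.

```python
def k_fold_indices(n_samples: int, k: int) -> list:
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(n_samples=5, k=2)
```

Training and scoring a model once per fold yields k scores; their mean estimates performance and their spread indicates variability.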

Accuracy Intermediate

Fraction of correct predictions; can be misleading on imbalanced datasets.

Foundations & Theory
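
To show how accuracy can mislead on imbalanced data, the sketch below scores a trivial classifier that always predicts the majority class. The dataset is synthetic and made up purely for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic, illustrative data: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
always_negative = [0] * 100  # a "model" that never predicts the positive class

acc = accuracy(y_true, always_negative)
positives_found = sum(1 for t, p in zip(y_true, always_negative) if t == 1 and p == 1)
```

Here accuracy is 95% even though the classifier finds none of the positive cases, which is why precision, recall, or F1 are usually reported alongside it.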

Precision Intermediate

Of predicted positives, the fraction that are truly positive; sensitive to false positives.

Foundations & Theory

Recall Intermediate

Of actual positives, the fraction correctly identified; sensitive to false negatives.

Foundations & Theory

Specificity Intermediate

Of actual negatives, the fraction correctly identified as negative.

Foundations & Theory
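
Precision, recall, and specificity all derive from the same four confusion-matrix counts. The following sketch computes them for binary labels (1 = positive, 0 = negative); the helper names and the choice to return 0.0 for an empty denominator are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    # Assumption: report 0.0 when the ratio is undefined.
    return numerator / denominator if denominator else 0.0


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)

precision = safe_ratio(tp, tp + fp)    # of predicted positives, how many are right
recall = safe_ratio(tp, tp + fn)       # of actual positives, how many were found
specificity = safe_ratio(tn, tn + fp)  # of actual negatives, how many were found
```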

AUC Intermediate

Area under the ROC curve; a scalar summary that measures ranking ability, not calibration.

Foundations & Theory
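
One way to see that AUC measures ranking rather than calibration is its pairwise interpretation: the probability that a randomly chosen positive is scored above a randomly chosen negative. The sketch below uses that O(n²) formulation for clarity rather than speed; the function name is an assumption.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)

# Shrinking every score changes the probabilities but not their order,
# so the AUC is unchanged.
auc_rescaled = roc_auc(labels, [s / 10 for s in scores])
```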

Brier Score Intermediate

A proper scoring rule measuring squared error of predicted probabilities for binary outcomes.

Evaluation & Benchmarking
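
A minimal sketch of the Brier score for binary outcomes, following the definition above; the function name is an assumption.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


# Squared errors are (0.9 - 1)^2 = 0.01 and (0.2 - 0)^2 = 0.04.
score = brier_score([1, 0], [0.9, 0.2])
```

Lower is better: 0 corresponds to perfectly confident, correct probabilities, and always predicting 0.5 yields 0.25.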

Latency Intermediate

Time from request to response; critical for real-time inference and UX.

Foundations & Theory

Throughput Intermediate

How many requests or tokens can be processed per unit time; affects scalability and cost.

Transformers & LLMs

Benchmark Intermediate

A dataset + metric suite for comparing models; can be gamed or misaligned with real-world goals.

Evaluation & Benchmarking

Information Gain Intermediate

Reduction in uncertainty achieved by observing a variable; used in decision trees and active learning.

AI Economics & Strategy
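
As used in decision trees, information gain is the parent node's entropy minus the size-weighted entropy of the child nodes after a split. The sketch below assumes Shannon entropy in bits; the function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of its child splits."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = [1, 1, 0, 0]
perfect_split = information_gain(parent, [[1, 1], [0, 0]])
useless_split = information_gain(parent, [[1, 0], [1, 0]])
```

A split that separates the classes completely removes all uncertainty, while one that leaves each child with the parent's class mix removes none.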

Sensitivity Intermediate

Fraction of patients who have the disease that a test or model correctly detects; equivalent to recall.

AI in Healthcare

Norm Advanced

Measure of vector magnitude; used in regularization and optimization.

Mathematics
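
Three commonly used vector norms can be sketched in a few lines. The L1 and L2 norms are the ones that typically appear in regularization penalties; the function names are assumptions.

```python
import math


def l1_norm(v):
    """Sum of absolute values (Manhattan norm)."""
    return sum(abs(x) for x in v)


def l2_norm(v):
    """Square root of the sum of squares (Euclidean norm)."""
    return math.sqrt(sum(x * x for x in v))


def linf_norm(v):
    """Largest absolute component (maximum norm)."""
    return max(abs(x) for x in v)


v = [3.0, -4.0]
norms = (l1_norm(v), l2_norm(v), linf_norm(v))
```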

False Negative Intermediate

A case in which disease is present but the test or model fails to detect it.

AI in Healthcare

Feature Engineering Intermediate

Designing input features to expose useful structure (e.g., ratios, lags, aggregations), often crucial outside deep learning.

Foundations & Theory

Prompt Intermediate

The text (and possibly other modalities) given to an LLM to condition its output behavior.

Prompting & Instructions

RAG Intermediate

Architecture that retrieves relevant documents (e.g., from a vector DB) and conditions generation on them to reduce hallucinations.

Foundations & Theory

Inter-Annotator Agreement Intermediate

Measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.

Foundations & Theory
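
Inter-annotator agreement can be quantified in several ways; the glossary does not name a specific statistic. As one example, the sketch below computes Cohen's kappa for two labelers, which corrects raw agreement for the agreement expected by chance. The function name and the handling of the degenerate case are assumptions.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(
        counts_a[label] * counts_b[label] for label in set(labels_a) | set(labels_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # assumption: both annotators used a single identical label
    return (observed - expected) / (1 - expected)


annotator_a = [1, 1, 0, 0]
annotator_b = [1, 0, 0, 0]
kappa = cohens_kappa(annotator_a, annotator_b)
```

In this example the annotators agree on 3 of 4 items, but half of that agreement is expected by chance, so kappa is lower than the raw agreement rate.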

Datasheet for Datasets Intermediate

Structured dataset documentation covering collection, composition, recommended uses, biases, and maintenance.

Foundations & Theory

Data Lineage Intermediate

Tracking where data came from and how it was transformed; key for debugging and compliance.

Foundations & Theory
