Results for "LM quality metric"
F1 score: Harmonic mean of precision and recall; useful when balancing false positives and false negatives matters.
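A minimal sketch of the F1 computation (function name is illustrative):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# The harmonic mean punishes imbalance: high precision cannot
# compensate for near-zero recall, and vice versa.
```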
Data governance: Processes and controls for data quality, access, lineage, retention, and compliance across the AI lifecycle.
Cross-entropy loss: Penalizes confident wrong predictions heavily; the standard objective for classification and language modeling.
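A toy illustration of why confident wrong predictions are penalized heavily — the loss is the negative log of the probability assigned to the true class:

```python
import math

def cross_entropy(p_true: float) -> float:
    """Negative log-likelihood of the probability assigned to the true class."""
    return -math.log(p_true)

# An unsure prediction (p = 0.5) costs ~0.69 nats; a confidently
# wrong one (p = 0.01 on the true class) costs ~4.6 nats.
```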
Mean squared error (MSE): Average of squared residuals; a common regression objective.
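The definition in code, as a direct sketch:

```python
def mse(y_true, y_pred):
    """Mean of squared residuals between targets and predictions."""
    residuals = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(residuals) / len(residuals)

# Squaring makes large errors dominate the average, so MSE is
# sensitive to outliers.
```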
Dataset: A structured collection of examples used to train and evaluate models; quality, bias, and coverage often dominate outcomes.
Data labeling: Human or automated process of assigning target values to examples; quality, consistency, and clear guidelines matter heavily.
Chunking: Breaking documents into pieces for retrieval; chunk size and overlap strongly affect RAG quality.
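A minimal character-based chunker showing how size and overlap interact (real pipelines usually split on tokens or sentence boundaries; this is an assumption-laden sketch):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50):
    """Split text into fixed-size character chunks; neighbors share `overlap` chars."""
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last window already reached the end
    return chunks
```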
Reward model: Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.
Model monitoring: Observing model inputs, outputs, latency, cost, and quality over time to catch regressions and drift.
Model collapse: Progressive quality degradation when a model is trained on its own (or other models') generated outputs rather than fresh human-produced data.
Cross-validation: A robust evaluation technique that trains and evaluates across multiple data splits to estimate performance variability.
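A sketch of k-fold index generation — each example appears in exactly one test fold (libraries like scikit-learn provide this; the function here is illustrative):

```python
def kfold_indices(n: int, k: int):
    """Return (train_idx, test_idx) pairs for k-fold cross-validation over n examples."""
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds
```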
Accuracy: Fraction of correct predictions; can be misleading on imbalanced datasets.
Precision: Of predicted positives, the fraction that are truly positive; sensitive to false positives.
Recall: Of actual positives, the fraction correctly identified; sensitive to false negatives.
Specificity: Of actual negatives, the fraction correctly identified.
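The four metrics above all derive from the confusion matrix; a sketch under the usual TP/FP/FN/TN counts:

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and specificity from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),    # of predicted positives
        "recall": tp / (tp + fn),       # of actual positives
        "specificity": tn / (tn + fp),  # of actual negatives
    }
```

Note how a 96%-accurate classifier can still miss 20% of positives — on imbalanced data, accuracy alone hides this.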
AUC-ROC: Scalar summary of the ROC curve; measures ranking ability, not calibration.
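AUC has an equivalent pairwise-ranking interpretation, which makes the "ranking, not calibration" point concrete — any monotone rescaling of the scores leaves it unchanged:

```python
def auc(scores_pos, scores_neg):
    """Probability a random positive outranks a random negative (ties count 0.5)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))
```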
Brier score: A proper scoring rule measuring the squared error of predicted probabilities against binary outcomes.
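The definition is short enough to state directly in code:

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# Lower is better; 0.0 means perfectly confident, perfectly correct forecasts.
```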
Latency: Time from request to response; critical for real-time inference and UX.
Throughput: How many requests or tokens can be processed per unit time; affects scalability and cost.
Benchmark: A dataset plus metric suite for comparing models; can be gamed or become misaligned with real-world goals.
Information gain: Reduction in uncertainty achieved by observing a variable; used in decision trees and active learning.
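Information gain is the parent's entropy minus the size-weighted entropy of the partitions a split produces; a sketch in bits:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy reduction from splitting `parent` into `children` partitions."""
    n = len(parent)
    weighted = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted
```

A split that perfectly separates the classes recovers the full parent entropy; a split that changes nothing gains zero.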
Sensitivity: The true positive rate; in medical testing, the ability to correctly detect disease when it is present (equivalent to recall).
Norm: Measure of vector magnitude (e.g., L1, L2); used in regularization and optimization.
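The common p-norms fit one formula; a small sketch:

```python
import math

def lp_norm(v, p=2):
    """p-norm of a vector: (sum of |v_i|^p) ** (1/p)."""
    return sum(abs(x) ** p for x in v) ** (1 / p)

# p=1 (L1) sums absolute values and encourages sparsity in regularization;
# p=2 (L2) is Euclidean length and penalizes large weights smoothly.
```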
False negative: A positive case the model misses; in medical testing, failure to detect disease that is present.
Feature engineering: Designing input features to expose useful structure (e.g., ratios, lags, aggregations); often crucial outside deep learning.
Prompt: The text (and possibly other modalities) given to an LLM to condition its output behavior.
Retrieval-augmented generation (RAG): Architecture that retrieves relevant documents (e.g., from a vector DB) and conditions generation on them to reduce hallucinations.
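The retrieval step reduces to nearest-neighbor search over embeddings; a toy sketch assuming the embeddings already exist (real systems use an embedding model plus a vector database):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs, k=2):
    """Indices of the k documents most similar to the query embedding."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The retrieved documents are then placed into the prompt so generation is conditioned on them.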
Inter-annotator agreement: Measure of consistency across labelers; low agreement indicates ambiguous tasks or poor guidelines.
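One standard agreement statistic is Cohen's kappa, which corrects raw agreement for the agreement expected by chance; a sketch for two annotators:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's marginal label frequencies.
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Kappa is 1.0 for perfect agreement and 0.0 when annotators agree no more than chance would predict.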
Datasheet (for datasets): Structured dataset documentation covering collection, composition, recommended uses, biases, and maintenance.
Data lineage: Tracking where data came from and how it was transformed; key for debugging and compliance.