Results for "overall correctness"
Constraining outputs to retrieved or provided sources, often with citation, to improve factual reliability.
Coordinating tools, models, and steps (retrieval, calls, validation) to deliver reliable end-to-end behavior.
Predicted probabilities do not match the observed frequency of correct outcomes.
Mathematical guarantees of system behavior.
Scalar summary of the ROC curve (the area under it); measures ranking ability, not calibration.
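To illustrate the "ranking ability, not calibration" point, here is a minimal sketch. It assumes the pairwise-comparison definition of the area under the ROC curve; the helper `auc` and the toy labels and scores are hypothetical, written for illustration only.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. This pairwise form equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = positive class, 0 = negative class.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]

# A monotone distortion keeps the ordering but changes the probability values.
squashed = [s ** 4 for s in scores]

auc_original = auc(labels, scores)
auc_squashed = auc(labels, squashed)
```

Both score lists give the same value because only the ordering matters, even though the distorted scores would be badly calibrated if read as probabilities.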
Architecture that retrieves relevant documents (e.g., from a vector DB) and conditions generation on them to reduce hallucinations.
Breaking documents into pieces for retrieval; chunk size/overlap strongly affect RAG quality.
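A minimal sketch of fixed-size chunking with overlap, assuming character-based windows; real pipelines often split on tokens or sentences instead. The function name `chunk_text` and its default values are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end of the text, so no redundant tail is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks

example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

Larger overlap reduces the chance that a relevant passage is cut at a chunk boundary, at the cost of more chunks to index and retrieve.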
Techniques to understand model decisions (global or local), important in high-stakes and regulated settings.
Human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.
Tracking where data came from and how it was transformed; key for debugging and compliance.
Structured dataset documentation covering collection, composition, recommended uses, biases, and maintenance.
Stochastic generation strategies that trade determinism for diversity; key knobs include temperature and nucleus sampling.
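A minimal sketch of nucleus (top-p) sampling over an already-normalized probability list, using only the standard library. The helper names `top_p_filter` and `sample_top_p`, and the example distribution, are hypothetical.

```python
import random

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of highest-probability tokens whose mass reaches top_p.

    Returns a dict mapping token index to its renormalized probability.
    """
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {idx: p / total for idx, p in kept}

def sample_top_p(probs, top_p=0.9, rng=random):
    """Draw one token index from the nucleus."""
    nucleus = top_p_filter(probs, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]

# Hypothetical next-token distribution over four tokens.
probs = [0.5, 0.3, 0.15, 0.05]
nucleus = top_p_filter(probs, top_p=0.7)
token = sample_top_p(probs, top_p=0.7, rng=random.Random(0))
```

Lowering `top_p` shrinks the candidate set toward the single most likely token, which makes the output more deterministic.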
Divides logits by a scalar before the softmax used for sampling; higher values increase randomness/diversity, lower values increase determinism.
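A minimal sketch of the effect, assuming the common convention of dividing logits by the temperature before applying softmax. The function name `softmax_with_temperature` and the example logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)    # more peaked
neutral = softmax_with_temperature(logits, temperature=1.0)
flat = softmax_with_temperature(logits, temperature=2.0)     # closer to uniform
```

The top token's probability shrinks as the temperature rises, so sampling from `flat` yields more varied outputs than sampling from `sharp`.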
Controls the amount of noise added at each diffusion step.
Maps audio signals to linguistic units.
A point whose value is lowest among nearby points, though not necessarily lowest overall.
Detecting and avoiding obstacles.
Supplying buy/sell orders.
Designing systems where rational agents behave as desired.
Goals useful regardless of final objective.
Accelerating safety relative to capabilities.
A trend seen within each subgroup reverses when the data are aggregated without accounting for the grouping variable.
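A small worked example of such a reversal. The counts below are made up purely for illustration, and the names `counts`, `rate`, and `overall_rate` are hypothetical: option A has the higher success rate in every subgroup, yet the lower rate once the subgroups are pooled, because A was mostly applied to the hard cases.

```python
# (successes, trials) per option and subgroup; illustrative numbers, not real data.
counts = {
    "A": {"easy": (9, 10), "hard": (30, 100)},
    "B": {"easy": (80, 100), "hard": (2, 10)},
}

def rate(successes, trials):
    return successes / trials

def overall_rate(option):
    """Success rate after pooling all subgroups for one option."""
    successes = sum(s for s, _ in counts[option].values())
    trials = sum(t for _, t in counts[option].values())
    return rate(successes, trials)

# Within each subgroup, A beats B.
a_wins_easy = rate(*counts["A"]["easy"]) > rate(*counts["B"]["easy"])
a_wins_hard = rate(*counts["A"]["hard"]) > rate(*counts["B"]["hard"])

# After pooling, B beats A.
b_wins_overall = overall_rate("B") > overall_rate("A")
```

Comparing rates within each subgroup, or weighting the subgroups equally for both options, avoids the misleading pooled comparison.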