Results for "probabilistic accuracy"
Software pipeline converting raw sensor data into structured representations.
External sensing of surroundings (vision, audio, lidar).
Sampling-based motion planner.
Modeling environment evolution in latent space.
Understanding that objects continue to exist when unseen.
Human-like understanding of physical behavior.
Inferring human goals from behavior.
Controlling robots via language.
Systems where failure can cause physical harm.
Predicting case success probabilities.
AI proposing scientific hypotheses.
The field of building systems that perform tasks associated with human intelligence—perception, reasoning, language, planning, and decision-making—via algori...
Risk threatening humanity’s survival.
Inferring the agent’s internal state from noisy sensor data.
The relationship between inputs and outputs changes over time, requiring monitoring and model updates.
How well a model performs on new data drawn from the same (or a similar) distribution as the training data.
When information from evaluation data improperly influences training, inflating reported performance.
Often more informative than ROC on imbalanced datasets; focuses on positive class performance.
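To make the focus on the positive class concrete, here is a minimal sketch that computes precision and recall at a single score threshold; sweeping the threshold would trace out the curve. The function name `precision_recall_at` and the convention of returning precision 1.0 when nothing is predicted positive are assumptions made for illustration, not something taken from the entry.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall for the positive class (label 1) at one threshold.

    Hypothetical helper: a prediction is positive when score >= threshold.
    """
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
    # Assumed convention: precision is 1.0 when no positives are predicted.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    labels = [1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.1]
    print(precision_recall_at(labels, scores, 0.5))
```

Note that true negatives never enter either quantity, which is why a large majority of easy negatives does not inflate these numbers.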
Average of squared residuals; common regression objective.
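As a small illustration, the average of squared residuals (commonly called mean squared error) can be written in a few lines. The function name `mean_squared_error` is chosen here for the sketch and uses only the standard library.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Residual = target minus prediction; square each one, then average.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because residuals are squared, a single large error dominates the average, which is worth keeping in mind when outliers are present.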
Halting training when validation performance stops improving to reduce overfitting.
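One common way to decide that validation performance has "stopped improving" is a patience counter. The sketch below assumes a precomputed list of per-epoch validation losses; the names `early_stopping_epoch`, `patience`, and `min_delta` are hypothetical and chosen for illustration.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the epoch index at which training would halt.

    Training halts once the validation loss has failed to improve on the
    best value seen so far by more than min_delta for `patience` epochs
    in a row. If that never happens, the last epoch index is returned.
    """
    if not val_losses:
        raise ValueError("val_losses must be non-empty")
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


if __name__ == "__main__":
    print(early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.9], patience=2))
```

In practice the weights from the best epoch, not the halting epoch, are usually the ones kept.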
Stepwise reasoning patterns that can improve multi-step tasks; often handled implicitly or summarized for safety/privacy.
Architecture that retrieves relevant documents (e.g., from a vector DB) and conditions generation on them to reduce hallucinations.
Breaking documents into pieces for retrieval; chunk size/overlap strongly affect RAG quality.
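The roles of chunk size and overlap are easiest to see in a character-based splitter. This is a minimal sketch under stated assumptions: it splits on raw character offsets (real pipelines often split on tokens or sentences), and the name `chunk_text` and its parameters are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap.

    Consecutive chunks share `overlap` characters, so content that
    straddles a boundary appears intact in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk reaches the end, so no trailing chunk
        # is emitted that is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

Larger overlap reduces the chance of cutting a relevant passage in half, at the cost of more chunks to embed and store.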
Feature attribution method grounded in cooperative game theory for explaining predictions in tabular settings.
Human or automated process of assigning targets; quality, consistency, and guidelines matter heavily.
When some classes are rare, requiring reweighting, resampling, or specialized metrics.
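Reweighting can be sketched with a common inverse-frequency heuristic, where each class gets weight n_samples / (n_classes * class_count). The function name `inverse_frequency_weights` is hypothetical, and this is only one of several reasonable weighting schemes.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Map each class to a weight inversely proportional to its frequency.

    With this heuristic, a perfectly balanced dataset yields weight 1.0
    for every class, and rarer classes receive larger weights.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


if __name__ == "__main__":
    print(inverse_frequency_weights([0] * 8 + [1] * 2))
```

Weights like these are typically passed to a loss function so that errors on rare classes cost more than errors on common ones.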
Automated testing and deployment processes for models and data workflows, extending DevOps to ML artifacts.
Systematic review of model/data processes to ensure performance, fairness, security, and policy compliance.
A dataset + metric suite for comparing models; can be gamed or misaligned with real-world goals.
Maliciously inserting or altering training data to implant backdoors or degrade performance.