Results for "probability correctness"
Brier score: A proper scoring rule measuring the squared error of predicted probabilities against binary outcomes.
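As a concrete sketch, the Brier score for binary outcomes is simply the mean squared error between predicted probabilities and the observed 0/1 labels:

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.
    0 is a perfect forecast; always predicting 0.5 scores 0.25."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

brier_score([0.9, 0.2, 0.7], [1, 0, 1])  # ≈ 0.047
```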
Dropout: Randomly zeroing activations during training to reduce co-adaptation and overfitting.
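A minimal sketch of inverted dropout (the common variant, which rescales surviving activations so their expected value is unchanged at test time):

```python
import random

def dropout(activations, p=0.5, training=True):
    """Zero each activation with probability p; scale survivors by 1/(1-p)
    so the expected activation is unchanged (inverted dropout)."""
    if not training:
        return list(activations)
    return [0.0 if random.random() < p else a / (1 - p) for a in activations]
```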
Next-token prediction: Training objective where the model predicts the next token given the previous tokens (causal language modeling).
Masked language modeling: Predicts masked tokens in a sequence, enabling bidirectional context; often used for embeddings rather than generation.
Hallucination: Model-generated content that is fluent but unsupported by evidence or incorrect; mitigated by grounding and verification.
Data augmentation: Expanding training data via transformations (flips, noise, paraphrases) to improve robustness.
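Two of the transformations mentioned above, sketched for a 2-D image stored as a list of rows (illustrative helpers, not a specific library's API):

```python
import random

def flip_horizontal(img):
    """Mirror a 2-D image (list of rows) left-to-right."""
    return [row[::-1] for row in img]

def add_noise(img, scale=0.1):
    """Add uniform noise in [-scale, scale] to every pixel."""
    return [[px + random.uniform(-scale, scale) for px in row] for row in img]
```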
Synthetic data: Artificially created data used to train or test models; helpful for privacy and coverage, risky if unrealistic.
Temperature: Scales logits before sampling; higher values increase randomness and diversity, lower values increase determinism.
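A self-contained sketch of temperature scaling: divide the logits by the temperature, then apply a (numerically stabilized) softmax. Lower temperature sharpens the distribution; higher temperature flattens it:

```python
import math

def sample_distribution(logits, temperature=1.0):
    """Softmax over temperature-scaled logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]
```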
Logits: Raw model outputs before conversion to probabilities; manipulated during decoding and calibration.
Softmax: Converts logits to probabilities by exponentiation and normalization; common in classification and language models.
Perplexity: Exponential of the average negative log-likelihood; lower means a better predictive fit, though not necessarily better downstream utility.
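The definition translates directly into code. For intuition: a model that assigns probability 1/4 to every observed token has perplexity 4, as if choosing uniformly among four options:

```python
import math

def perplexity(token_probs):
    """exp of the average negative log-likelihood of the observed tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

perplexity([0.25] * 8)  # uniform 1/4 on every token -> perplexity 4
```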
Policy: Strategy mapping states to actions.
Learning theory: A theoretical framework analyzing which classes of functions can be learned, how efficiently, and with what guarantees.
Bellman equation: Fundamental recursive relationship defining optimal value functions.
Markov decision process (MDP): Formal framework for sequential decision-making under uncertainty.
Hidden Markov model: Probabilistic model for sequential data with latent states.
Q-value: Expected return of taking an action in a state and following the policy thereafter.
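These ideas meet in tabular Q-learning: each update moves Q(s, a) toward the Bellman target r + γ·max_a′ Q(s′, a′). A toy sketch with made-up state and action names:

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step toward the Bellman target
    r + gamma * max_a' Q(s', a')."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]

# Illustrative two-state table.
Q = {"s0": {"go": 0.0}, "s1": {"go": 1.0}}
q_learning_update(Q, "s0", "go", 1.0, "s1")  # target = 1 + 0.9*1 = 1.9
```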
Generative models: Models that learn to generate samples resembling the training data.
Diffusion model: Generative model that learns to reverse a gradual noising process.
Score-based model: Learns the score function (∇ log p(x)) for generative sampling.
Variational autoencoder (VAE): Autoencoder using probabilistic latent variables and KL regularization.
SLAM: Simultaneous Localization and Mapping for robotics.
Forecasting: Predicting future values from past observations.
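The simplest useful baseline for this: predict the next value as the mean of a trailing window. A sketch, not a full forecasting model:

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    tail = series[-window:]
    return sum(tail) / len(tail)

moving_average_forecast([1, 2, 3, 4, 5, 6], window=3)  # (4+5+6)/3 = 5.0
```

Baselines like this are worth keeping around: a fancier model that cannot beat the moving average is not earning its complexity.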
Change-point detection: Identifying abrupt changes in the data-generating process.
Variance: Measure of spread around the mean.
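In code, the (population) variance is the mean squared deviation from the mean:

```python
def variance(xs):
    """Population variance: mean squared deviation from the mean."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

variance([1, 2, 3, 4])  # mean 2.5, variance 1.25
```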
Marginalization: Eliminating variables by summing or integrating over them.
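For a discrete joint distribution this is just a sum: P(X=x) = Σ_y P(X=x, Y=y). A sketch with a made-up weather joint:

```python
# Illustrative joint distribution P(X, Y).
joint = {
    ("rain", "umbrella"): 0.3,
    ("rain", "no_umbrella"): 0.1,
    ("sun", "umbrella"): 0.1,
    ("sun", "no_umbrella"): 0.5,
}

def marginal_x(joint):
    """Marginalize out Y: P(X=x) = sum over y of P(X=x, Y=y)."""
    out = {}
    for (x, _y), p in joint.items():
        out[x] = out.get(x, 0.0) + p
    return out
```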
Stochastic optimization: Optimization under uncertainty, using randomness in the objective or the algorithm (e.g., SGD).
Role prompting: Assigning a role or identity to the model.
Self-consistency: Sampling multiple outputs and selecting the consensus answer.
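The aggregation step reduces to a majority vote over independently sampled answers (the sampling itself is assumed to happen elsewhere):

```python
from collections import Counter

def self_consistent_answer(samples):
    """Majority vote over independently sampled model answers."""
    return Counter(samples).most_common(1)[0][0]

self_consistent_answer(["42", "42", "17"])  # -> "42"
```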
Risk register: Central log of AI-related risks.