Results for "matrix factorization"
Graphical model expressing factorization of a probability distribution.
A table summarizing classification outcomes, foundational for metrics like precision, recall, and specificity.
Matrix of second derivatives describing the local curvature of the loss.
Decomposes a matrix into orthogonal components; used in embeddings and compression.
Number of linearly independent rows or columns.
Matrix of first-order derivatives for vector-valued functions.
Optimization using curvature information; often expensive at scale.
Optimal estimator for linear dynamic systems with Gaussian noise.
Sensitivity of a function to input perturbations.
Matrix of curvature information.
Optimal control for linear systems with quadratic cost.
A narrow minimum often associated with poorer generalization.
Mathematical foundation for ML involving vector spaces, matrices, and linear transformations.
Vector whose direction remains unchanged under linear transformation.
Of predicted positives, the fraction that are truly positive; sensitive to false positives.
Of actual positives, the fraction correctly identified; sensitive to false negatives.
Of actual negatives, the fraction correctly identified.
Attention where queries/keys/values come from the same sequence, enabling token-to-token interactions.
A broader capability to infer internal system state from telemetry, crucial for AI services and agents.
Raw model outputs before converting to probabilities; manipulated during decoding and calibration.
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
A point where the gradient is zero but that is neither a maximum nor a minimum; common in deep nets.
A wide basin often correlated with better generalization.
A narrow hidden layer forcing compact representations.
A single attention mechanism within multi-head attention.
Techniques to handle longer documents without quadratic cost.
Attention mechanisms that reduce quadratic complexity.
Neural networks that operate on graph-structured data by propagating information along edges.
Graphs containing multiple node or edge types with different semantics.
Flat high-dimensional regions slowing training.
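
The three classification-metric entries above (fraction of predicted positives that are correct, fraction of actual positives identified, fraction of actual negatives identified) can be sketched from the four cells of a binary classification-outcome table. This is a minimal illustration, not a reference implementation: the function name `classification_metrics`, the example counts, and the convention of returning 0.0 when a denominator is zero are all assumptions made for the sketch.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, and specificity from outcome counts.

    tp, fp, fn, tn are the true-positive, false-positive, false-negative,
    and true-negative counts. Returning 0.0 for an empty denominator is an
    assumed convention, not a standard.
    """
    # Precision: of predicted positives, the fraction that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of actual positives, the fraction correctly identified.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: of actual negatives, the fraction correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {"precision": precision, "recall": recall, "specificity": specificity}


# Hypothetical counts chosen only for illustration.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=6)
print(metrics)
```

Raising `fp` lowers precision and specificity but leaves recall unchanged, while raising `fn` lowers only recall, which matches the sensitivities noted in the entries.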