Results for "complexity"
Rademacher Complexity
Intermediate
Measures a model’s ability to fit random noise; used to bound generalization error.
Rademacher Complexity is a way to measure how well a learning model can adapt to random patterns in data. Imagine you have a set of points and you randomly assign labels to them, like flipping a coin for each point. Rademacher Complexity helps us understand how well a model can fit those random l...
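The coin-flip picture above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it estimates the empirical Rademacher complexity, i.e. the average over random ±1 labelings of the best correlation any hypothesis in the class achieves with those labels. The function names, the sample points, and the two toy hypothesis classes (1D thresholds and "every possible labeling") are assumptions chosen for the example.

```python
import random


def empirical_rademacher(predictions, n_draws=2000, seed=0):
    """Monte Carlo estimate of empirical Rademacher complexity.

    predictions: one list per hypothesis, holding that hypothesis's
    outputs in {-1, +1} on the same n sample points.
    """
    rng = random.Random(seed)
    n = len(predictions[0])
    total = 0.0
    for _ in range(n_draws):
        # Flip a fair coin for each point to get random labels.
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        # Best average agreement any hypothesis reaches with the noise.
        total += max(
            sum(s * p for s, p in zip(sigma, preds)) / n
            for preds in predictions
        )
    return total / n_draws


def threshold_class(points):
    """Hypothetical simple class: h_t(x) = +1 if x >= t else -1."""
    thresholds = sorted(points) + [max(points) + 1.0]
    return [[1 if x >= t else -1 for x in points] for t in thresholds]


def all_labelings(n):
    """Hypothetical maximally rich class: every +/-1 labeling of n points."""
    return [
        [1 if (mask >> i) & 1 else -1 for i in range(n)]
        for mask in range(2 ** n)
    ]


# Illustrative sample points (assumed, not from any dataset).
points = [0.1, 0.4, 0.5, 0.7, 0.9, 1.3, 1.8, 2.0]
simple = empirical_rademacher(threshold_class(points))
rich = empirical_rademacher(all_labelings(len(points)))
print(f"thresholds: {simple:.3f}  all labelings: {rich:.3f}")
```

The class containing every labeling can match any coin flips perfectly, so its estimate is 1, while the threshold class scores lower because most random labelings are out of its reach. A lower value means less capacity to fit noise, which is what generalization bounds based on this quantity exploit.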
Attention mechanisms that reduce quadratic complexity.
When a model fits noise/idiosyncrasies of training data and performs poorly on unseen data.
A measure of a model class’s expressive capacity based on its ability to shatter datasets.
A theoretical framework analyzing what classes of functions can be learned, how efficiently, and with what guarantees.
When a model cannot capture underlying structure, performing poorly on both training and test data.
A datastore optimized for similarity search over embeddings, enabling semantic retrieval at scale.
Hardware resources used for training/inference; constrained by memory bandwidth, FLOPs, and parallelism.
A concept class is PAC-learnable if an algorithm can, with high probability, learn an approximately correct hypothesis from a finite number of samples.
Error due to sensitivity to fluctuations in the training dataset.
Optimization with multiple local minima/saddle points; typical in neural networks.
A wide basin in the loss landscape, often correlated with better generalization.
Matrix of second derivatives describing local curvature of loss.
The range of functions a model can represent.
Tradeoffs between many layers vs many neurons per layer.
Stores past attention states to speed up autoregressive decoding.
Techniques to handle longer documents without quadratic cost.
All possible configurations an agent may encounter.
Set of all actions available to the agent.
Decomposing goals into sub-tasks.
Simple agent responding directly to inputs.
Cost to run models in production.
Cost of model training.
Model optimizes objectives misaligned with human values.
Measure of vector magnitude; used in regularization and optimization.
Classifying models by impact level.
Maximum system processing rate.
Mathematical guarantees of system behavior.
AI-assisted review of legal documents.
Requirement to reveal AI usage in legal decisions.