Results for "energy function"
Optimization using curvature information; often expensive at scale.
Early architecture using learned gates for skip connections.
Routes inputs to subsets of parameters for scalable capacity.
Empirical laws linking model size, data, and compute to performance.
Chooses which experts process each token.
Fundamental recursive relationship defining optimal value functions.
Embedding signals to prove model ownership.
Detecting unauthorized model outputs or data leaks.
Neural networks that operate on graph-structured data by propagating information along edges.
GNN framework where nodes iteratively exchange and aggregate messages from neighbors.
Graphical model expressing factorization of a probability distribution.
Diffusion model trained to remove noise step by step.
Generative model that learns to reverse a gradual noise process.
Controls amount of noise added at each diffusion step.
Two-network setup in which a generator is trained to fool a discriminator.
Failure mode in which the generator produces only a limited variety of outputs.
Assigning category labels to images.
Pixel-wise classification of image regions.
Joint vision-language model aligning images and text.
Persistent directional movement over time.
Running predictions on large datasets periodically.
Low-latency prediction per request.
Incrementally deploying new models to reduce risk.
Maintaining two environments for instant rollback.
Centralized repository for curated features.
Scaling law for trading off model size against training data under a fixed compute budget.
Using production outcomes to improve models.
Sensitivity of a function to input perturbations.
Measure of vector magnitude; used in regularization and optimization.
Updated belief after observing data.
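
The last entry (updated belief after observing data) can be illustrated with a minimal sketch. It assumes a Beta prior over a success probability and Bernoulli observations, where the update reduces to adding counts. The function names and the observed counts are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, failures):
    """Return posterior Beta parameters after observing Bernoulli outcomes.

    Assumes a Beta(alpha, beta) prior; conjugacy means the posterior is
    Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact with Fraction."""
    return Fraction(alpha, alpha + beta)


# Hypothetical data: uniform prior Beta(1, 1), then 7 successes and 3 failures.
prior = (1, 1)
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print(posterior)              # (8, 4)
print(beta_mean(*prior))      # 1/2
print(beta_mean(*posterior))  # 2/3
```

The prior mean of 1/2 moves to 2/3 once the data are observed, which is the sense in which the belief is updated.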