Results for "objective design"
Rules and controls around generation (filters, validators, structured outputs) to reduce unsafe or invalid behavior.
Optimization problems where any local minimum is also a global minimum.
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
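A minimal sketch of the unstructured case, assuming magnitude-based pruning (one common criterion, not the only one). The function name `magnitude_prune` and the `sparsity` parameter are hypothetical, chosen for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)  # number of weights to remove
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.2]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
```

Structured pruning would instead remove whole rows, channels, or neurons, so that the remaining tensor is smaller rather than merely sparse.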
Optimization problems that can have multiple local minima and saddle points; typical of neural network training.
Optimizing policies directly via gradient ascent on expected reward.
Diffusion model trained to remove noise step by step.
Generative model that learns to reverse a gradual noise process.
Model that compresses input into latent space and reconstructs it.
Autoencoder using probabilistic latent variables and KL regularization.
Average value under a distribution.
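For a discrete distribution this is the probability-weighted sum of outcomes. A small sketch, assuming a fair six-sided die as the example distribution; the helper name `expectation` is hypothetical.

```python
from fractions import Fraction


def expectation(outcomes, probs):
    """Probability-weighted average of `outcomes` under a discrete distribution."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(outcomes, probs))


die = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6  # fair die: each face equally likely
expected = expectation(die, probs)
print(float(expected))  # 3.5
```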
Two-network setup in which a generator is trained to fool a discriminator that tries to tell generated samples from real ones.
Restricting updates to safe regions.
Optimization under equality/inequality constraints.
Ensuring that learned behavior matches the intended objective.
Learned subsystem that optimizes its own objective.
Optimizes future actions using a model of dynamics.
Optimal control for linear systems with quadratic cost.
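As a rough illustration, the sketch below handles the simplest scalar, discrete-time, infinite-horizon case, assuming dynamics x' = a*x + b*u and per-step cost q*x^2 + r*u^2. The function name `scalar_lqr_gain` and the numeric values are hypothetical choices for the example.

```python
def scalar_lqr_gain(a, b, q, r, iters=200):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point.

    Returns the feedback gain k (control law u = -k * x) and the cost weight p.
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p


# Open-loop unstable system (|a| > 1) with equal state and control cost weights.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
k, p = scalar_lqr_gain(a, b, q, r)
closed_loop = a - b * k  # closed-loop dynamics: x' = (a - b*k) * x
print(k, closed_loop)
```

With these values the open-loop system diverges, while the closed-loop coefficient has magnitude below 1, so the state decays toward zero under the feedback law.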
Control that remains stable under model uncertainty.
Learning physical parameters from data.
Learning action mapping directly from demonstrations.
Tradeoff between safety and performance.
Updating a pretrained model’s weights on task-specific data to improve performance or adapt style/behavior.
Learning only from data collected by the current policy.
Methods to set starting weights to preserve signal/gradient scales across layers.
The set of tokens a model can represent; impacts efficiency, multilinguality, and handling of rare strings.
Maximum number of tokens the model can attend to in one forward pass; constrains long-document reasoning.
A high-priority instruction layer setting overarching behavior constraints for a chat model.
Controlled experiment comparing variants by random assignment to estimate causal effects of changes.
Logging hyperparameters, code versions, data snapshots, and results to reproduce and compare experiments.
How many requests or tokens can be processed per unit time; affects scalability and cost.