Results for "direct preference optimization"
Constraining outputs to retrieved or provided sources, often with citation, to improve factual reliability.
Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.
Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.
Ordering training samples from easier to harder to improve convergence or generalization.
Training across many devices/silos without centralizing raw data; aggregates updates, not data.
Practices for operationalizing ML: versioning, CI/CD, monitoring, retraining, and reliable production management.
Converts logits to probabilities by exponentiation and normalization; common in classification and LMs.
Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.
Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.
Maliciously inserting or altering training data to implant backdoors or degrade performance.
Reconstructing a model or its capabilities via API queries or leaked artifacts.
AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.
Measures how much information an observable random variable carries about unknown parameters.
Assigning labels per pixel (semantic) or per instance (instance segmentation) to map object boundaries.
A narrow minimum of the loss landscape, often associated with poorer generalization.
Estimating parameters by maximizing likelihood of observed data.
A wide basin in the loss landscape, often correlated with better generalization.
Gradually increasing the learning rate at the start of training to avoid divergence.
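Attention variants that reduce the quadratic cost of self-attention in sequence length, for example through sparse, low-rank, or linearized approximations.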
Recovering training data from gradients.
Inferring sensitive features of training data.
Simultaneous Localization and Mapping: building a map of an unknown environment while tracking a robot's position within it.
Recovering 3D structure from images.
Predicting future values from past observations.
The compute, time, energy, and monetary cost of training a model.
Using production outcomes to improve models.
Measures the similarity between two vectors and the projection of one onto the other.
Sensitivity of a function to input perturbations.
Matrix of first-order derivatives for vector-valued functions.
Direction of steepest ascent of a function.
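
The last two entries (the matrix of first-order derivatives and the direction of steepest ascent) can be illustrated numerically. The following is a minimal sketch in plain Python that approximates both with central finite differences. The function names `numerical_gradient` and `numerical_jacobian`, the step size `h`, and the two example functions are assumptions chosen for illustration, not part of any particular library.

```python
from typing import Callable, List, Sequence


def numerical_gradient(
    f: Callable[[Sequence[float]], float],
    x: Sequence[float],
    h: float = 1e-6,  # assumed step size; tune for your function's scale
) -> List[float]:
    """Approximate the gradient of a scalar function f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        # Central difference along coordinate i.
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def numerical_jacobian(
    F: Callable[[Sequence[float]], Sequence[float]],
    x: Sequence[float],
    h: float = 1e-6,
) -> List[List[float]]:
    """Approximate the Jacobian of a vector-valued function F at x.

    Row i holds the partial derivatives of output i with respect to
    each input coordinate.
    """
    n_out = len(F(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        f_up, f_down = F(up), F(down)
        for i in range(n_out):
            jac[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return jac


# Hypothetical example functions.
def scalar_fn(v):
    x, y = v
    return x ** 2 + 3 * y          # analytic gradient: (2x, 3)


def vector_fn(v):
    x, y = v
    return [x * y, x + y]          # analytic Jacobian: [[y, x], [1, 1]]


grad = numerical_gradient(scalar_fn, [1.0, 2.0])   # approximately [2, 3]
jac = numerical_jacobian(vector_fn, [1.0, 2.0])    # approximately [[2, 1], [1, 1]]
```

For a scalar function the Jacobian has a single row, and that row is the gradient. Stepping a small distance along `grad` increases `scalar_fn` faster than stepping the same distance in any other direction, which is what "steepest ascent" means. In practice automatic differentiation is usually preferred over finite differences. This sketch only makes the definitions concrete.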