Search: direct preference optimization

Grounding Intermediate

Constraining outputs to retrieved or provided sources, often with citation, to improve factual reliability.

Foundations & Theory

LSTM Intermediate

An RNN variant using gates to mitigate vanishing gradients and capture longer context.

Foundations & Theory

Fine-Tuning Intermediate

Updating a pretrained model’s weights on task-specific data to improve performance or adapt style/behavior.

Large Language Models

Reward Model Intermediate

Model trained to predict human preferences (or utility) for candidate outputs; used in RLHF-style pipelines.

Foundations & Theory
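
As an illustration of how such a model might be trained, the sketch below computes a pairwise (Bradley-Terry style) preference loss from two scalar scores. The function name and the example scores are hypothetical; the glossary entry does not specify a particular loss.

```python
import math

def pairwise_preference_loss(r_chosen, r_rejected):
    """Loss -log(sigmoid(r_chosen - r_rejected)) for one preference pair."""
    diff = r_chosen - r_rejected
    # Branch on the sign of diff so math.exp never receives a large positive value.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

# Hypothetical scores: the loss is small when the preferred output scores higher.
loss_agree = pairwise_preference_loss(2.0, 0.0)
loss_disagree = pairwise_preference_loss(0.0, 2.0)
```

Minimizing this loss pushes the score of the human-preferred output above the score of the rejected one.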

Alignment Intermediate

Ensuring model behavior matches human goals, norms, and constraints, including reducing harmful or deceptive outputs.

Foundations & Theory

Curriculum Learning Intermediate

Ordering training samples from easier to harder to improve convergence or generalization.

Foundations & Theory
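
A minimal sketch of the ordering step, assuming sentence length as a stand-in difficulty score. Both the scoring choice and the sample data are illustrative assumptions; real curricula often use model loss or task-specific heuristics.

```python
def curriculum_order(samples, difficulty):
    """Return samples sorted from easiest to hardest under the given score."""
    return sorted(samples, key=difficulty)

# Hypothetical data: shorter strings are treated as easier.
sentences = ["the cat sat on the mat", "hi", "good morning"]
ordered = curriculum_order(sentences, difficulty=len)
```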

Federated Learning Intermediate

Training across many devices/silos without centralizing raw data; aggregates updates, not data.

Foundations & Theory
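
The aggregation step can be sketched as a weighted average of client parameters, in the style of federated averaging. The function name and the flat-list weight representation are assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two hypothetical clients holding 1 and 3 local examples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the parameter vectors and dataset sizes reach the server here; the raw examples stay on each client.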

MLOps Intermediate

Practices for operationalizing ML: versioning, CI/CD, monitoring, retraining, and reliable production management.

MLOps & Infrastructure

Softmax Intermediate

Converts logits to probabilities by exponentiation and normalization; common in classification and LMs.

Foundations & Theory
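
A short sketch of the computation in plain Python. Subtracting the maximum logit before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    # Shift by the max so the largest exponent is exp(0) = 1, avoiding overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```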

Pruning Intermediate

Removing weights or neurons to shrink models and improve efficiency; can be structured or unstructured.

Foundations & Theory
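
One simple unstructured variant is magnitude pruning, sketched below on a flat list of weights. The function name and the sparsity parameter are illustrative assumptions; structured pruning would instead remove whole neurons, channels, or heads.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:
        pruned[i] = 0.0
    return pruned

pruned = magnitude_prune([0.5, -0.1, 0.05, -2.0], sparsity=0.5)
```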

Adversarial Example Intermediate

Inputs crafted to cause model errors or unsafe behavior, often imperceptible in vision or subtle in text.

Foundations & Theory

Data Poisoning Intermediate

Maliciously inserting or altering training data to implant backdoors or degrade performance.

Foundations & Theory

Model Stealing Intermediate

Reconstructing a model or its capabilities via API queries or leaked artifacts.

Foundations & Theory

Computer Vision Intermediate

AI focused on interpreting images/video: classification, detection, segmentation, tracking, and 3D understanding.

Computer Vision

Fisher Information Intermediate

Measures how much information an observable random variable carries about unknown parameters.

AI Economics & Strategy

Segmentation Intermediate

Assigning labels per pixel (semantic) or per instance (instance segmentation) to map object boundaries.

Computer Vision

Sharp Minimum Intermediate

A narrow minimum of the loss surface, where small parameter changes sharply increase the loss; often associated with poorer generalization.

AI Economics & Strategy

Maximum Likelihood Estimation Intermediate

Estimating parameters by maximizing likelihood of observed data.

AI Economics & Strategy
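
As a worked instance, the sketch below computes the closed-form maximum likelihood estimates for a Gaussian. The choice of a Gaussian model and the sample values are assumptions for illustration.

```python
def gaussian_mle(samples):
    """Return the MLE of the mean and variance of a Gaussian."""
    n = len(samples)
    mu = sum(samples) / n
    # The MLE of the variance divides by n, not n - 1, so it is biased.
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

mu_hat, var_hat = gaussian_mle([1.0, 2.0, 3.0, 4.0])
```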

Flat Minimum Intermediate

A wide basin of the loss surface, where the loss changes little under small parameter perturbations; often correlated with better generalization.

AI Economics & Strategy

Warmup Intermediate

Gradually increasing learning rate at training start to avoid divergence.

AI Economics & Strategy
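
A minimal sketch of a linear warmup schedule. The default base learning rate and warmup length are arbitrary illustrative values, and the rate is simply held constant afterwards, whereas real schedules usually decay it.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Linearly ramp the learning rate up to base_lr, then hold it."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

# Learning rate at the first step, mid-warmup, and after warmup.
schedule = [warmup_lr(s) for s in (0, 499, 5000)]
```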

Sparse Attention Intermediate

Attention mechanisms that let each token attend to only a subset of positions, reducing the quadratic cost of full attention in sequence length.

AI Economics & Strategy
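
One common sparsity pattern is a sliding window, sketched below as a boolean mask. The function name and window size are illustrative assumptions; other patterns (strided, global tokens, learned) exist.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when position i may attend to position j."""
    return [
        [abs(i - j) <= window for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(seq_len=4, window=1)
# Number of allowed pairs grows roughly as seq_len * (2 * window + 1).
allowed = sum(sum(row) for row in mask)
```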

Gradient Leakage Intermediate

Recovering training data from shared gradients, for example those exchanged in federated or distributed training.

AI Economics & Strategy

Model Inversion Intermediate

Inferring sensitive features of training data from a model's outputs or parameters.

AI Economics & Strategy

SLAM Intermediate

Simultaneous Localization and Mapping: building a map of an unknown environment while estimating the agent's position within it; common in robotics.

Computer Vision

3D Reconstruction Intermediate

Recovering 3D structure from images.

Computer Vision

Forecasting Intermediate

Predicting future values from past observations.

Time Series

Training Cost Intermediate

The compute, hardware, energy, and engineering expense required to train a model.

AI Economics & Strategy

Feedback Loop Intermediate

Using production outcomes to improve models.

MLOps & Infrastructure

Inner Product Advanced

An operation mapping two vectors to a scalar; used to measure similarity and to compute projections of one vector onto another.

Mathematics
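
The sketch below uses the standard dot product on real vectors to show both uses: cosine similarity and projection. The function names are illustrative.

```python
import math

def dot(u, v):
    """Standard inner product of two equal-length real vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Inner product normalized by the vectors' lengths."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def project(u, v):
    """Projection of u onto the direction of v."""
    scale = dot(u, v) / dot(v, v)
    return [scale * b for b in v]

similarity = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors
shadow = project([3.0, 4.0], [1.0, 0.0])
```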

Condition Number Advanced

Measures how much a function's output can change relative to small perturbations of its input; large values indicate an ill-conditioned problem.

Mathematics
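
For a scalar function, the relative condition number is |x f'(x) / f(x)|. The sketch below estimates it with a central finite difference; the step size and the two test functions are illustrative assumptions.

```python
import math

def relative_condition(f, x, h=1e-6):
    """Estimate |x * f'(x) / f(x)| using a central finite difference."""
    deriv = (f(x + h) - f(x - h)) / (2 * h)
    return abs(x * deriv / f(x))

# The square root is well conditioned: relative errors are roughly halved.
well = relative_condition(math.sqrt, 4.0)
# Subtracting nearly equal numbers is ill conditioned: errors are amplified.
ill = relative_condition(lambda x: x - 1.0, 1.0001)
```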
