Next-Token Prediction

Intermediate

Training objective where the model predicts the next token given previous tokens (causal modeling).


Why It Matters

Next-token prediction is a fundamental concept in training language models, enabling them to generate coherent and contextually appropriate text. Its application is crucial in various AI-driven technologies, such as chatbots and writing assistants, making it a key component in the evolution of natural language processing.

Next-token prediction is a training objective used in language models, particularly autoregressive ones, in which the model predicts the subsequent token in a sequence given the preceding tokens. Mathematically, training maximizes the conditional probability P(w_t | w_1, w_2, ..., w_{t-1}), where w_t is the next token and w_1 to w_{t-1} are the previous tokens. This approach follows the principle of causal modeling: the model generates sequences in a unidirectional manner, conditioning only on past information. Training typically uses cross-entropy loss to measure the discrepancy between the predicted probability distribution and the actual next token. Next-token prediction is foundational across architectures, including transformers and recurrent neural networks, and underpins applications such as text generation, dialogue systems, and machine translation.
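The objective above can be sketched with a toy model. The snippet below is a minimal illustration, not a production implementation: it estimates P(w_t | w_{t-1}) from bigram counts on a tiny hypothetical corpus, then scores a sequence with the average cross-entropy (negative log-likelihood) that training would minimize. All names and the corpus are invented for the example.

```python
import math
from collections import defaultdict

# Tiny hypothetical corpus for illustration only.
corpus = "the cat sat on the mat the cat ran".split()

# Count bigrams: how often each token follows each preceding token.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_prob(prev, nxt):
    """Conditional probability P(nxt | prev) estimated from bigram counts."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total

def cross_entropy(sequence):
    """Average negative log-likelihood of each next token given its predecessor."""
    nll = [-math.log(next_token_prob(p, n))
           for p, n in zip(sequence, sequence[1:])]
    return sum(nll) / len(nll)

print(next_token_prob("the", "cat"))          # "cat" follows "the" 2 of 3 times -> 2/3
print(cross_entropy("the cat sat".split()))   # lower is better; 0 would mean perfect prediction
```

A real language model replaces the count table with a neural network that conditions on the entire preceding context rather than a single token, but the loss being minimized is the same average cross-entropy over next-token predictions.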

