The fundamental task that language models are trained on: given a sequence of tokens, predict what comes next. Despite its apparent simplicity, training on this objective at massive scale produces models that can reason, code, translate, and write creatively. This is the core insight behind GPT and similar models.
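To make the objective concrete, here is a minimal sketch using a toy bigram model (simple follow-counts rather than a neural network); the corpus and function names are illustrative only:

```python
# Minimal sketch of next-token prediction with a toy bigram model:
# count which token tends to follow which. Illustrative only; a real
# language model learns these statistics with a neural network.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# For each token, count how often each other token follows it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent continuation of `token` in the corpus."""
    counts = follows[token]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # 'cat', the most frequent continuation
```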
A model that generates output one token at a time, with each new token conditioned on all of the previous ones.
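A hedged sketch of the generation loop itself; `next_token_distribution` is a hypothetical stand-in for a real model, which would score the entire context at each step:

```python
# Sketch of autoregressive generation: sample a token, append it to the
# context, and repeat, so every new token depends on all previous ones.
import random

def next_token_distribution(context):
    # Toy stand-in: a real model would assign probabilities based on
    # the whole context rather than a uniform distribution.
    vocab = ["the", "cat", "sat", "mat", "<eos>"]
    return {tok: 1.0 / len(vocab) for tok in vocab}

def generate(prompt, max_new_tokens=10):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(tokens)   # condition on full history
        tok = random.choices(list(dist), weights=list(dist.values()))[0]
        if tok == "<eos>":
            break
        tokens.append(tok)                       # feed the new token back in
    return tokens

print(generate(["the"]))
```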
An AI model trained on large volumes of text to interpret and generate human language.
The initial, expensive phase of training, in which a model learns general-purpose patterns from a massive dataset.
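As a rough illustration (not any production recipe), a pre-training step reduces to sliding over tokenized text and minimizing cross-entropy on the next token. The sketch below uses PyTorch with a deliberately tiny stand-in model; all sizes and data are made up:

```python
# Sketch of a pre-training loop: the label for each token is simply
# the token that follows it. The embedding-plus-linear "model" is a
# hypothetical stand-in for a real transformer.
import torch
import torch.nn as nn

vocab_size, dim = 100, 32
model = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (1000,))  # stand-in for a tokenized corpus

for step in range(100):
    i = torch.randint(0, len(tokens) - 1, (64,))  # random positions in the corpus
    inputs, targets = tokens[i], tokens[i + 1]    # the next token is the label
    loss = loss_fn(model(inputs), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
```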
A mathematical function applied to a neuron's output that introduces non-linearity into the network.
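For instance, two widely used activation functions, sketched in plain Python (input values are illustrative):

```python
# Two common activation functions, applied elementwise to a neuron's
# pre-activation output.
import math

def relu(x):
    """ReLU: passes positive values through, zeroes out negatives."""
    return max(0.0, x)

def sigmoid(x):
    """Sigmoid: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

for z in (-2.0, 0.0, 2.0):
    print(z, relu(z), sigmoid(z))
```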
An optimization algorithm that combines the strengths of two earlier methods, AdaGrad and RMSProp, maintaining per-parameter adaptive learning rates along with a momentum-like average of past gradients.
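A sketch of the standard single-parameter Adam update (per Kingma and Ba, 2015); the hyperparameter defaults shown are the commonly cited ones, and the quadratic toy objective is illustrative:

```python
# One Adam update for a single scalar parameter: a momentum-style
# first-moment estimate and an RMSProp-style second-moment estimate,
# both bias-corrected for the early steps.
import math

def adam_step(param, grad, m, v, t, lr=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

p, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    grad = 2 * p                              # gradient of f(p) = p**2
    p, m, v = adam_step(p, grad, m, v, t)
print(p)  # p steps toward the minimum at 0
```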
Artificial General Intelligence: a hypothetical AI system with human-level or greater capability across virtually all cognitive tasks, rather than a single narrow domain.