The basic unit of text that language models work with. Not quite a word: tokens can be whole words, parts of words, or even single characters. 'Understanding' might be one token; 'un' + 'der' + 'standing' might be three. English text averages roughly 1.3 tokens per word, though the exact ratio depends on the tokenizer. A model's token limit defines its context window.
The component that converts raw text into the token IDs a language model can process, and that decodes the model's output tokens back into text.
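A minimal sketch of both ideas, assuming the open-source `tiktoken` library and its `cl100k_base` encoding (any tokenizer would do, though the exact splits differ):

```python
# Minimal tokenization sketch. Assumes the open-source `tiktoken` package
# and its "cl100k_base" encoding; other tokenizers split text differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "Understanding tokenizers makes token limits less mysterious."
token_ids = enc.encode(text)                    # text -> integer token IDs
pieces = [enc.decode([i]) for i in token_ids]   # the text each ID covers

print(len(text.split()), "words ->", len(token_ids), "tokens")
print(pieces)                         # whole words, sub-words, leading spaces
print(enc.decode(token_ids) == text)  # decoding round-trips to the original
```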
The maximum amount of text a language model can handle at once, measured in tokens; for most models this budget covers both the prompt and the generated response.
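A hedged sketch of why the limit matters in practice: count a prompt's tokens and trim it to fit the window before sending it. The 8,192-token window, the reply reservation, and the `tiktoken` encoding below are illustrative assumptions, not any specific model's real limit:

```python
# Sketch of budgeting a prompt against a context window. The 8,192-token
# limit and the "cl100k_base" encoding are illustrative assumptions.
import tiktoken

CONTEXT_WINDOW = 8192        # hypothetical limit, in tokens
RESERVED_FOR_REPLY = 1024    # leave room for the model's output

enc = tiktoken.get_encoding("cl100k_base")

def fit_prompt(prompt: str) -> str:
    """Truncate a prompt so prompt + reply stays inside the window."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_REPLY
    ids = enc.encode(prompt)
    if len(ids) <= budget:
        return prompt
    return enc.decode(ids[:budget])   # keep only the first `budget` tokens

print(len(enc.encode(fit_prompt("word " * 20000))))  # stays within the budget
```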
An AI model trained on large amounts of text to represent and generate human language, typically by learning to predict the next token in a sequence.
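As one illustrative sketch, assuming the Hugging Face `transformers` library and the small `gpt2` checkpoint (chosen only for convenience), a language model can be asked to continue a piece of text:

```python
# Illustrative text generation. Assumes the `transformers` library and the
# small `gpt2` checkpoint; any causal language model behaves similarly.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

result = generator("A language model is", max_new_tokens=20)
print(result[0]["generated_text"])   # the prompt plus the model's continuation
```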
A mathematical function applied to a neuron's output that introduces non-linearity into the network; common choices include ReLU, sigmoid, tanh, and GELU.
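A quick sketch of the idea using ReLU and tanh, two common activation functions (plain NumPy, no framework assumed):

```python
# Two common activation functions applied element-wise to a neuron's
# pre-activation outputs. Without them, stacked linear layers would
# collapse into a single linear transformation.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)   # zero out negatives, pass positives through

def tanh(x):
    return np.tanh(x)           # squash outputs into (-1, 1)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # example pre-activations
print(relu(z))   # [0.  0.  0.  0.5 2. ]
print(tanh(z))   # [-0.964 -0.462  0.     0.462  0.964]
```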
An optimization algorithm that combines the strengths of two earlier methods, AdaGrad and RMSProp: it adapts the learning rate of each parameter using running averages of both the gradients and their squared magnitudes.
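This is the standard description of the Adam optimizer. As a hedged sketch of its update rule, here with textbook default hyperparameters and a toy quadratic objective chosen purely for illustration (real training would use a framework implementation such as `torch.optim.Adam`):

```python
# Sketch of the Adam update rule with its usual defaults, minimizing
# f(theta) = theta**2 purely for illustration.
import numpy as np

lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8
theta = 5.0
m = v = 0.0                       # running averages of gradient and squared gradient

for t in range(1, 1001):
    grad = 2 * theta              # gradient of theta**2
    m = beta1 * m + (1 - beta1) * grad          # momentum-style average (first moment)
    v = beta2 * v + (1 - beta2) * grad**2       # RMSProp-style average (second moment)
    m_hat = m / (1 - beta1**t)    # bias correction for the running averages
    v_hat = v / (1 - beta2**t)
    theta -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(round(theta, 4))            # near 0.0, the minimum of theta**2
```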
Artificial General Intelligence: a hypothetical AI system able to match or exceed human performance across a wide range of cognitive tasks, rather than excelling only in a single narrow domain.