The part of a neural network that generates output from an internal representation. In encoder-decoder transformers, the decoder produces tokens one at a time, attending to both the encoded input and previously generated tokens. GPT-style models are decoder-only architectures: they have no separate encoder and attend only to the tokens seen so far.
The part of a neural network that processes input data into an internal representation.
The neural network architecture behind virtually all modern AI language models.
A model that generates output one piece at a time, with each new piece depending on all the previous ones.
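To make the "one piece at a time" idea concrete, here is a minimal sketch of a generation loop. The names `generate` and `toy_next_token` are hypothetical, and the toy rule (sum of the prefix modulo 10) is only a stand-in for a real model's prediction.

```python
from typing import Callable, List


def generate(next_token: Callable[[List[int]], int],
             prompt: List[int], steps: int) -> List[int]:
    """Extend `prompt` one token at a time; each step sees the whole prefix."""
    tokens = list(prompt)
    for _ in range(steps):
        # The new token is a function of everything generated so far.
        tokens.append(next_token(tokens))
    return tokens


def toy_next_token(prefix: List[int]) -> int:
    # Hypothetical stand-in for a model: sum of all previous tokens, mod 10.
    return sum(prefix) % 10


print(generate(toy_next_token, [1, 2], 4))  # prints [1, 2, 3, 6, 2, 4]
```

Because every step feeds the full prefix back in, changing an early token changes everything generated after it.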
A mathematical function applied to a neuron's output that introduces non-linearity into the network.
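As an illustration, the sketch below applies two common choices, ReLU and sigmoid, to the weighted sum of a single neuron. The helper name `neuron` and the example weights are assumptions made for this example only.

```python
import math
from typing import Callable, Sequence


def relu(x: float) -> float:
    """Pass positive values through unchanged; clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs: Sequence[float], weights: Sequence[float], bias: float,
           activation: Callable[[float], float]) -> float:
    # Linear part: weighted sum of the inputs plus a bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Non-linear part: the activation function applied to that sum.
    return activation(z)


# Weighted sum is 0.5*1.0 + (-1.0)*2.0 = -1.5, which ReLU clamps to zero.
print(neuron([1.0, 2.0], [0.5, -1.0], 0.0, relu))  # prints 0.0
print(sigmoid(0.0))                                # prints 0.5
```

Without the activation step, stacking such neurons would only ever compute another linear function of the inputs.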
An optimization algorithm that combines ideas from two other methods, AdaGrad and RMSProp.
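Assuming this entry refers to the Adam optimizer, the following is a rough single-variable sketch of the update rule as it is commonly stated: a running average of gradients, a running average of squared gradients, and a bias correction for both. The function name `adam_minimize`, the learning rate of 0.1, and the toy objective (x - 3)^2 are choices made for this example, not part of the definition.

```python
import math
from typing import Callable


def adam_minimize(grad: Callable[[float], float], x: float, steps: int = 500,
                  lr: float = 0.1, beta1: float = 0.9, beta2: float = 0.999,
                  eps: float = 1e-8) -> float:
    """Minimize a one-dimensional function given its gradient."""
    m = 0.0  # running average of gradients
    v = 0.0  # running average of squared gradients
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction: both averages start at zero and are too small early on.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Step size is scaled down where recent gradients have been large.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
result = adam_minimize(lambda x: 2.0 * (x - 3.0), x=0.0)
print(result)  # should land close to 3
```

Real training applies the same update to every parameter of the network independently, usually with a much smaller learning rate.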
Artificial General Intelligence.