Cross-Attention: An attention mechanism where one sequence attends to a different sequence. In encoder-decoder models, the decoder uses cross-attention to look at the encoder's output. In text-to-image models, it is how the image generation process attends to the text prompt. Distinct from self-attention, where a sequence attends to itself.
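A minimal NumPy sketch of a single cross-attention head, building on the scaled dot-product form described under Attention below. All function and variable names here are illustrative, not taken from any particular library.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries_from, context, Wq, Wk, Wv):
    # Queries come from one sequence; keys and values come from the other.
    Q = queries_from @ Wq
    K = context @ Wk
    V = context @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (num_queries, num_context)
    return softmax(scores) @ V                # one output per query position

rng = np.random.default_rng(0)
d = 8
decoder_states = rng.normal(size=(5, d))      # 5 target positions
encoder_output = rng.normal(size=(7, d))      # 7 source positions
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = cross_attention(decoder_states, encoder_output, Wq, Wk, Wv)
print(out.shape)  # (5, 8): each decoder position mixes encoder values
```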
Attention: A mechanism that lets neural networks focus on the most relevant parts of their input when producing output.
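The most common form is scaled dot-product attention: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of that formula (illustrative names, single head, no masking):

```python
import numpy as np

def attention(Q, K, V):
    # Score each query against each key; scaling by sqrt(d_k) keeps the
    # dot products from growing with dimension and saturating the softmax.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax turns each query's scores into weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # The output is a weighted average of the value vectors.
    return weights @ V

rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8): one output row per query
```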
Self-Attention: An attention mechanism where a sequence attends to itself; each element looks at every element of the same sequence (including itself) to understand their relationships.
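A minimal sketch under the same scaled dot-product form: self-attention simply derives the queries, keys, and values from one and the same sequence (the projection matrices are illustrative).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    # Q, K, and V all come from the same sequence x.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # every position scores every position
    return softmax(scores) @ V                # output has the same length as x

rng = np.random.default_rng(2)
d = 8
x = rng.normal(size=(6, d))                   # one sequence of 6 elements
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)    # (6, 8)
```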
Encoder-Decoder: A neural network architecture with two parts: an encoder that processes the input into a representation, and a decoder that generates the output from that representation.
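A toy sketch of the two-part shape, using simple feed-forward stand-ins for both halves; real encoder-decoder models (e.g. seq2seq Transformers) use attention layers, and every name and shape here is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    # Encoder: map the input sequence into an internal representation.
    return np.tanh(x @ W)

def decode(h, W):
    # Decoder: generate output from the encoder's representation.
    return h @ W

x = rng.normal(size=(10, 16))   # input sequence: 10 tokens, 16 features each
W_enc = rng.normal(size=(16, 8))
W_dec = rng.normal(size=(8, 16))
h = encode(x, W_enc)            # (10, 8) compressed representation
y = decode(h, W_dec)            # (10, 16) output generated from h
print(h.shape, y.shape)
```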
Activation Function: A mathematical function applied to a neuron's output that introduces non-linearity into the network.
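A few common examples in NumPy. The non-linearity matters because a stack of purely linear layers collapses into a single linear transformation; it is the activation function that lets depth add expressive power.

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # values strictly between 0 and 1
```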
Adam: An optimization algorithm that combines the advantages of two earlier methods: AdaGrad, which handles sparse gradients well, and RMSProp, which adapts to non-stationary objectives.
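A minimal sketch of the Adam update rule from Kingma & Ba (2015), with the paper's default hyperparameters; the function name and the toy objective below are illustrative.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # 1st moment: moving average of gradients
    v = beta2 * v + (1 - beta2) * grad**2     # 2nd moment: as in RMSProp/AdaGrad
    m_hat = m / (1 - beta1**t)                # bias correction for the
    v_hat = v / (1 - beta2**t)                # zero-initialized moments
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
print(theta)  # close to 0
```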
Artificial General Intelligence (AGI): A hypothetical AI system with broad, human-level cognitive abilities across many different tasks, in contrast to today's narrow systems that excel at specific tasks.