The attention mechanism is a technique that lets neural networks focus on the most relevant parts of their input when producing output. Instead of treating all input tokens equally, attention assigns different weights to different tokens based on their relevance to the current step. It's the core innovation behind transformers and modern LLMs.
Introduced in the landmark "Attention Is All You Need" paper (2017), the attention mechanism revolutionized NLP. In self-attention, each token in a sequence computes how much it should "attend to" every other token. This allows models to capture long-range dependencies — for example, connecting a pronoun to the noun it refers to even if they're far apart in the text.
When translating "The cat sat on the mat because it was tired," attention helps the model understand that "it" refers to "the cat" by assigning high attention weights between those tokens.
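The weight computation described above can be sketched in a few lines of NumPy. This is a minimal illustration of scaled dot-product self-attention, not a production implementation; the dimensions and projection matrices here are made-up examples:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence.

    X: (seq_len, d_model) token embeddings
    Wq, Wk, Wv: (d_model, d_k) learned projection matrices (random here)
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_len, seq_len): token-to-token relevance
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights         # output: weighted mix of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))             # 5 tokens, embedding dimension 8
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape, weights.shape)         # (5, 4) (5, 5)
```

Row i of `weights` is how much token i attends to every other token; in the pronoun example above, a trained model would place a large weight in the row for "it" at the column for "cat".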