What They Are
Computers don't understand words. They understand numbers. An embedding converts something meaningful (a word, a sentence, an image) into a vector: a list of numbers, typically hundreds or thousands of them, that captures its meaning.
The breakthrough: similar things get similar numbers. The embedding for "king" is close to "queen." The embedding for "dog" is close to "puppy" and far from "airplane." The embedding for a photo of a sunset is close to text saying "beautiful sunset."
This lets you do something powerful: measure how similar two things are by comparing their embeddings. That's the foundation of semantic search, recommendation systems, and RAG.
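"Comparing embeddings" usually means cosine similarity: the cosine of the angle between two vectors, close to 1 for similar meanings and lower for unrelated ones. A minimal sketch with made-up 4-dimensional toy vectors (real embeddings come from a trained model and have far more dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: ~1.0 = same direction, ~0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings", invented purely for illustration.
dog      = [0.8, 0.1, 0.6, 0.2]
puppy    = [0.7, 0.2, 0.7, 0.1]
airplane = [0.1, 0.9, 0.1, 0.8]

print(cosine_similarity(dog, puppy))     # close to 1: similar meaning
print(cosine_similarity(dog, airplane))  # much lower: unrelated
```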
Why They Matter
Embeddings quietly power some of the most important AI applications:
Semantic search: Instead of matching keywords, search by meaning. "How to fix a leaky faucet" matches articles about "plumbing repair" because their embeddings are similar even though the words are different.
RAG systems: When you build a RAG pipeline, you embed your documents and the user's question, then find the documents with the most similar embeddings. That's how the system knows which information is relevant.
Recommendations: Netflix, Spotify, and Amazon embed content and user preferences into the same space, then recommend items whose embeddings are close to those of things the user has already liked.
Clustering and classification: Group similar items together by clustering their embeddings. Find outliers by looking for embeddings far from any cluster.
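All four applications reduce to the same operation: embed everything into one space, then rank by similarity. A minimal retrieval sketch, assuming the document embeddings were already produced by some model (the vectors and titles here are invented toys):

```python
import numpy as np

# Hypothetical precomputed document embeddings; in practice a model produces these.
docs = {
    "fixing a leaky faucet":   np.array([0.9, 0.1, 0.3]),
    "plumbing repair basics":  np.array([0.8, 0.2, 0.4]),
    "training for a marathon": np.array([0.1, 0.9, 0.2]),
}

def rank(query_vec):
    """Rank documents by cosine similarity to the query vector, best first."""
    q = query_vec / np.linalg.norm(query_vec)
    scored = [(float(v @ q / np.linalg.norm(v)), title) for title, v in docs.items()]
    return [title for _, title in sorted(scored, reverse=True)]

# Stand-in for the embedded question "how to fix a leaky faucet".
query = np.array([0.85, 0.15, 0.35])
print(rank(query))  # both plumbing documents rank above the marathon one
```

This brute-force scan is exactly what a RAG pipeline does at small scale; vector databases exist to do the same ranking over millions of documents.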
How They Work
Embeddings are produced by neural networks trained specifically for the task. The network processes the input (text, image, audio) and outputs a fixed-size vector of numbers.
For text, models like OpenAI's text-embedding-3, Cohere's Embed, or open-source models like BGE and E5 convert text into vectors of 768, 1024, or 1536 dimensions. Each dimension captures some aspect of meaning — not interpretable by humans, but mathematically useful.
The training process teaches the model that similar inputs should have similar vectors. One common approach: show the model pairs of similar sentences and dissimilar sentences, and train it so similar pairs end up close together in the vector space (contrastive learning).
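One simple form of that objective is a triplet loss: given an anchor, a similar example (positive), and a dissimilar one (negative), the loss is zero only when the positive is closer than the negative by some margin. A sketch of just the loss on toy vectors (real training backpropagates this through the embedding network):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero if the positive is already closer than the negative by `margin`, else > 0."""
    d_pos = np.linalg.norm(anchor - positive)   # distance to the similar example
    d_neg = np.linalg.norm(anchor - negative)   # distance to the dissimilar example
    return max(0.0, d_pos - d_neg + margin)

anchor   = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])   # nearby: pair is already well placed
negative = np.array([0.0, 1.0])   # far away: already separated, so loss is 0
print(triplet_loss(anchor, positive, negative))
```

Minimizing this over many (anchor, positive, negative) triples is what pulls similar inputs together and pushes dissimilar ones apart.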
A useful side effect: embeddings can encode relationships as directions in the space. The classic example, famously demonstrated with word2vec-era word vectors: king - man + woman ≈ queen. Vector arithmetic captures the semantic relationship between these concepts.
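With toy vectors hand-crafted so the analogy holds exactly (learned word vectors only approximate this), the arithmetic looks like:

```python
import numpy as np

# Dimensions invented for illustration: roughly (maleness, royalty, filler).
king  = np.array([0.9, 0.8, 0.1])
man   = np.array([0.9, 0.1, 0.1])
woman = np.array([0.1, 0.1, 0.1])
queen = np.array([0.1, 0.8, 0.1])

result = king - man + woman   # remove "maleness", add "femaleness", keep "royalty"
print(np.allclose(result, queen))  # True for these hand-picked vectors
```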
Vector databases like Pinecone, Weaviate, Chroma, and Qdrant are specifically designed to store and search embeddings efficiently. They use approximate nearest neighbor algorithms to find similar vectors among millions in milliseconds.
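The core trick behind approximate nearest neighbor search is avoiding a full scan. One illustrative technique (among several such systems use) is random-hyperplane hashing: vectors that fall on the same side of a set of random hyperplanes land in the same bucket, so a query only scans its own bucket. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_PLANES = 8, 4
planes = rng.normal(size=(N_PLANES, DIM))  # random hyperplanes through the origin

def bucket(vec):
    """Hash a vector by which side of each hyperplane it falls on (2^N_PLANES buckets)."""
    return tuple((planes @ vec > 0).astype(int))

# Index 1,000 random vectors into buckets.
vectors = rng.normal(size=(1000, DIM))
index = {}
for i, v in enumerate(vectors):
    index.setdefault(bucket(v), []).append(i)

# A query scans only the candidates in its bucket, not all 1,000 vectors.
query = rng.normal(size=DIM)
candidates = index.get(bucket(query), [])
print(len(candidates), "candidates instead of", len(vectors))
```

Production indexes (HNSW graphs, IVF, product quantization) are far more sophisticated, but the trade is the same: give up a little recall for a large speedup.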
Key Examples
Google Search: Uses embeddings to understand queries beyond keyword matching. Searching "tips for first marathon" returns results about "beginner running advice" because the embeddings are close.
Spotify: Embeds songs and listener preferences to power Discover Weekly playlists.
GitHub Copilot: Embeds code to understand context and suggest relevant completions.
CLIP: OpenAI's model that embeds images and text into a shared space, enabling text-based image search and guiding text-to-image diffusion models.
Where to Go Next
- → RAG — embeddings in action for knowledge retrieval
- → Transformers — the models that produce embeddings
- → Multimodal AI — embedding different data types together
- → NLP — text understanding powered by embeddings