A database optimized for storing and searching high-dimensional vectors (embeddings). When you build a RAG system, you store document embeddings in a vector database and search it with query embeddings. Pinecone, Weaviate, Chroma, and pgvector are popular options. Vector databases are essential infrastructure for many AI applications.
A vector database is a specialized database designed to store and search embedding vectors efficiently. Traditional databases search by exact matches or text patterns. Vector databases search by similarity: they find the vectors closest to a query vector in high-dimensional space, typically measured by cosine similarity or Euclidean distance. This is the backbone of RAG systems, semantic search, and recommendation engines.
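To make "search by similarity" concrete, here is a minimal sketch of an exact (brute-force) nearest-neighbor search using cosine similarity. The three-dimensional vectors and the document names are made up for illustration; real embeddings typically have hundreds or thousands of dimensions and come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, vectors, k=2):
    """Score every stored vector against the query and return the top-k keys."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]

# Toy "embeddings" (hypothetical values, not produced by a real model).
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.1],
    "stocks": [0.0, 0.1, 0.9],
}

top = nearest([1.0, 0.0, 0.0], docs, k=2)
print(top)  # ['cats', 'dogs']
```

Because this compares the query against every stored vector, it is only practical for small collections.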
The technical challenge is speed. A brute-force search compares the query against every stored vector, so its cost grows linearly with the size of the collection and becomes too slow at millions of vectors. Vector databases use approximate nearest neighbor (ANN) algorithms, such as HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index), that trade a small amount of accuracy for large speed improvements. Popular options include Pinecone (managed service), Weaviate (open-source), Qdrant, Chroma, and pgvector (PostgreSQL extension).
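The idea behind IVF can be sketched in a few lines: group vectors into lists by their nearest centroid, then search only the lists whose centroids are closest to the query. This is a simplified illustration, not how any particular database implements it. The centroids here are fixed by hand as an assumption (real systems usually learn them with k-means), and `nprobe` is the name commonly used for the number of lists to scan.

```python
import math

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign each vector to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for key, vec in vectors.items():
        nearest_c = min(range(len(centroids)), key=lambda i: dist(vec, centroids[i]))
        lists[nearest_c].append((key, vec))
    return lists

def ivf_search(query, centroids, lists, k=1, nprobe=1):
    """Scan only the nprobe lists closest to the query, then rank candidates."""
    probe = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))[:nprobe]
    candidates = [item for i in probe for item in lists[i]]
    candidates.sort(key=lambda item: dist(query, item[1]))
    return [key for key, _ in candidates[:k]]

# Toy data; centroids are hand-picked for illustration (an assumption).
vectors = {"a": [0.1, 0.2], "b": [0.2, 0.1], "c": [0.9, 0.8], "d": [0.8, 0.9]}
centroids = [[0.0, 0.0], [1.0, 1.0]]

lists = build_ivf(vectors, centroids)
result = ivf_search([0.85, 0.8], centroids, lists, k=1, nprobe=1)
print(result)  # ['c']
```

With `nprobe=1` only half of the vectors are examined. The approximation comes from the chance that the true nearest neighbor sits in a list that was not scanned; raising `nprobe` reduces that risk at the cost of speed.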
Choosing the right vector database depends on your scale and requirements. For prototypes and small datasets, Chroma or pgvector work fine and keep your stack simple. For production systems with millions of vectors and low-latency requirements, dedicated solutions like Pinecone or Weaviate handle the infrastructure concerns. Key considerations include indexing speed, query latency, filtering capabilities (searching vectors within a subset), and whether you need to store metadata alongside the vectors. For RAG applications, the vector database is often the most important infrastructure decision after choosing the LLM itself.
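Filtering, meaning a similarity search restricted to a subset of records, can be sketched as a metadata check followed by ranking. The record layout (`id`, `vector`, `metadata`) and the `where` parameter are hypothetical names chosen for this example; each database exposes its own filter syntax.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def filtered_search(query, records, where, k=3):
    """Keep records whose metadata matches every key in `where`, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(field) == value for field, value in where.items())
    ]
    candidates.sort(key=lambda r: cosine(query, r["vector"]), reverse=True)
    return [r["id"] for r in candidates[:k]]

# Toy records with made-up vectors and metadata.
records = [
    {"id": "doc1", "vector": [1.0, 0.0], "metadata": {"lang": "en"}},
    {"id": "doc2", "vector": [0.9, 0.1], "metadata": {"lang": "de"}},
    {"id": "doc3", "vector": [0.5, 0.5], "metadata": {"lang": "en"}},
]

hits = filtered_search([1.0, 0.0], records, where={"lang": "en"}, k=2)
print(hits)  # ['doc1', 'doc3']
```

Here `doc2` is more similar to the query than `doc3` but is excluded because its metadata does not match. How well a database combines such filters with its ANN index is one of the things worth checking when comparing options.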
"We store 2 million document embeddings in Pinecone and query it for every user question — retrieval takes about 50ms and feeds the top 5 results to Claude as context."