Vector Database
A database optimized for storing and querying high-dimensional vector embeddings at scale.
A vector database is a specialized database system designed to store, index, and query embedding vectors: high-dimensional numerical representations of text, images, and other data. Unlike traditional databases optimized for exact matches, vector databases are built for approximate nearest-neighbor (ANN) search, which finds the items most semantically similar to a query quickly, accepting a small loss in accuracy in exchange for speed.
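As a point of reference, here is a minimal sketch of exact (brute-force) nearest-neighbor search, the baseline that ANN search approximates. The three-dimensional `toy_embeddings` and the helper names are made up for illustration; real embeddings usually have hundreds or thousands of dimensions and come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, vectors, k=2):
    """Score every stored vector against the query and return the top-k keys."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]

# Hypothetical toy embeddings, invented for this example.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
}

print(nearest([1.0, 0.1, 0.0], toy_embeddings, k=2))
```

This scan touches every stored vector, so its cost grows linearly with the collection size. That cost is what ANN indexes are designed to avoid.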
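Popular options include the dedicated vector databases Pinecone, Weaviate, Qdrant, and Chroma, as well as pgvector, an extension that adds vector search to PostgreSQL. To avoid comparing a query against every stored vector, they rely on ANN indexing algorithms such as HNSW (Hierarchical Navigable Small World), a layered proximity graph, and IVF (Inverted File Index), which partitions vectors into clusters and searches only the closest ones. With such indexes, large deployments can serve collections of up to billions of vectors with query latencies in the millisecond range, depending on hardware and index configuration.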
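The idea behind an IVF-style index can be sketched in a few lines. This is a simplified illustration, not how any particular product implements it: the centroids are given by hand here (in practice they are usually learned with a clustering step such as k-means), and the names `build_ivf`, `ivf_search`, and `nprobe` are chosen for this example.

```python
import math
from collections import defaultdict

def build_ivf(vectors, centroids):
    """Assign each vector to the inverted list of its closest centroid."""
    lists = defaultdict(list)
    for key, vec in vectors.items():
        cell = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        lists[cell].append(key)
    return lists

def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Search only the nprobe inverted lists whose centroids are closest to the query."""
    cells = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [key for cell in cells[:nprobe] for key in lists.get(cell, [])]
    candidates.sort(key=lambda key: math.dist(query, vectors[key]))
    return candidates[:k]

# Hypothetical 2-D data with two obvious clusters.
points = {"a": [0.0, 0.1], "b": [0.2, 0.0], "c": [5.0, 5.1], "d": [5.2, 4.9]}
centroids = [[0.0, 0.0], [5.0, 5.0]]  # assumed to be precomputed

lists = build_ivf(points, centroids)
print(ivf_search([4.9, 5.0], points, centroids, lists, k=1, nprobe=1))
```

With `nprobe=1` only half of the points are compared against the query. Raising `nprobe` checks more clusters, which improves recall at the cost of speed, and that is the tradeoff that makes the search approximate.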
Key Features
- ANN search — find similar vectors without checking all entries
- Metadata filtering — combine vector similarity with structured filters
- Scalability — handle millions to billions of vectors
- Real-time updates — add/delete vectors without rebuilding the index
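To make metadata filtering and real-time updates concrete, here is a toy in-memory store. It is only a sketch: `ToyVectorStore` and its `upsert`, `delete`, and `query` methods are hypothetical names for this example, not the API of any product listed above, and it uses an exact scan rather than an ANN index.

```python
import math

class ToyVectorStore:
    """In-memory illustration of upsert, delete, and filtered similarity queries."""

    def __init__(self):
        self._items = {}  # item_id -> (vector, metadata)

    def upsert(self, item_id, vector, metadata):
        # Insert a new vector or overwrite an existing one; no rebuild step is needed.
        self._items[item_id] = (vector, metadata)

    def delete(self, item_id):
        self._items.pop(item_id, None)

    def query(self, vector, k=3, where=None):
        # Apply the structured filter first, then rank survivors by Euclidean distance.
        where = where or {}
        matches = [
            (math.dist(vector, vec), item_id)
            for item_id, (vec, meta) in self._items.items()
            if all(meta.get(field) == value for field, value in where.items())
        ]
        matches.sort()  # smallest distance first
        return [item_id for _, item_id in matches[:k]]

store = ToyVectorStore()
store.upsert("doc1", [0.1, 0.9], {"lang": "en"})
store.upsert("doc2", [0.2, 0.8], {"lang": "de"})
store.upsert("doc3", [0.9, 0.1], {"lang": "en"})

print(store.query([0.2, 0.8], k=2, where={"lang": "en"}))  # doc2 is excluded by the filter
store.delete("doc1")
print(store.query([0.2, 0.8], k=2, where={"lang": "en"}))
```

The sketch filters before ranking. Real systems differ in whether filters are applied before, during, or after the index search, which affects both speed and result quality.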
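Vector databases are a core component of many enterprise AI applications. They give LLMs access to private knowledge without retraining: relevant documents are retrieved by similarity at query time and passed to the model as context, a pattern known as retrieval-augmented generation (RAG). This makes them the infrastructure layer of most RAG-based products. Choosing the right vector DB involves tradeoffs among hosted versus self-hosted deployment, query speed, cost, and filtering capabilities.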
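The retrieval step that feeds private knowledge to an LLM might look roughly like the sketch below. Everything here is an assumption for illustration: `toy_embed` is a word-count stand-in for a real embedding model, the `private_docs` passages are invented, and the resulting prompt would be sent to whatever LLM the application uses.

```python
import math

# Tiny fixed vocabulary for the stand-in embedding (hypothetical).
VOCAB = ["refund", "shipping", "days", "policy", "password", "reset"]

def toy_embed(text):
    """Stand-in for an embedding model: counts vocabulary words in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question, passages, k=1):
    """Return the k passages whose embeddings are most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, toy_embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(question, passages, k=1):
    """Place the retrieved passages in the prompt as context for the model."""
    context = "\n".join(retrieve(question, passages, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

private_docs = [
    "refund policy refunds are issued within 14 days",
    "password reset links expire after one hour",
]

print(build_prompt("how do i reset my password", private_docs))
```

In a production setup the vector database would replace the `retrieve` function, and the passages would be embedded once at ingestion time rather than on every query.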