Definition
A vector database is a database optimised for storing and querying high-dimensional vectors — typically embeddings — supporting fast approximate-nearest-neighbour (ANN) search across millions or billions of vectors with sub-second latency.
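Traditional databases search by exact match or range. Vector databases search by similarity: given a query vector, return the K vectors closest to it in the high-dimensional space, measured by a distance metric such as cosine or Euclidean distance. The search is approximate because exact nearest-neighbour search compares the query against every stored vector, which is too slow and costly at billion scale. Indexes like HNSW and IVF, often combined with compression such as product quantization, trade a small loss in recall for much faster queries.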
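For intuition, here is a minimal sketch of the exact search that an ANN index approximates. It scores every stored vector against the query with cosine similarity and keeps the best K. The function names and the toy 3-dimensional vectors are hypothetical; real embeddings have hundreds or thousands of dimensions.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k best.

    Cost grows linearly with the number of stored vectors, which is the
    work an ANN index is designed to avoid.
    """
    scored = ((cosine_similarity(query, vec), vec_id) for vec_id, vec in vectors.items())
    return heapq.nlargest(k, scored)


# Toy "embeddings" (hypothetical values for illustration only).
vectors = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
    "doc-d": [0.0, 0.0, 1.0],
}

top = exact_top_k([1.0, 0.05, 0.0], vectors, k=2)
print([vec_id for _, vec_id in top])  # ['doc-a', 'doc-b']
```

At a few thousand vectors this brute-force loop is usually fast enough on its own; the index only starts to matter as the corpus grows.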
The vector-DB landscape splits into two camps: dedicated vector databases (Pinecone, Weaviate, Qdrant, Chroma) and extensions to existing databases (most notably pgvector, which adds a vector type and ANN indexes to Postgres). For most use cases up to a few million vectors, pgvector is the right answer: operational simplicity beats marginal performance gains. Beyond that, dedicated services start paying for themselves through specialisation.
Origin
Vector search techniques predate the modern category by decades (KD-trees, locality-sensitive hashing). The vector-DB category as a product crystallised around 2021–2022 with Pinecone's launch and the explosion of LLM applications driving demand for embedding storage.
How it works
- Insert: embedding vector + metadata stored together, indexed for fast search.
- Index: HNSW (Hierarchical Navigable Small World) builds a layered proximity graph; IVF (inverted file) clusters vectors into partitions and searches only the partitions nearest the query. Both give fast approximate nearest-neighbour lookup.
- Query: embed the query, retrieve top-K closest vectors plus their metadata.
- Filter: combine vector search with metadata filters ("vectors closest to X, where category = Y").
- Update: incremental insert/delete/update without rebuilding the entire index.
- Scale: sharding and replication across nodes for billion-vector workloads.
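The insert, query, filter, and update steps above can be sketched with a toy in-memory store. This is an illustration under stated assumptions, not how any particular product works: `ToyVectorStore` and its methods are hypothetical, search is exact rather than index-backed, and the metadata filter is applied before ranking.

```python
import math


class ToyVectorStore:
    """In-memory stand-in for a vector database (hypothetical, exact search only)."""

    def __init__(self):
        self._records = {}  # record id -> (vector, metadata)

    def upsert(self, record_id, vector, metadata):
        # Insert or update: the vector and its metadata are stored together.
        self._records[record_id] = (list(vector), dict(metadata))

    def delete(self, record_id):
        # Incremental delete; there is no index to rebuild in this toy.
        self._records.pop(record_id, None)

    def query(self, vector, k, where=None):
        """Return up to k nearest records by Euclidean distance.

        `where` is an optional dict of metadata equality constraints,
        applied before ranking (pre-filtering).
        """
        where = where or {}
        hits = []
        for record_id, (stored, metadata) in self._records.items():
            if all(metadata.get(key) == value for key, value in where.items()):
                hits.append((math.dist(vector, stored), record_id, metadata))
        hits.sort(key=lambda hit: hit[:2])  # order by distance, then id
        return hits[:k]


store = ToyVectorStore()
store.upsert("p1", [0.1, 0.9], {"category": "shoes"})
store.upsert("p2", [0.2, 0.8], {"category": "hats"})
store.upsert("p3", [0.9, 0.1], {"category": "shoes"})

nearest = store.query([0.1, 0.85], k=2)
shoes_only = store.query([0.1, 0.85], k=2, where={"category": "shoes"})
store.delete("p1")
after_delete = store.query([0.1, 0.85], k=1)

print([hit[1] for hit in nearest])       # ['p1', 'p2']
print([hit[1] for hit in shoes_only])    # ['p1', 'p3']
print([hit[1] for hit in after_delete])  # ['p2']
```

Production systems differ mainly in what replaces the linear scan inside `query`: an ANN index that has to stay consistent through upserts and deletes, and that has to cooperate with the metadata filter rather than run before or after it naively.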
When to use it
Use when
- Semantic search over large content corpora.
- RAG systems requiring fast retrieval.
- Recommendation systems based on similarity.
- Anomaly detection by vector distance.
Skip when
- Below a few thousand vectors — load into memory directly.
- When pgvector inside an existing Postgres covers the requirement (it often does up to a few million vectors).
Key metrics
- Retrieval recall (vs. exact nearest-neighbour).
- Query latency (P50, P95, P99).
- Index build time.
- Cost per vector stored.
- Throughput (queries per second).
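Recall, latency percentiles, and throughput can all be computed from a benchmark run. The sketch below assumes you have already collected, per query, the IDs returned by the ANN index, the IDs returned by an exact search, and the latency in milliseconds; the function names are hypothetical.

```python
import statistics


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbours that the ANN index returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)


def latency_percentiles(latencies_ms):
    """P50, P95 and P99 from a list of per-query latencies in milliseconds."""
    # quantiles(n=100) returns 99 cut points; index 49 is the 50th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def throughput_qps(num_queries, wall_clock_seconds):
    """Queries per second over a measured benchmark window."""
    return num_queries / wall_clock_seconds


print(recall_at_k(["a", "b", "x", "d"], ["a", "b", "c", "d"]))  # 0.75
print(latency_percentiles([float(ms) for ms in range(1, 101)]))
print(throughput_qps(num_queries=12_000, wall_clock_seconds=60))  # 200.0
```

Index build time and cost per vector are not computed here; they come from timing the build and from the provider's or infrastructure's billing.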
Examples
- The vector database powers semantic search across our 50,000-document knowledge base.
- Without a vector database, RAG falls over at scale.
- We stayed on pgvector through 2M vectors — only switched to Pinecone when we needed multi-region replication.
In practice at Makreate
Makreate AI builds use the right vector database for the workload — sometimes pgvector inside the existing Postgres, sometimes a dedicated service. On a recent client engagement we started with pgvector for 800K embeddings (operational simplicity, single database to operate). When the corpus crossed 4M vectors and query latency started degrading, we migrated to Qdrant — but only when the need was real, not preemptively.
Common mistakes
- Reaching for a dedicated vector DB when pgvector would do.
- Choosing on benchmarks alone. Operational fit matters more.
- Ignoring metadata filtering. Most production queries combine vector similarity with metadata constraints.
- Bad index parameters. HNSW's M and efConstruction (set at build time) and efSearch (set at query time) dramatically affect recall, latency, and memory use.
- Not testing recall on real queries. Vector-DB ANN search is approximate; verify it's accurate enough.
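The last two mistakes are easiest to see with a parameter sweep. The sketch below swaps in a toy IVF index rather than HNSW, because IVF fits in a few lines; its `nprobe` setting (how many partitions to scan per query) plays a role similar to HNSW's efSearch. Everything here is an assumption made for illustration: the data is random, centroids are picked by sampling instead of k-means, and the function names are hypothetical.

```python
import math
import random


def build_ivf(vectors, num_lists, rng):
    """Toy IVF index: sample vectors as centroids, assign each vector to its nearest one."""
    centroids = rng.sample(vectors, num_lists)
    lists = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        nearest = min(range(num_lists), key=lambda c: math.dist(vec, centroids[c]))
        lists[nearest].append(idx)
    return centroids, lists


def ivf_search(query, vectors, index, k, nprobe):
    """Scan only the nprobe partitions whose centroids are closest to the query."""
    centroids, lists = index
    order = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    candidates = [idx for c in order[:nprobe] for idx in lists[c]]
    return sorted(candidates, key=lambda idx: math.dist(query, vectors[idx]))[:k]


def exact_search(query, vectors, k):
    """Ground truth: rank every vector by distance to the query."""
    return sorted(range(len(vectors)), key=lambda idx: math.dist(query, vectors[idx]))[:k]


def mean_recall(queries, vectors, index, k, nprobe):
    """Average recall@k of the toy index against exact search."""
    total = 0.0
    for query in queries:
        truth = set(exact_search(query, vectors, k))
        found = set(ivf_search(query, vectors, index, k, nprobe))
        total += len(truth & found) / k
    return total / len(queries)


rng = random.Random(0)  # fixed seed so the sweep is repeatable
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
queries = [[rng.random() for _ in range(8)] for _ in range(20)]
index = build_ivf(vectors, num_lists=16, rng=rng)

recalls = {n: mean_recall(queries, vectors, index, k=5, nprobe=n) for n in (1, 4, 16)}
for nprobe, recall in recalls.items():
    print(f"nprobe={nprobe:2d}  recall@5={recall:.2f}")
```

Scanning all 16 partitions is equivalent to exact search, so recall reaches 1.0 there; smaller `nprobe` values scan less data and can miss true neighbours. Running the same kind of sweep on real queries against the real index shows where the recall and latency trade-off actually sits.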
Frequently asked
Pinecone, Weaviate, Qdrant, or pgvector?
pgvector if you already run Postgres and have fewer than ~5M vectors. Qdrant or Weaviate for self-hosted dedicated workloads. Pinecone for a fully managed, serverless option. Choose by operational fit, not benchmarks.
How many dimensions can I store?
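Most vector DBs handle 1536–3072 dimensions comfortably, but check per-index limits: pgvector, for example, can index its standard vector type only up to 2,000 dimensions. Storage and search cost grow roughly linearly with dimension; a 1536-dimension float32 vector takes about 6 KB before index overhead. Some embedding models support truncation (Matryoshka embeddings), which trades precision against cost.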
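As a rough worked example of the cost side, the sketch below estimates raw storage for float32 vectors and shows the truncate-then-renormalise step used with Matryoshka-style embeddings. It assumes 4 bytes per component and ignores index and metadata overhead; the function names are hypothetical, and truncation only preserves quality for models trained to support it.

```python
import math

BYTES_PER_FLOAT32 = 4


def raw_storage_bytes(num_vectors, dimensions):
    """Size of the raw float32 vectors alone, excluding index and metadata overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


def truncate_and_renormalise(vector, keep_dims):
    """Keep the leading keep_dims components and rescale to unit length.

    Only meaningful for models trained so that the leading dimensions
    carry the most information (Matryoshka-style).
    """
    head = vector[:keep_dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


print(raw_storage_bytes(1_000_000, 1536))  # 6144000000 bytes, about 6.1 GB
print(raw_storage_bytes(1_000_000, 256))   # 1024000000 bytes, about 1.0 GB

short = truncate_and_renormalise([3.0, 4.0, 12.0], keep_dims=2)
print(short)  # unit-length vector built from the first two components
```

Renormalising matters when the index uses dot product or assumes unit-length vectors, since a truncated vector is no longer unit length.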
How do I migrate vector data?
Export with metadata, re-ingest at the destination. The vectors themselves don't need re-computation if the embedding model is the same — only the index needs rebuilding.
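A minimal sketch of that export and re-ingest flow, assuming a JSON Lines file as the interchange format. The record layout, the function names, and the model name `example-embedding-v1` are all hypothetical; real databases ship their own export and bulk-load tooling. Recording the embedding model alongside each vector lets the import refuse data that would need re-embedding.

```python
import io
import json


def export_records(records, embedding_model, out):
    """Write one JSON object per line: id, vector, metadata, and the model name."""
    for record_id, (vector, metadata) in records.items():
        row = {"id": record_id, "vector": vector, "metadata": metadata, "model": embedding_model}
        out.write(json.dumps(row) + "\n")


def import_records(lines, expected_model):
    """Read the export back, refusing rows produced by a different embedding model."""
    records = {}
    for line in lines:
        row = json.loads(line)
        if row["model"] != expected_model:
            raise ValueError(f"{row['id']}: embedded with {row['model']}, expected {expected_model}")
        records[row["id"]] = (row["vector"], row["metadata"])
    return records


source = {
    "doc-1": ([0.1, 0.2], {"lang": "en"}),
    "doc-2": ([0.3, 0.4], {"lang": "de"}),
}

buffer = io.StringIO()  # stands in for a file on disk
export_records(source, "example-embedding-v1", buffer)
buffer.seek(0)
restored = import_records(buffer, "example-embedding-v1")
print(restored == source)  # True
```

After the import, the destination still has to build its own index over the restored vectors, and recall is worth re-checking there because index defaults differ between systems.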