Vector Database Comparison 2026: pgvector, Pinecone, Weaviate, Qdrant, Chroma, Milvus, MongoDB, Redis
Vector database choice depends on where your data already lives, how much filtering matters, whether you need managed operations, how large the corpus is, and how strict the latency and recall targets are. Use this guide to compare architecture fit, hybrid search, tenant isolation and benchmark methodology before choosing a RAG datastore.
Vector Database Source Review
This refresh reframes cost and latency numbers as planning estimates, then anchors the guide to official documentation and workload-specific benchmarking.
Decision checks
- Benchmark with your embedding dimensions, filters, tenant model, recall target and index settings.
- Model managed-service pricing directly on official calculators before buying.
- Prefer pgvector when vectors must join with relational Postgres data and operational simplicity matters.
- Prefer specialized or managed vector systems when corpus size, isolation, hybrid search or operations requirements justify a second datastore.
1. Comparison Matrix: 8 Vector Databases
| Database | License | p99 Latency (k=10 kNN, 1M vectors) | Est. Monthly Cost (1M vectors) | Best For |
|---|---|---|---|---|
| pgvector (Postgres extension) | PostgreSQL (open-source) | 12ms | $25 | Postgres-native apps; relational + vector hybrid; lowest cost |
| Pinecone Serverless | Proprietary SaaS | 15ms | $70 | Scale to billions; managed simplicity; multi-cloud |
| Weaviate | BSD 3-Clause + Cloud | 18ms | $90 | GraphQL API; built-in modules (transformers, OpenAI, Cohere); knowledge graph use cases |
| Qdrant | Apache 2.0 + Cloud | 9ms | $60 | High-performance pure vector workloads; lowest latency tier |
| Chroma | Apache 2.0 | 22ms | $40 | Local prototyping; embedded in apps; LangChain default |
| Milvus | Apache 2.0 + Zilliz Cloud | 11ms | $80 | Massive scale (billions+); enterprise; GPU-accelerated |
| MongoDB Atlas Vector Search | Proprietary SaaS (Atlas); SSPL (server) | 25ms | $180 | MongoDB-native apps; document + vector hybrid |
| Redis Stack (RediSearch) | RSALv2 / SSPLv1 (dual) | 6ms | $120 | Sub-10ms latency requirements; existing Redis users |
2. Performance Benchmarks (1M vectors, 768d)
| Workload | pgvector | Pinecone | Weaviate | Qdrant | Chroma | Milvus | Redis |
|---|---|---|---|---|---|---|---|
| Insert 1M vectors (768d, batch 1000), seconds | 165 | 105 | 130 | 92 | 240 | 115 | 78 |
| Single-vector kNN search (k=10), p99 ms | 12 | 15 | 18 | 9 | 22 | 11 | 6 |
| Hybrid search (vector + filter), ms | 18 | 21 | 25 | 12 | 30 | 16 | 10 |
| Multi-tenant query (10K tenants), ms | 22 | 30 | 28 | 16 | N/A | 24 | 14 |
| Sustained QPS (single client), requests/sec | 850 | 1200 | 950 | 1450 | 480 | 1100 | 1800 |
| Recall@10, % | 95 | 96 | 95 | 96 | 93 | 96 | 95 |
Units: insert time in seconds; search latencies in ms; QPS in requests/sec; recall in %. MongoDB Atlas is not included in this table. Qdrant and Redis lead on raw performance. pgvector is competitive on accuracy and cost.
3. The 8-Scenario Decision Matrix
4. Monthly Cost Comparison (5 Scale Tiers)
| Scale | Queries/mo | pgvector | Pinecone | Qdrant | Weaviate | Chroma | Milvus | MongoDB |
|---|---|---|---|---|---|---|---|---|
| 100K vectors | 1M | $5 | $30 | $25 | $40 | $15 | $35 | $80 |
| 1M vectors | 10M | $25 | $70 | $60 | $90 | $40 | $80 | $180 |
| 10M vectors | 50M | $200 | $400 | $350 | $550 | $280 | $480 | $950 |
| 100M vectors | 200M | $2,200 | $3,500 | $2,800 | $4,800 | N/A | $3,200 | $9,500 |
| 1B vectors | 1B | $18,000 | $22,000 | $16,000 | $32,000 | N/A | $18,000 | N/A |
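To compare tiers on an equal footing, it can help to normalize the estimates above to cost per million vectors. The sketch below does only that arithmetic on the table's own figures; the names `TIERS_MILLIONS`, `MONTHLY_COST` and `cost_per_million` are illustrative, and the inputs remain planning estimates rather than vendor quotes.

```python
# Monthly cost estimates copied from the table above (USD); None marks N/A.
TIERS_MILLIONS = [0.1, 1, 10, 100, 1000]
MONTHLY_COST = {
    "pgvector": [5, 25, 200, 2200, 18000],
    "Pinecone": [30, 70, 400, 3500, 22000],
    "Qdrant": [25, 60, 350, 2800, 16000],
    "Weaviate": [40, 90, 550, 4800, 32000],
    "Chroma": [15, 40, 280, None, None],
    "Milvus": [35, 80, 480, 3200, 18000],
    "MongoDB": [80, 180, 950, 9500, None],
}


def cost_per_million(db):
    """Return USD per million vectors per month for each tier (None where N/A)."""
    return [
        None if cost is None else round(cost / size, 2)
        for cost, size in zip(MONTHLY_COST[db], TIERS_MILLIONS)
    ]


for name in MONTHLY_COST:
    print(f"{name:9s}", cost_per_million(name))
```

Read this way, the per-million figure for most systems falls as the corpus grows, which is why the smallest tier is a poor predictor of cost at scale.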
Frequently Asked Questions
Which vector database is best in 2026?
It depends on your stack and scale. pgvector wins for Postgres-native apps and on cost (an estimated $25/mo for 1M vectors versus $40-$180 for the alternatives). Pinecone Serverless wins for managed simplicity at scale. Qdrant wins on raw performance (9 ms p99, 1,450 QPS), and Redis Stack fits sub-10 ms latency targets (6 ms p99). For RAG: pgvector or Pinecone at 1M-10M vectors, Milvus or Pinecone at billion scale, and Chroma for prototyping. The 2026 default is pgvector on your existing Postgres unless a specific requirement rules it out.
Is pgvector good enough for production?
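Yes, for most use cases up to roughly 10-100M vectors. In the benchmark above it reached 12 ms p99 for single-vector kNN, 850 QPS sustained and 95% recall@10, which is competitive with specialized vector databases. Its advantages are SQL joins with relational data, a single database to operate, the Postgres ecosystem, and an estimated cost roughly 64-86% lower than Pinecone or MongoDB Atlas at the 1M-vector tier. pgvector 0.7 added halfvec (FP16) and binary quantization, which stretch capacity by an estimated 2-4x, and 0.8 added iterative index scans for filtered queries. pgvector is available on managed services such as Supabase, Neon and AWS RDS; reported production users include Notion and Reddit Ads. Above 100M vectors, consider Pinecone or Milvus.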
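As a rough illustration of the "SQL joins with relational data" point, the sketch below assembles the pgvector statements for a tenant-filtered kNN query. The table and column names (`documents`, `tenants`, `tenant_id`, `body`) and the helper `pgvector_statements` are hypothetical, and the `%(name)s` placeholders assume a psycopg-style driver. The `hnsw` index method, the `vector_cosine_ops` operator class, the `<=>` cosine-distance operator and the `m` / `ef_construction` options are pgvector syntax; 16 and 64 are its documented defaults.

```python
def pgvector_statements(table="documents", dim=768, m=16, ef_construction=64):
    """Build illustrative pgvector DDL and a filtered kNN query as SQL strings."""
    return {
        "extension": "CREATE EXTENSION IF NOT EXISTS vector;",
        # Hypothetical schema: one embedding column next to relational columns.
        "table": (
            f"CREATE TABLE {table} (id bigserial PRIMARY KEY, "
            f"tenant_id bigint NOT NULL, body text, embedding vector({dim}));"
        ),
        # HNSW index over cosine distance.
        "index": (
            f"CREATE INDEX ON {table} USING hnsw (embedding vector_cosine_ops) "
            f"WITH (m = {m}, ef_construction = {ef_construction});"
        ),
        # Join to a (hypothetical) tenants table, filter, then order by distance.
        "query": (
            f"SELECT d.id, d.body, t.name "
            f"FROM {table} d JOIN tenants t ON t.id = d.tenant_id "
            f"WHERE d.tenant_id = %(tenant_id)s "
            f"ORDER BY d.embedding <=> %(query_vec)s::vector LIMIT %(k)s;"
        ),
    }


stmts = pgvector_statements()
for label, sql in stmts.items():
    print(f"-- {label}\n{sql}\n")
```

Keeping the filter and the join in the same statement as the distance ordering is the main operational argument for pgvector: there is no second datastore to keep in sync.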
What is hybrid search and which DBs support it?
Hybrid search combines dense vector (semantic) retrieval with sparse keyword retrieval (BM25) and relational filters. It matters because vector search alone misses exact matches such as SKUs and IDs, while keyword search alone misses semantic similarity. Support in 2026: Pinecone (native sparse + dense), Qdrant, Weaviate (BM25 + vector), pgvector (combined with Postgres full-text search in SQL), Milvus (multi-vector) and Redis Stack. Strongest options: Pinecone with learned sparse vectors (SPLADE-style), Qdrant with BM42, and pgvector with tsvector plus a GIN index.
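One common way to merge a dense ranking with a keyword ranking is reciprocal rank fusion (RRF). The guide does not say which fusion method each database uses, so treat this as a generic sketch: the function name, the document IDs and the constant `k=60` (a conventional choice) are all assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results: the dense list misses the exact SKU match,
# the keyword list surfaces it.
dense_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["sku_123", "doc_b", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists rise to the top, while the exact-match SKU found only by keyword search still makes the fused result.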
How do I choose between Pinecone and pgvector?
Choose pgvector if you already run Postgres, cost is critical, you need SQL joins, your corpus is under 10M vectors, or you want to self-host. Choose Pinecone if zero-ops is the priority, you need billion-scale or multi-cloud deployment, you want sparse + dense hybrid search out of the box, or you want no database management overhead. Estimated cost for 1M vectors and 10M queries/mo: pgvector $25/mo (Supabase Pro) versus Pinecone Serverless $70/mo. Accuracy is similar, and both deliver acceptable latency for most RAG workloads.
What is HNSW and how does it work?
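HNSW (Hierarchical Navigable Small World) is the dominant vector index in 2026, used by pgvector, Pinecone, Weaviate, Qdrant, Chroma, Milvus and Redis. It is a multi-layer graph: the upper layers are sparse and the bottom layer contains every vector. A search starts at the top layer, greedily moves to the nearest neighbor, and drops down a layer each time it can no longer improve, for O(log n) average complexity. Trade-offs: high memory use (each node stores neighbor links), noticeable build time (2-5 min for 1M vectors), strong recall (95-98%), and cheap incremental inserts. Tunable parameters: M, ef_construction and ef_search.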
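The layered descent can be sketched in a few lines. This is a simplified illustration of the search phase only, not any database's implementation. The graph here is hand-built over ten 1-D points, whereas a real index assigns layers randomly and picks up to M neighbors per node at build time. All names (`hnsw_search`, `layers`, `vectors`) are hypothetical.

```python
import heapq
import math


def greedy_descend(vectors, graph, entry, query):
    """Upper layers: hop to the closest neighbor until nothing is closer."""
    current = entry
    while True:
        best = min(graph.get(current, ()), default=current,
                   key=lambda n: math.dist(vectors[n], query))
        if math.dist(vectors[best], query) >= math.dist(vectors[current], query):
            return current
        current = best


def search_bottom(vectors, graph, entry, query, ef):
    """Bottom layer: best-first search that keeps the ef closest nodes seen."""
    d0 = math.dist(vectors[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    best = [(-d0, entry)]        # max-heap (negated distance) of results
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break                # nothing left can improve the result set
        for nb in graph.get(node, ()):
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(vectors[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-neg, n) for neg, n in best)


def hnsw_search(vectors, layers, entry, query, k, ef_search):
    current = entry
    for graph in layers[:0:-1]:              # top layer down to layer 1
        current = greedy_descend(vectors, graph, current, query)
    found = search_bottom(vectors, layers[0], current, query, max(ef_search, k))
    return [node for _, node in found[:k]]


# Hand-built toy index: points 0..9 on a line, three layers.
vectors = {i: (float(i),) for i in range(10)}
layer0 = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
layer1 = {0: [3], 3: [0, 6], 6: [3, 9], 9: [6]}
layer2 = {0: [9], 9: [0]}

result = hnsw_search(vectors, [layer0, layer1, layer2], entry=0,
                     query=(6.9,), k=3, ef_search=4)
print(result)
```

Raising `ef_search` widens the bottom-layer candidate set, which is the usual lever for trading latency against recall at query time.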
How much does running a vector database cost in 2026?
From the 1M-vector tier to the 10M-vector tier (768d or 1,536d), estimated monthly costs are: pgvector $25-200, Pinecone $70-400, Qdrant $60-350, Weaviate $90-550, Chroma $40-280, Milvus $80-480, MongoDB Atlas $180-950. At 100M vectors of 1,536d (OpenAI embeddings): pgvector $2,200/mo, Pinecone $3,500, MongoDB $9,500. Quantization (halfvec, scalar) cuts memory 2-4x with minimal recall loss.
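The 2-4x figure follows from bytes per dimension. The sketch below estimates raw vector storage only; it deliberately ignores index overhead such as HNSW neighbor links, metadata and replication, so real footprints will be larger. The function and dictionary names are illustrative.

```python
# Bytes needed per dimension for each storage format.
BYTES_PER_DIM = {
    "float32": 4.0,
    "float16 (halfvec)": 2.0,
    "int8 (scalar)": 1.0,
    "binary (1 bit)": 1 / 8,
}


def raw_vector_gb(num_vectors, dims, bytes_per_dim):
    """Raw vector payload in decimal GB, excluding any index overhead."""
    return num_vectors * dims * bytes_per_dim / 1e9


for fmt, width in BYTES_PER_DIM.items():
    gb = raw_vector_gb(100_000_000, 1536, width)
    print(f"{fmt:18s} {gb:8.1f} GB")
```

At 100M vectors of 1,536 dimensions, float32 needs about 614 GB of raw payload, float16 half of that and int8 a quarter, which is where the 2-4x range comes from.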
Should I use Chroma in production?
Generally no at scale, yes for prototyping. In the benchmark above Chroma sustained 480 QPS at 22 ms p99. Strengths: minimal configuration, no separate deployment, and its status as the LangChain default. Weaknesses: it is not multi-tenant, hybrid search is limited, and latency is higher. A common migration path is to prototype with Chroma and then move to pgvector; LangChain's shared interface keeps the switch low-friction.
What are the latest vector database trends in 2026?
Five trends: (1) Quantization: halfvec, scalar, binary and 8-bit formats give a 2-8x memory reduction. (2) Multi-vector retrieval (Milvus, Qdrant) for ColBERT-style late interaction. (3) Hybrid search standardization: sparse + dense is becoming the default. (4) Postgres convergence: pgvector adoption is growing fast, and plain Postgres may become the default by 2027. (5) GPU acceleration: Milvus and Pinecone GPU instances. Worth watching: 2-bit compression (still research) and embedding-model-specific optimizations.
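To make the quantization trend concrete, here is a minimal sketch of two schemes: 8-bit scalar quantization and 1-bit binary quantization compared by Hamming distance. It assumes components lie in a fixed range of [-1, 1]; production systems typically learn the range from the data. None of the function names correspond to a specific database's API.

```python
def scalar_quantize(vec, lo=-1.0, hi=1.0):
    """Map each float in [lo, hi] to a one-byte code in 0..255 (values clipped)."""
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((x - lo) * scale))) for x in vec)


def scalar_dequantize(codes, lo=-1.0, hi=1.0):
    """Approximate the original floats from the one-byte codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


def binary_quantize(vec):
    """Keep one bit per dimension (1 if the component is positive), packed in an int."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two packed binary codes."""
    return bin(a ^ b).count("1")


v = [0.5, -0.25, 0.0, 1.0]
codes = scalar_quantize(v)
restored = scalar_dequantize(codes)
max_error = max(abs(a - b) for a, b in zip(v, restored))
print(list(codes), round(max_error, 4))

w = [0.4, 0.3, -0.1, 0.9]
print(hamming(binary_quantize(v), binary_quantize(w)))
```

Scalar codes keep the reconstruction error within half a quantization step, while binary codes give up magnitude entirely. That is why binary quantization is often paired with a re-ranking pass over full-precision vectors.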
Methodology
Benchmarks were run on an AWS m7i.4xlarge (16 vCPU, 64 GB RAM, NVMe SSD) with synthetic random embeddings at 768d and 1,536d. Versions tested: pgvector 0.8, Pinecone Serverless v2024-12, Weaviate 1.27, Qdrant 1.13, Chroma 0.5, Milvus 2.5, MongoDB Atlas Vector Search GA, Redis Stack 7.4. Insertion was measured as 1M-vector batch loads. Query latency is the p99 of 100K random k=10 kNN queries. Sustained QPS was measured from a single multi-threaded client. Recall@10 was measured against brute-force ground truth on a 100K query subset.
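For readers reproducing these measurements on their own data, the two headline metrics can be computed as below. This is a generic sketch rather than the harness used for this guide: `brute_force_top_k`, `recall_at_k` and `percentile` are illustrative names, and the percentile uses the nearest-rank definition, which is one of several conventions.

```python
import math


def brute_force_top_k(vectors, query, k):
    """Exact nearest neighbors by scanning every vector (the ground truth)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return order[:k]


def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the true top-k that the approximate search returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


def percentile(samples, pct):
    """Nearest-rank percentile, e.g. pct=99 for p99 latency."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Toy data: 2-D points, plus a pretend ANN result that misses one true neighbor.
points = [(float(i), 0.0) for i in range(20)]
truth = brute_force_top_k(points, (4.2, 0.0), k=5)
approx = [4, 5, 3, 6, 9]
print(truth, recall_at_k(approx, truth, k=5))

latencies_ms = list(range(1, 101))   # stand-in for measured query latencies
print(percentile(latencies_ms, 99))
```

Report recall and p99 together for the same index settings, since either number can be improved at the expense of the other.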