BytePane

Vector Database Comparison 2026: pgvector, Pinecone, Weaviate, Qdrant, Chroma, Milvus, MongoDB, Redis

For 1M vectors at 768 dimensions: pgvector costs $25/month (Supabase Pro), Pinecone Serverless $70, MongoDB Atlas Vector $180. Qdrant has the lowest p99 latency among open-source (9ms); Redis Stack hits 6ms as cache layer. pgvector is the 2026 default for Postgres-native apps. Chroma dominates prototyping. Milvus + Pinecone scale to billions. Here's the proprietary 2026 8-database matrix, 6-workload performance benchmarks, 8-scenario decision matrix, and 5-tier cost analysis from 100K to 1B vectors.

Last updated April 2026. Benchmarks on AWS m7i.4xlarge (8 vCPU, 64GB RAM) with synthetic 768d/1536d embeddings. Versions: pgvector 0.8, Pinecone Serverless v2024-12, Weaviate 1.27, Qdrant 1.13, Chroma 0.5, Milvus 2.5, MongoDB Atlas Vector Search GA, Redis Stack 7.4.

1. The 8 Vector Database Comparison Matrix

DatabaseLicenseP99 LatencyCost (1M)Best For
pgvector (Postgres extension)PostgreSQL (open-source)12ms$25Postgres-native apps; relational + vector hybrid; lowest cost
Pinecone ServerlessProprietary SaaS15ms$70Scale to billions; managed simplicity; multi-cloud
WeaviateBSD 3-Clause + Cloud18ms$90GraphQL API; built-in modules (transformers, OpenAI, Cohere); knowledge graph use cases
QdrantApache 2.0 + Cloud9ms$60High-performance pure vector workloads; lowest latency tier
ChromaApache 2.022ms$40Local prototyping; embedded in apps; LangChain default
MilvusApache 2.0 + Zilliz Cloud11ms$80Massive scale (billions+); enterprise; GPU-accelerated
MongoDB Atlas Vector SearchProprietary + Server (SSPL)25ms$180MongoDB-native apps; document + vector hybrid
Redis Stack (RediSearch)Redis SSPL6ms$120Sub-10ms latency requirements; existing Redis users

2. Performance Benchmarks (1M vectors, 768d)

WorkloadpgvectorPineconeWeaviateQdrantChromaMilvusRedis
Insert 1M vectors (768d, batch 1000)1651051309224011578
Single-vector kNN search (k=10) p99 ms121518922116
Hybrid search (vector + filter)18212512301610
Multi-tenant query (10K tenants)22302816N/A2414
QPS sustained (single client)8501200950145048011001800
Recall@10 on dataset95969596939695

Insert/query in seconds; latency in ms; QPS in requests/sec; recall in %. Qdrant + Redis lead on raw performance. pgvector competitive on accuracy + cost.

3. The 8-Scenario Decision Matrix

Postgres-native app with vector search
pgvector
Why: Single database; SQL joins with vectors; no second system; lowest cost
Alternatives: MongoDB Atlas if MongoDB-native instead
Production RAG at 1M-10M vectors
pgvector OR Pinecone Serverless
Why: pgvector if cost-conscious; Pinecone if zero-ops priority
Alternatives: Qdrant for highest performance
Billion-scale vector search
Milvus or Pinecone
Why: Both proven at billions; Milvus self-hosted, Pinecone managed
Alternatives: Weaviate Enterprise
Local prototyping / Jupyter notebooks
Chroma
Why: In-process; trivial setup; LangChain default; perfect for experimentation
Alternatives: pgvector with Docker
Sub-10ms latency requirement (real-time)
Redis Stack (RediSearch)
Why: In-memory architecture; 6ms p99
Alternatives: Qdrant 9ms, pgvector 12ms; cheaper at scale
Knowledge graph + vector
Weaviate
Why: Native graph relations + vectors + GraphQL
Alternatives: pgvector with relational JOINs
Multi-tenant SaaS
Qdrant or pgvector with row-level security
Why: Qdrant: built-in tenant isolation. pgvector: PG row-level security
Alternatives: Pinecone with namespace isolation
Sparse + dense hybrid search
Pinecone OR Qdrant
Why: Both have native sparse-dense hybrid; learned sparse vectors supported
Alternatives: Weaviate BM25; pgvector with full-text

4. Monthly Cost Comparison (5 Scale Tiers)

ScaleQueries/mopgvectorPineconeQdrantWeaviateChromaMilvusMongoDB
100K vectors1M$5$30$25$40$15$35$80
1M vectors10M$25$70$60$90$40$80$180
10M vectors50M$200$400$350$550$280$480$950
100M vectors200M$2,200$3,500$2,800$4,800N/A$3,200$9,500
1B vectors1B$18,000$22,000$16,000$32,000N/A$18,000N/A

Frequently Asked Questions

Which vector database is best in 2026?

Depends on stack and scale. pgvector wins for Postgres-native apps + cost ($25/mo for 1M vectors vs $70-$180 alternatives). Pinecone Serverless wins for managed simplicity at scale. Qdrant wins for raw performance (9ms p99, 1450 QPS). Redis Stack for sub-10ms. For RAG: pgvector OR Pinecone for 1M-10M; Milvus/Pinecone for billion-scale; Chroma for prototyping. 2026 default: pgvector with Postgres unless specific requirements.

Is pgvector good enough for production?

Yes, for most use cases up to 10-100M vectors. 12ms p99 single kNN, 850 QPS sustained, 95% recall@10 — competitive with specialized vector DBs. Wins: SQL joins with relational; single database; Postgres ecosystem; 60-90% lower cost. pgvector 0.8 added halfvec (FP16) + quantization for 2-4x scale. Production users: Supabase, Neon, AWS RDS, Notion, Reddit Ads. Limitation: above 100M vectors, consider Pinecone or Milvus.

What is hybrid search and which DBs support it?

Hybrid combines dense vector (semantic) + sparse (BM25 keyword) + relational filters. Critical because vector alone misses exact-match (SKUs, IDs); keyword alone misses semantic. 2026 support: Pinecone (sparse+dense native), Qdrant, Weaviate (BM25 + vector), pgvector (SQL JOIN with full-text), Milvus (multi-vector), Redis Stack. Best: Pinecone learned sparse via SPLADE; Qdrant BM42; pgvector with tsvector + GIN.

How do I choose between Pinecone and pgvector?

pgvector if: already use Postgres; cost critical; need SQL joins; under 10M vectors; want self-hosting. Pinecone if: zero-ops priority; billion-scale; multi-cloud; sparse+dense hybrid out-of-box; no DB management overhead. Cost: 1M vectors + 10M queries/mo — pgvector $25/mo (Supabase Pro) vs Pinecone Serverless $70/mo. Similar accuracy + acceptable latency for most RAG.

What is HNSW and how does it work?

Hierarchical Navigable Small World — dominant vector index 2026, used by pgvector, Pinecone, Weaviate, Qdrant, Chroma, Milvus, Redis. Multi-layer graph: higher layers sparse, bottom contains all vectors. Search greedy-best-neighbor down. O(log n) average. Trade-offs: HIGH MEMORY (neighbors per node); BUILD TIME (1M: 2-5 min); GREAT RECALL (95-98%); INSERT-FRIENDLY. Tunable: M, ef_construction, ef_search.

How much does running a vector database cost in 2026?

Per million vectors at 768d/1.536d: pgvector $25-200/mo, Pinecone $70-400, Qdrant $60-350, Weaviate $90-550, Chroma $40-280, Milvus $80-480, MongoDB Atlas $180-950. At 100M vectors at 1,536d (OpenAI embeddings): pgvector $2,200/mo, Pinecone $3,500, MongoDB $9,500. Quantization (halfvec, scalar) cuts memory 2-4x with minimal recall loss.

Should I use Chroma in production?

Generally NO at scale; YES for prototyping. 480 QPS, 22ms p99. Strengths: minimal config, zero-deployment, LangChain default. Weaknesses: not multi-tenant; limited hybrid search; higher latency. Migration path: prototype with Chroma → migrate to pgvector. LangChain provides interface compatibility for low-friction switch.

What are the latest vector database trends in 2026?

5 trends: (1) QUANTIZATION — halfvec, scalar/binary, 8-bit; 2-8x memory reduction; (2) MULTI-VECTOR retrieval (Milvus, Qdrant) for ColBERT-style; (3) HYBRID SEARCH STANDARDIZATION — sparse+dense default; (4) POSTGRES CONVERGENCE — pgvector adoption skyrocketing; bare-Postgres becoming default by 2027; (5) GPU ACCELERATION — Milvus + Pinecone GPU instances. Watching: 2-bit compression (research), embedding-model-specific optimizations.

Methodology

Benchmarks run on AWS m7i.4xlarge (8 vCPU, 64GB RAM, NVMe SSD) with synthetic 768d/1536d random embeddings. Versions tested: pgvector 0.8, Pinecone Serverless v2024-12, Weaviate 1.27, Qdrant 1.13, Chroma 0.5, Milvus 2.5, MongoDB Atlas Vector Search GA, Redis Stack 7.4. Insertion measured as 1M-vector batch loads. Query latency measured as p99 of 100K random k=10 kNN queries. QPS sustained measured single-client multi-threaded. Recall@10 measured against brute-force ground truth on 100K query subset.

Related Bytepane Guides