RAG sizing & cost

AI toolkit

Plan a retrieval system end to end. Inputs are document count, average length, chunk size and overlap, and embedding model. Outputs are chunks, vector DB size at float32, one-time embedding cost, and a monthly cost estimate across three popular hosts.

Corpus

Number of documents

Avg length (tokens)

Chunking

Chunk size (tokens)

Overlap (tokens)

Effective stride: 448 tokens per chunk → 3 chunks per document.

Embedding model

Sizing

Total chunks: 15,000
Total tokens to embed: 7.68M
Vector DB size (float32): 91.6 MB
One-time embedding cost: $0.1536

Estimated monthly storage cost

Pinecone Serverless: $0.0295 /mo
pgvector (self-hosted): $6.00 /mo
Chroma (self-hosted): $6.00 /mo

Pinecone uses ~$0.33 / GB-month storage; reads/writes are billed separately and not included here. Self-hosted estimates assume a small VPS as a floor.