TL;DR
The main vector store options for RAG on AWS include OpenSearch, S3 Vectors, Aurora PostgreSQL with pgvector, Pinecone, and Redis Enterprise Cloud. They differ in latency, cost, and operational complexity.
Bedrock Knowledge Bases now supports six vector store backends directly, and AWS offers additional services with vector search for custom RAG pipelines. This article separates what works inside Bedrock KB from what requires a custom approach and provides a practical decision framework.
Introduction
In early 2024, we explored the RAG options available on AWS. Back then, the vector store choices for Bedrock Knowledge Bases were limited, and OpenSearch Serverless was the default for most teams.
A lot has changed since then. Bedrock KB now supports six backends, and several other AWS services offer vector search for custom pipelines. This article covers all the options and how to choose.
Vector store options – AWS Managed vs. Direct
With Bedrock Knowledge Bases, AWS manages ingestion, chunking, embedding, and retrieval. The supported backends are:
- Amazon OpenSearch Serverless
- Amazon OpenSearch Managed Cluster (new)
- Amazon Aurora PostgreSQL Serverless with pgvector
- Amazon S3 Vectors (new)
- Pinecone
- Redis Enterprise Cloud
Going direct means you own the full RAG pipeline, which opens up additional AWS services (MemoryDB, Neptune Analytics, DocumentDB, Kendra). More flexibility, more operational overhead.
Bedrock Knowledge Bases vector store options
These vector stores appear in the Bedrock KB console dropdown. Bedrock manages the full pipeline for you.
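As a rough illustration of what "Bedrock manages the full pipeline" means for application code, the sketch below queries a knowledge base through the boto3 `bedrock-agent-runtime` client's `retrieve` call. The knowledge base ID, the question text, and the helper names `build_retrieve_request` and `retrieve_chunks` are placeholders invented for this example, and it assumes AWS credentials and a region are already configured. The request has the same shape whichever vector store sits behind the knowledge base.

```python
from typing import Any


def build_retrieve_request(knowledge_base_id: str, question: str, top_k: int = 5) -> dict[str, Any]:
    """Build keyword arguments for the bedrock-agent-runtime Retrieve call."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": question},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }


def retrieve_chunks(knowledge_base_id: str, question: str, top_k: int = 5) -> list[str]:
    """Query a knowledge base and return the text of each retrieved chunk."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(**build_retrieve_request(knowledge_base_id, question, top_k))
    return [result["content"]["text"] for result in response["retrievalResults"]]
```

Calling `retrieve_chunks("YOUR_KB_ID", "What is our refund policy?")` would return the top matching chunks as plain strings, ready to be placed into a prompt.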
1. Amazon OpenSearch Service
The most mature Bedrock KB integration. Now supports both Serverless and Managed Clusters.
- Sub-10ms query latency, GPU acceleration available
- Scales to billions of vectors
- Hybrid search: keyword + vector similarity in one query
- Serverless: ~$700/month minimum (OCU pricing)
- Managed Cluster (new): full control over instance types and cost
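To make "keyword + vector similarity in one query" concrete, here is a minimal sketch of an OpenSearch hybrid query body for a custom pipeline that talks to OpenSearch directly. The field names `body` and `embedding` and the helper `build_hybrid_query` are assumptions for illustration, not part of any Bedrock-managed schema. It also assumes the search runs through a search pipeline with a score normalization processor, which OpenSearch needs in order to combine the two result sets.

```python
from typing import Any, Sequence


def build_hybrid_query(
    text: str,
    query_vector: Sequence[float],
    k: int = 5,
    text_field: str = "body",         # hypothetical text field name
    vector_field: str = "embedding",  # hypothetical knn_vector field name
) -> dict[str, Any]:
    """Build a search body that combines a keyword match with a k-NN clause."""
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical leg: standard full-text match on the text field.
                    {"match": {text_field: {"query": text}}},
                    # Semantic leg: approximate nearest neighbours on the vector field.
                    {"knn": {vector_field: {"vector": list(query_vector), "k": k}}},
                ]
            }
        },
    }
```

The resulting dictionary can be passed as the search body to an OpenSearch client. The embedding for `query_vector` must come from the same model used at ingestion time.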
2. Amazon S3 Vectors
The biggest addition since 2024. GA since December 2025. Available as a “Quick create” option.
- Up to 2 billion vectors per index, 10,000 indexes per bucket
- Sub-100ms query latency
- Up to 90% cost reduction vs dedicated vector databases
- Serverless, 11 9s durability, no idle minimums
3. Amazon Aurora PostgreSQL Serverless with pgvector
Unchanged since 2024. Best if you already run Aurora and want to consolidate.
- ACID transactions and strong consistency
- Mixed SQL and vector queries in one database
- Millions of vectors, 10-100ms latency
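A mixed SQL and vector query might look like the sketch below, which filters on an ordinary column and ranks by pgvector's cosine distance operator (`<=>`) in one statement. The table `doc_chunks`, its columns, and the helper names are hypothetical. The named placeholders follow the psycopg parameter style, so the statement and parameters could be passed to a cursor's `execute` call.

```python
from typing import Any, Sequence

# Hypothetical schema: doc_chunks(id, tenant_id, chunk_text, embedding vector(N)).
MIXED_QUERY = """
SELECT id,
       chunk_text,
       embedding <=> %(query_vec)s::vector AS cosine_distance
FROM doc_chunks
WHERE tenant_id = %(tenant_id)s                 -- ordinary relational filter
ORDER BY embedding <=> %(query_vec)s::vector    -- nearest vectors first
LIMIT %(top_k)s
"""


def to_pgvector_literal(values: Sequence[float]) -> str:
    """Format an embedding as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def build_params(embedding: Sequence[float], tenant_id: str, top_k: int = 5) -> dict[str, Any]:
    """Bundle the named parameters expected by MIXED_QUERY."""
    return {
        "query_vec": to_pgvector_literal(embedding),
        "tenant_id": tenant_id,
        "top_k": top_k,
    }
```

Keeping the tenant filter and the similarity ranking in a single statement is the consolidation benefit described above: there is no second system to query and join against.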
4. Pinecone
The only purpose-built vector database in the Bedrock KB dropdown. Requires a separate Pinecone account.
- Purpose-built for vector similarity search
- Metadata filtering, namespaces, sparse-dense hybrid search
- Not AWS-native — adds vendor dependency and separate billing
5. Redis Enterprise Cloud
In-memory vector operations via RediSearch. Also requires an external account.
- Very high query performance (in-memory)
- Redis API compatibility
- External vendor dependency, cost scales with dataset size
If you move beyond Bedrock Knowledge Bases, the range of AWS vector search services expands significantly.
AWS vector search services for custom RAG pipelines
These services have vector search but are not in the Bedrock KB dropdown. You build your own pipeline.
1. Amazon MemoryDB
Fastest vector search on AWS. Redis-compatible, in-memory.
- Sub-millisecond query latency
- Up to 32,768 dimensions per vector
- Strong consistency, millions of requests per day
- Cost scales linearly with dataset size (RAM-priced)
2. Amazon Neptune Analytics
Purpose-built for GraphRAG. Integrates with Bedrock via a separate graph-based KB creation flow.
- Vector search + graph traversals and algorithms
- Scales to billions of relationships
- Up to 80x faster than traditional graph solutions
3. Amazon DocumentDB
Native vector search in version 5.0+. MongoDB API compatible.
- Up to 2,000 dimensions (indexed) or 16,000 (unindexed)
- Millisecond response times
- Cosine, Euclidean, and dot product metrics
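For readers less familiar with the three metrics, the following plain-Python sketch shows how each is computed for a pair of vectors. It is only meant to illustrate the definitions; the function names are made up for this example, and a real index computes these internally over approximate nearest-neighbour structures.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the normalised vectors; ignores vector length."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / norm
```

Cosine similarity is a common default for text embeddings because it compares direction only. When embeddings are already normalised to unit length, dot product gives the same ranking with less computation.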
4. Amazon Kendra
Managed enterprise search with built-in RAG. Abstracts away the vector layer entirely.
- 40+ native data source connectors
- Built-in NLP and relevance tuning
- Pay per query and storage, minimal setup
Quick reference
Bedrock Knowledge Bases vector store options
Option | Latency | Scale | Cost Profile | Best For |
OpenSearch Serverless | Sub-10ms | Billions | OCU hours (~$700/mo min) | High-throughput, low-latency |
OpenSearch Managed Cluster | Sub-10ms | Billions | Instance hours | Full control, cost tuning |
S3 Vectors | Sub-100ms | 2B per index | Storage + queries (no idle) | Cost-optimized, infrequent access |
Aurora pgvector | 10-100ms | Millions | Instance hours | Hybrid SQL + vector |
Pinecone / Redis EC | Sub-10ms | Billions / Millions | Vendor pricing | Third-party, multi-cloud |
AWS vector search services (Custom RAG pipeline required)
Option | Latency | Scale | Cost Profile | Best For |
MemoryDB | Sub-1ms | Millions | Instance hours (memory) | Ultra-low latency, real-time |
Neptune Analytics | Sub-second | Billions | Capacity units | GraphRAG, knowledge graphs |
DocumentDB | Millisecond | Millions | Instance hours | MongoDB-compatible apps |
Kendra | Sub-second | Managed | Pay per query | Enterprise search, minimal setup |
The right vector store for you – How to choose
1. Decision framework
Start here: do you want Bedrock KB to manage the pipeline? If yes, your options are OpenSearch (Serverless or Managed Cluster), S3 Vectors, Aurora pgvector, Pinecone, or Redis Enterprise Cloud. If no, the full landscape opens up.
Then ask:
- What latency do you need?
Sub-millisecond → MemoryDB. Sub-10ms → OpenSearch. Sub-100ms → S3 Vectors. Sub-second → Kendra or Neptune.
- What are you already running?
Aurora → pgvector. MongoDB → DocumentDB. Redis → MemoryDB. Nothing → OpenSearch or S3 Vectors.
- Query pattern?
High-QPS → OpenSearch. Infrequent → S3 Vectors. Graph traversals → Neptune. Mixed SQL + vector → Aurora pgvector.
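The three questions above can be written down as simple lookups. The sketch below encodes only the mappings stated in this section; the function names, dictionary keys, and the idea of treating the latency answer as a millisecond budget are choices made for this illustration. It is a starting shortlist, not a substitute for benchmarking.

```python
# Mappings copied from the questions above; keys are illustrative labels.
EXISTING_STACK = {
    "aurora": "Aurora pgvector",
    "mongodb": "DocumentDB",
    "redis": "MemoryDB",
    "nothing": "OpenSearch or S3 Vectors",
}

QUERY_PATTERN = {
    "high-qps": "OpenSearch",
    "infrequent": "S3 Vectors",
    "graph": "Neptune Analytics",
    "sql+vector": "Aurora pgvector",
}


def by_latency(budget_ms: float) -> str:
    """Pick the tier whose advertised latency fits within the budget."""
    if budget_ms <= 1:
        return "MemoryDB"
    if budget_ms <= 10:
        return "OpenSearch"
    if budget_ms <= 100:
        return "S3 Vectors"
    return "Kendra or Neptune Analytics"


def suggest(budget_ms: float, existing: str, pattern: str) -> dict[str, str]:
    """Answer each question separately so conflicting answers stay visible."""
    no_match = "no direct match in this framework"
    return {
        "latency": by_latency(budget_ms),
        "existing_stack": EXISTING_STACK.get(existing, no_match),
        "query_pattern": QUERY_PATTERN.get(pattern, no_match),
    }
```

Returning one answer per question, rather than a single winner, is deliberate: when the answers disagree (for example an Aurora shop that needs high QPS), that disagreement is the trade-off the team has to resolve.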
2. Cost comparison
Cost varies significantly across options. The biggest differentiator is whether you pay idle costs when the system is not handling queries. S3 Vectors is the only option with zero idle cost, making it the standout for dev/test and cost-sensitive production. OpenSearch Serverless’s ~$700/month floor is the most common surprise teams hit.
Option | Min Monthly Cost | Pay-per-Query | Idle Cost | Best Cost Profile |
OpenSearch Serverless | ~$700 (2+2 OCUs) | No | Yes | Predictable production workloads |
OpenSearch Managed Cluster | ~$50+ (smallest instance) | No | Yes | Cost-tunable, right-size instances |
S3 Vectors | $0 (pay only for usage) | Yes | No | Dev/test, infrequent access, large scale |
Aurora pgvector | ~$50+ (serverless min) | No | Minimal (scales to zero) | Already running Aurora |
Pinecone | Varies (pod-based) | No | Yes | Vendor pricing |
Redis Enterprise Cloud | Varies (node-based) | No | Yes | Vendor pricing |
MemoryDB | ~$100+ (smallest node) | No | Yes | Ultra-low latency justifies cost |
Neptune Analytics | Capacity unit-based | No | Yes | GraphRAG workloads |
DocumentDB | ~$50+ (smallest instance) | No | Yes | Already running DocumentDB |
Kendra | ~$810/month (developer) | Yes | Yes | Enterprise search, many connectors |
Key takeaway: if cost is your primary constraint, S3 Vectors eliminates idle spend entirely. If you need OpenSearch but want to avoid the serverless minimum, consider a Managed Cluster where you can right-size to a smaller instance.
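The arithmetic behind a fixed monthly floor is easy to check. The sketch below reproduces the OpenSearch Serverless figure from the 2+2 OCU minimum in the table, assuming a rate of $0.24 per OCU-hour and 730 hours in a month. Both numbers are assumptions for illustration, so check the current pricing page for your region. The helper names are invented for this example.

```python
HOURS_PER_MONTH = 730  # assumed average month length used for estimates


def monthly_floor(min_ocus: int, usd_per_ocu_hour: float, hours: int = HOURS_PER_MONTH) -> float:
    """Cost of keeping the minimum capacity running for a whole month."""
    return min_ocus * usd_per_ocu_hour * hours


def idle_spend(monthly_floor_usd: float, busy_fraction: float) -> float:
    """Portion of a fixed monthly floor paid while no queries are served."""
    if not 0 <= busy_fraction <= 1:
        raise ValueError("busy_fraction must be between 0 and 1")
    return monthly_floor_usd * (1 - busy_fraction)


# 2 indexing + 2 search OCUs at the assumed rate: roughly $700 per month.
opensearch_serverless_floor = monthly_floor(min_ocus=4, usd_per_ocu_hour=0.24)
```

With a dev environment that serves queries a quarter of the time, `idle_spend(opensearch_serverless_floor, 0.25)` shows that about three quarters of that floor pays for idle capacity, which is the gap a usage-priced store closes.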
What hasn’t changed
The vector store matters, but chunking strategy, embedding model selection, and prompt engineering still have a bigger impact on RAG quality. All Bedrock KB backends sit behind the same retrieval API, so you can swap them without rewriting your application. Pick the one that fits your operational model and spend your energy on retrieval tuning.
Conclusion
Choosing an AWS vector store for RAG is less about features and more about trade-offs. Start with your operational model, then optimise for latency and cost. Most teams will default to OpenSearch or S3 Vectors, but the right answer depends on how much control you need.
Sources
- Retrieval Augmented Generation (RAG) Options in AWS — Cevo, April 2024: https://cevo.com.au/post/retrieval-augmented-generation-rag-options-in-aws/
- AWS Prescriptive Guidance — Choosing an AWS Vector Database for RAG Use Cases: https://docs.aws.amazon.com/prescriptive-guidance/latest/choosing-an-aws-vector-database-for-rag-use-cases/introduction.html
- Introducing Amazon S3 Vectors — AWS Blog: https://aws.amazon.com/blogs/aws/introducing-amazon-s3-vectors-first-cloud-storage-with-native-vector-support-at-scale/
- Amazon S3 Vectors Features: https://aws.amazon.com/s3/features/vectors
- Bedrock KB OpenSearch Managed Cluster Support: https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-knowledge-bases-now-supports-amazon-opensearch-service-managed-cluster-as-vector-store/

Kiran is a Data Engineer with experience in designing and building Cloud Data applications and ETL pipelines for cloud data migration and building Enterprise Data Warehouses. Proven success in designing and optimising data pipelines to support business goals, and expertise in using a variety of data engineering tools and technologies.



