AWS Vector Store for RAG – Beyond OpenSearch 2026

TL;DR

The main AWS vector store options for RAG are OpenSearch, S3 Vectors, Aurora PostgreSQL with pgvector, Pinecone, and Redis Enterprise Cloud. They differ in latency, cost, and operational complexity.

Bedrock Knowledge Bases now supports six vector store backends directly, and AWS offers additional services with vector search for custom RAG pipelines. This article separates what works inside Bedrock KB from what requires a custom approach and provides a practical decision framework.

Introduction

In early 2024, we explored the RAG options available on AWS. Back then, the vector store choices for Bedrock Knowledge Bases were limited, and OpenSearch Serverless was the default for most teams.

A lot has changed since then. Bedrock KB now supports six backends, and several other AWS services offer vector search for custom pipelines. This article covers all the options and how to choose.

[Figure: Simple RAG Flow]

Vector store options – AWS Managed vs. Direct

With Bedrock Knowledge Bases, AWS manages ingestion, chunking, embedding, and retrieval. The supported backends are:

  • Amazon OpenSearch Serverless
  • Amazon OpenSearch Managed Cluster (new)
  • Amazon Aurora PostgreSQL Serverless with pgvector
  • Amazon S3 Vectors (new)
  • Pinecone
  • Redis Enterprise Cloud
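To make the "AWS manages the pipeline" model concrete, here is a sketch of registering one of these backends (OpenSearch Serverless) with Bedrock Knowledge Bases via boto3. All ARNs, names, and field mappings below are hypothetical placeholders; verify the exact request shape against the current `CreateKnowledgeBase` API reference.

```python
import json

# Sketch: pointing a Bedrock Knowledge Base at an OpenSearch Serverless
# collection. Every ARN and name here is a hypothetical placeholder --
# substitute your own resources before calling the API.
request = {
    "name": "docs-kb",  # hypothetical KB name
    "roleArn": "arn:aws:iam::123456789012:role/BedrockKbRole",
    "knowledgeBaseConfiguration": {
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            # Titan Text Embeddings v2 is one supported embedding model
            "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0",
        },
    },
    "storageConfiguration": {
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": "arn:aws:aoss:us-east-1:123456789012:collection/abc123",
            "vectorIndexName": "docs-index",
            "fieldMapping": {
                "vectorField": "embedding",
                "textField": "chunk_text",
                "metadataField": "metadata",
            },
        },
    },
}

# With AWS credentials in place, the actual call would look like:
#   import boto3
#   bedrock_agent = boto3.client("bedrock-agent")
#   kb = bedrock_agent.create_knowledge_base(**request)
print(json.dumps(request["storageConfiguration"]["type"]))
```

Swapping backends means changing only `storageConfiguration`; ingestion, chunking, and retrieval stay identical.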

 

Going direct means you own the full RAG pipeline, which opens up additional AWS services (MemoryDB, Neptune Analytics, DocumentDB, Kendra). More flexibility, more operational overhead.

Bedrock Knowledge Bases vector store options

These vector stores appear in the Bedrock KB console dropdown. Bedrock manages the full pipeline for you.

[Figure: RAG Architecture with supported Vector Stores]

1. Amazon OpenSearch Service

The most mature Bedrock KB integration. Now supports both Serverless and Managed Clusters.

  • Sub-10ms query latency, GPU acceleration available
  • Scales to billions of vectors
  • Hybrid search: keyword + vector similarity in one query
  • Serverless: ~$700/month minimum (OCU pricing)
  • Managed Cluster (new): full control over instance types and cost
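The hybrid-search point is worth making concrete: OpenSearch 2.x can combine a BM25 keyword clause with an approximate k-NN clause in a single `hybrid` query (paired with a score-normalization search pipeline). A sketch of the query body, with illustrative field names and a placeholder query vector:

```python
# Sketch of an OpenSearch hybrid query: one lexical leg, one vector leg.
# Field names ("chunk_text", "embedding") and the vector are placeholders;
# a score-normalization search pipeline must be configured separately.
query_vector = [0.1, 0.2, 0.3]  # in practice, the embedded user query

hybrid_query = {
    "query": {
        "hybrid": {
            "queries": [
                # Lexical leg: classic BM25 keyword match
                {"match": {"chunk_text": "vector store pricing"}},
                # Vector leg: approximate k-NN over the embedding field
                {"knn": {"embedding": {"vector": query_vector, "k": 10}}},
            ]
        }
    },
    "size": 10,
}
```

Sent as the body of a `_search` request, this returns one fused ranking instead of two result sets you must merge yourself.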

2. Amazon S3 Vectors

The biggest addition since 2024. GA since December 2025. Available as a “Quick create” option.

  • Up to 2 billion vectors per index, 10,000 indexes per bucket
  • Sub-100ms query latency
  • Up to 90% cost reduction vs dedicated vector databases
  • Serverless, eleven nines (99.999999999%) durability, no idle minimums
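For custom pipelines, S3 Vectors also exposes a direct query API through boto3's `s3vectors` client. A sketch of a `QueryVectors` request; bucket and index names are placeholders, and the parameter names should be verified against the current boto3 documentation:

```python
# Sketch of querying an S3 Vectors index directly (outside Bedrock KB).
# "my-vector-bucket" and "docs-index" are hypothetical; parameter names
# follow the QueryVectors API -- check current boto3 docs before use.
query = {
    "vectorBucketName": "my-vector-bucket",
    "indexName": "docs-index",
    "queryVector": {"float32": [0.1, 0.2, 0.3]},
    "topK": 5,
    "returnMetadata": True,
    "returnDistance": True,
}

# With credentials configured:
#   import boto3
#   s3v = boto3.client("s3vectors")
#   results = s3v.query_vectors(**query)
```

There is no cluster to size here: you pay per request and per GB stored, which is the mechanism behind the "no idle minimums" bullet above.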

3. Amazon Aurora PostgreSQL Serverless with pgvector

Unchanged since 2024. Best if you already run Aurora and want to consolidate.

  • ACID transactions and strong consistency
  • Mixed SQL and vector queries in one database
  • Millions of vectors, 10-100ms latency
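The "mixed SQL and vector queries" point is Aurora's real differentiator, so here is a sketch of what that looks like in practice. Table and column names are hypothetical; `<=>` is pgvector's cosine distance operator:

```python
# Sketch of a mixed relational + vector query on Aurora PostgreSQL with
# pgvector. "documents", "tenant_id", etc. are hypothetical names; run
# the SQL via psycopg or any PostgreSQL driver.
query_embedding = [0.1, 0.2, 0.3]  # from your embedding model

sql = """
SELECT d.id, d.chunk_text, d.embedding <=> %(q)s::vector AS distance
FROM documents d
WHERE d.tenant_id = %(tenant)s          -- ordinary relational filter
  AND d.updated_at > now() - interval '90 days'
ORDER BY d.embedding <=> %(q)s::vector  -- nearest neighbors first
LIMIT 5;
"""

params = {"q": str(query_embedding), "tenant": "acme"}

# With a live connection:
#   import psycopg
#   with psycopg.connect(dsn) as conn:
#       rows = conn.execute(sql, params).fetchall()
```

Because the filter and the similarity search run in one statement, there is no second system to keep in sync with your relational data.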

4. Pinecone

The only purpose-built vector database in the Bedrock KB dropdown. Requires a separate Pinecone account.

  • Purpose-built for vector similarity search
  • Metadata filtering, namespaces, sparse-dense hybrid search
  • Not AWS-native — adds vendor dependency and separate billing

5. Redis Enterprise Cloud

In-memory vector operations via RediSearch. Also requires an external account.

  • Very high query performance (in-memory)
  • Redis API compatibility
  • External vendor dependency, cost scales with dataset size

 

If you move beyond Bedrock Knowledge Bases, the range of AWS vector search services expands significantly.

“Pick the vector store that fits your operational model, then spend your energy on retrieval tuning.”

AWS vector search services for custom RAG pipelines

These services have vector search but are not in the Bedrock KB dropdown. You build your own pipeline.
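Whatever store you pick, the custom pipeline has the same shape: embed the query, run nearest-neighbor search, assemble a prompt. The sketch below uses a toy character-frequency embedder and an in-memory corpus purely to show the structure; in production the embedder would be a Bedrock embeddings call and the search would hit one of the services below.

```python
import math

# Toy embedding: normalized character-frequency vector. A stand-in for a
# real embedding model, used only to make this sketch self-contained.
def embed(text: str) -> list[float]:
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-length, so the dot product is the cosine
    return sum(x * y for x, y in zip(a, b))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

corpus = ["vector stores on aws", "graph databases", "vector search latency"]
context = retrieve("aws vector search", corpus)
context_text = "\n".join(context)
prompt = f"Answer using only this context:\n{context_text}\n\nQ: ..."
```

Swapping the store means reimplementing only `retrieve`; the rest of the pipeline is unchanged, which is why the choice below is less risky than it looks.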

1. Amazon MemoryDB

Fastest vector search on AWS. Redis-compatible, in-memory.

  • Sub-millisecond query latency
  • Up to 32,768 dimensions per vector
  • Strong consistency, millions of requests per day
  • Cost scales linearly with dataset size (RAM-priced)
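MemoryDB's vector search uses RediSearch-style commands, so a k-NN query is a single `FT.SEARCH` call. A sketch, with placeholder index and field names and the query vector packed as a float32 blob:

```python
import struct

# Sketch of a k-NN search against MemoryDB vector search (RediSearch
# syntax). "docs-idx" and "@embedding" are hypothetical names; the query
# vector travels as a packed float32 blob parameter.
query_vector = [0.1, 0.2, 0.3]
blob = struct.pack(f"{len(query_vector)}f", *query_vector)

knn_args = [
    "FT.SEARCH", "docs-idx",
    "*=>[KNN 5 @embedding $vec AS score]",  # top-5 nearest neighbors
    "PARAMS", "2", "vec", blob,
    "SORTBY", "score",
    "DIALECT", "2",
]

# Against a live MemoryDB endpoint:
#   import redis
#   r = redis.Redis(host="clustercfg.my-memorydb.example", ssl=True)
#   results = r.execute_command(*knn_args)
```

Because everything lives in RAM, this is where the sub-millisecond numbers come from, and also why cost tracks dataset size so directly.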

2. Amazon Neptune Analytics

Purpose-built for GraphRAG. Integrates with Bedrock via a separate graph-based KB creation flow.

  • Vector search + graph traversals and algorithms
  • Scales to billions of relationships
  • Up to 80x faster than traditional graph solutions

3. Amazon DocumentDB

Native vector search in version 5.0+. MongoDB API compatible.

  • Up to 2,000 dimensions (indexed) or 16,000 (unindexed)
  • Millisecond response times
  • Cosine, Euclidean, and dot product metrics
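Those three metrics are standard across vector stores, not DocumentDB-specific, and they are simple enough to write out. A plain-Python sketch to make the differences concrete:

```python
import math

# The three distance/similarity metrics DocumentDB (and most vector
# stores) expose, written out in plain Python.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Angle-based: ignores vector magnitude
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    # Straight-line distance: sensitive to magnitude
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a, b = [1.0, 0.0], [0.0, 1.0]
print(cosine_similarity(a, b))  # orthogonal vectors -> 0.0
print(euclidean(a, b))          # sqrt(2)
print(dot(a, b))                # 0.0
```

Most embedding models produce normalized vectors, in which case cosine similarity and dot product rank results identically.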

4. Amazon Kendra

Managed enterprise search with built-in RAG. Abstracts away the vector layer entirely.

  • 40+ native data source connectors
  • Built-in NLP and relevance tuning
  • Pay per query and storage, minimal setup

Quick reference

[Figure: Timeline 2024 to 2026 Evolution]

Bedrock Knowledge Bases vector store options

| Option | Latency | Scale | Cost Profile | Best For |
| --- | --- | --- | --- | --- |
| OpenSearch Serverless | Sub-10ms | Billions | OCU hours (~$700/mo min) | High-throughput, low-latency |
| OpenSearch Managed Cluster | Sub-10ms | Billions | Instance hours | Full control, cost tuning |
| S3 Vectors | Sub-100ms | 2B per index | Storage + queries (no idle) | Cost-optimized, infrequent access |
| Aurora pgvector | 10-100ms | Millions | Instance hours | Hybrid SQL + vector |
| Pinecone / Redis EC | Sub-10ms | Billions / Millions | Vendor pricing | Third-party, multi-cloud |
 

AWS vector search services (Custom RAG pipeline required)

| Option | Latency | Scale | Cost Profile | Best For |
| --- | --- | --- | --- | --- |
| MemoryDB | Sub-1ms | Millions | Instance hours (memory) | Ultra-low latency, real-time |
| Neptune Analytics | Sub-second | Billions | Capacity units | GraphRAG, knowledge graphs |
| DocumentDB | Millisecond | Millions | Instance hours | MongoDB-compatible apps |
| Kendra | Sub-second | Managed | Pay per query | Enterprise search, minimal setup |

 

The right vector store for you – How to choose

1. Decision framework

[Figure: Decision Flow Chart – How to Choose]

Start here: do you want Bedrock KB to manage the pipeline? If yes, your options are OpenSearch, S3 Vectors, Aurora pgvector, Pinecone, or Redis EC. If no, the full landscape opens up.

Then ask:

  • What latency do you need?

Sub-millisecond → MemoryDB. Sub-10ms → OpenSearch. Sub-100ms → S3 Vectors. Sub-second → Kendra or Neptune.

  • What are you already running?

Aurora → pgvector. MongoDB → DocumentDB. Redis → MemoryDB. Nothing → OpenSearch or S3 Vectors.

  • Query pattern?

High-QPS → OpenSearch. Infrequent → S3 Vectors. Graph traversals → Neptune. Mixed SQL + vector → Aurora pgvector.
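The questions above collapse into a small function. This mirrors the article's heuristics only; it is not an official AWS sizing tool, and real decisions will weigh factors (compliance, team skills, data volume) it ignores:

```python
# The decision flow above as code. Heuristic thresholds from the article,
# not an official AWS recommendation engine.
def pick_vector_store(bedrock_kb: bool, latency_ms: float,
                      existing: str = "none", graph: bool = False) -> str:
    if bedrock_kb:
        # Only the KB-supported backends are in play
        if existing == "aurora":
            return "Aurora pgvector"
        return "OpenSearch" if latency_ms < 10 else "S3 Vectors"
    if graph:
        return "Neptune Analytics"
    if existing == "mongodb":
        return "DocumentDB"
    if existing == "redis" or latency_ms < 1:
        return "MemoryDB"
    if latency_ms < 10:
        return "OpenSearch"
    if latency_ms < 100:
        return "S3 Vectors"
    return "Kendra"

print(pick_vector_store(True, 50))    # -> S3 Vectors
print(pick_vector_store(False, 0.5))  # -> MemoryDB
```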

2. Cost comparison

Cost varies significantly across options. The biggest differentiator is whether you pay idle costs when the system is not handling queries. S3 Vectors is the only option with zero idle cost, making it the standout for dev/test and cost-sensitive production. OpenSearch Serverless’s ~$700/month floor is the most common surprise teams hit.
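The ~$700 figure is simple arithmetic. A back-of-envelope check, assuming the commonly cited ~$0.24 per OCU-hour rate (verify against current AWS pricing) and the 2 indexing + 2 search OCU minimum:

```python
# Back-of-envelope check on the OpenSearch Serverless monthly floor.
# The per-OCU-hour rate is an assumption -- confirm current pricing.
OCU_PER_HOUR = 0.24   # USD, assumed rate
MIN_OCUS = 2 + 2      # indexing + search minimum
HOURS_PER_MONTH = 730

monthly_floor = MIN_OCUS * OCU_PER_HOUR * HOURS_PER_MONTH
print(f"${monthly_floor:,.0f}/month")  # ≈ $701/month
```

That floor applies whether the collection serves one query or one million, which is exactly the idle-cost trap the table below maps.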

| Option | Min Monthly Cost | Pay-per-Query | Idle Cost | Best Cost Profile |
| --- | --- | --- | --- | --- |
| OpenSearch Serverless | ~$700 (2+2 OCUs) | No | Yes | Predictable production workloads |
| OpenSearch Managed Cluster | ~$50+ (smallest instance) | No | Yes | Cost-tunable, right-size instances |
| S3 Vectors | $0 (pay only for usage) | Yes | No | Dev/test, infrequent access, large scale |
| Aurora pgvector | ~$50+ (serverless min) | No | Minimal (scales to zero) | Already running Aurora |
| Pinecone | Varies (pod-based) | No | Yes | Vendor pricing |
| Redis Enterprise Cloud | Varies (node-based) | No | Yes | Vendor pricing |
| MemoryDB | ~$100+ (smallest node) | No | Yes | Ultra-low latency justifies cost |
| Neptune Analytics | Capacity unit-based | No | Yes | GraphRAG workloads |
| DocumentDB | ~$50+ (smallest instance) | No | Yes | Already running DocumentDB |
| Kendra | ~$810/month (developer) | Yes | Yes | Enterprise search, many connectors |

 

Key takeaway: if cost is your primary constraint, S3 Vectors eliminates idle spend entirely. If you need OpenSearch but want to avoid the serverless minimum, consider a Managed Cluster where you can right-size to a smaller instance.

What hasn’t changed

The vector store matters, but chunking strategy, embedding model selection, and prompt engineering still have a bigger impact on RAG quality. All Bedrock KB backends are swappable without rewriting your application. Pick the one that fits your operational model and spend your energy on retrieval tuning.

Conclusion

Choosing an AWS vector store for RAG is less about features and more about trade-offs. Start with your operational model, then optimise for latency and cost. Most teams will default to OpenSearch or S3 Vectors, but the right answer depends on how much control you need.
