Retrieval Augmented Generation (RAG) Options in AWS

In 2006, at MIT’s Emerging Technologies Conference, Jeff Bezos declared: “We make muck, so you don’t have to.” He was referring to the fact that many tech companies at the time devoted up to 70% of their resources to technology-related heavy lifting instead of to the activities that differentiate them. That same year, AWS introduced S3 and EC2, and we have all watched their evolution ever since.

Here we are, 18 years later, and AWS still has the same trick up its sleeve. With Generative AI adoption going from strength to strength since OpenAI brought it to the mainstream towards the end of 2022, AWS continues to make the muck so that we do not have to.

In this article, we will look at the options for building Generative AI applications and how AWS does most of the heavy lifting for us.

Build Your Own

Building applications based on Large Language Models (LLMs) requires careful consideration of several factors to ensure success. One essential component is supplying context through the Retrieval-Augmented Generation (RAG) pattern. This approach enables LLMs to leverage external knowledge sources, enhancing the accuracy and relevance of generated responses. However, customers face the critical task of selecting the best LLM for their specific needs. With many options available, from GPT to Llama to Claude to Gemini and beyond, understanding each model’s strengths and weaknesses is crucial for making informed decisions.

Moreover, the infrastructure supporting LLMs must be robust and well-designed. Vector stores play a vital role in storing and retrieving contextual information for the models. Selecting the most suitable vector store from the array of available options is essential for optimising performance and scalability. Additionally, navigating the complexities of LLM orchestration and library options can be daunting. Whether utilising existing libraries or building a custom solution, developers must address scalability, security, and integration challenges. From designing data ingestion pipelines to fine-tuning prompt engineering, every aspect demands meticulous attention.
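
To make these moving parts concrete, here is a minimal sketch of the DIY approach in Python. It is illustrative only: it assumes boto3 credentials with Amazon Bedrock access, that the Titan embedding and Claude model IDs shown are enabled in your account and region, and it uses a naive in-memory search where a real system would use a proper vector store.

```python
# A minimal, illustrative RAG loop: embed documents, retrieve by cosine
# similarity, and stuff the top matches into the prompt. Not production code.
import json

import boto3
import numpy as np

bedrock = boto3.client("bedrock-runtime")

def embed(text: str) -> np.ndarray:
    """Embed text with Amazon Titan Embeddings (text-only)."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return np.array(json.loads(response["body"].read())["embedding"])

def retrieve(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    """Naive in-memory vector search; a real system would use a vector store."""
    doc_vecs = [embed(d) for d in docs]  # pre-compute and index these in practice
    q = embed(query)
    scores = [q @ v / (np.linalg.norm(q) * np.linalg.norm(v)) for v in doc_vecs]
    return [docs[i] for i in np.argsort(scores)[::-1][:top_k]]

def answer(query: str, docs: list[str]) -> str:
    """Augment the prompt with retrieved context, then call the LLM."""
    context = "\n".join(retrieve(query, docs))
    response = bedrock.invoke_model(
        modelId="anthropic.claude-v2",
        body=json.dumps({
            "prompt": f"\n\nHuman: Use the following context to answer.\n"
                      f"{context}\n\nQuestion: {query}\n\nAssistant:",
            "max_tokens_to_sample": 300,
        }),
    )
    return json.loads(response["body"].read())["completion"]
```

Even this toy version has to make the choices described above: which embedding model, which similarity metric, which LLM, and how to chunk and store the documents.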

Advantages: 

  • Building your own RAG solution allows for complete customisation to suit your specific requirements and preferences. You have full control over the architecture, algorithms, and components of the system, enabling you to tailor it precisely to your needs.

  • Building from scratch also allows you to develop domain-specific solutions tailored to the unique requirements of your industry or application. You can incorporate domain knowledge, specialised algorithms, and custom features to address specific challenges and deliver better results.

Disadvantages: 

  • Building a RAG solution entails dealing with technical complexities across various domains, including natural language processing, information retrieval, and distributed systems. It requires expertise in multiple disciplines and may pose challenges for teams without sufficient experience or skills.

  • Ensuring scalability and performance efficiency in a custom-built RAG solution can be challenging. Scaling up to handle increasing data volumes, user loads, or computational demands requires careful design and optimisation of algorithms, infrastructure, and architecture.

  • Building a RAG solution from scratch may introduce security vulnerabilities if not properly designed and implemented. Handling sensitive data, securing communication channels, and protecting against malicious attacks require thorough attention to security best practices throughout the development lifecycle.

Navigating the complexities of constructing your own LLM-based application reveals numerous challenges, and AWS has all the building blocks you need to tackle them yourself. However, it is good to know that Amazon also offers higher-level solutions that handle much of the heavy lifting.

In this article, we delve into two such offerings: Knowledge Bases for Amazon Bedrock and Amazon Q. 

Knowledge Bases for Amazon Bedrock

Knowledge Bases for Amazon Bedrock is a new AWS offering that allows organisations to securely connect foundation models to internal company data sources, enhancing the capabilities of the models. It entered preview in September 2023 and quickly became generally available in November that same year.

Amazon Bedrock’s Knowledge Base service manages the entire workflow of ingesting data into a vector database from various sources. This includes setting up data pipelines, preprocessing the data, and embedding it. The database is then used to retrieve relevant results for user queries to the foundation model.
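
As a rough illustration of what “managing the workflow” looks like from the developer’s side, the sketch below triggers a sync of a configured data source, assuming placeholder IDs and the bedrock-agent boto3 client:

```python
# Illustrative only: kick off an ingestion job that pulls documents from a
# configured data source, chunks and embeds them, and writes the vectors to
# the knowledge base's vector store. The IDs are placeholders.
import boto3

bedrock_agent = boto3.client("bedrock-agent")

job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId="YOUR_KB_ID",        # placeholder
    dataSourceId="YOUR_DATA_SOURCE_ID",  # placeholder
)
print(job["ingestionJob"]["status"])
```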

Some key benefits of using Knowledge Bases for Amazon Bedrock are: 

  • It implements the full RAG workflow without requiring custom integration with data sources or management of the data ingestion process.
  • Models can leverage proprietary company information to increase the accuracy and context of their responses.
  • Popular vector databases like Amazon OpenSearch Serverless, Amazon Aurora, Pinecone, and Redis Enterprise Cloud are supported.
  • The service handles securely connecting models to data sources and retrieving information to augment prompts.
  • Once your knowledge base is set up, you can interact with it using the AWS SDK for Python (Boto3). This opens it up to web applications, mobile applications, bots, dashboards, and command line interfaces.

The knowledge base simplifies connecting applications to different data sources for retrieval-augmented generation. Developers can choose interfaces based on their use cases, favouring options like web, mobile, or bots depending on where the enhanced conversational capabilities are most needed.
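
For example, here is a hedged sketch of the retrieval call from Python, assuming a knowledge base has already been created and that the ID below is a placeholder:

```python
# Illustrative only: fetch the most relevant chunks for a query.
import boto3

client = boto3.client("bedrock-agent-runtime")

response = client.retrieve(
    knowledgeBaseId="YOUR_KNOWLEDGE_BASE_ID",  # placeholder
    retrievalQuery={"text": "What is our refund policy?"},
    retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 3}},
)

# Each result carries the matched text, its source location, and a score.
for result in response["retrievalResults"]:
    print(result["score"], result["content"]["text"])
```

There is also a RetrieveAndGenerate API that performs the retrieval and the answer generation in a single call.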

Knowledge Bases for Amazon Bedrock are great, but they do have some limitations: 

  • There is a restricted selection of embedding models, all of which are text-based: Amazon Titan Embeddings G1 - Text, Cohere Embed English, and Cohere Embed Multilingual.
  • Vector stores offer only a handful of choices: Amazon OpenSearch Serverless, Amazon Aurora, Pinecone, and Redis Enterprise Cloud.
  • Your data sources need to reside in Amazon S3.
  • Because Knowledge Bases is a more generic service designed to be consumed by other services and applications, you must build your own user interface around it.
  • Because it does much of the heavy lifting involved in any RAG-based application, it is not as flexible as doing everything yourself. For example, you cannot choose the underlying LLM foundation model.

These drawbacks, such as the limited selection of embedding models and vector stores and the inability to select an LLM, may well be temporary; more options are likely to become available as the service matures.

Amazon Q – Gen AI-powered Assistant (Preview)

In November 2023, AWS announced the Preview of Amazon Q, a new generative AI-powered assistant.

Amazon Q is built specifically for work and can be tailored to your business: it can have conversations, solve problems, generate content, and take actions using the data and expertise found in your company’s information repositories, code bases, and enterprise systems. Amazon Q provides quick, relevant, and actionable information and advice to help streamline tasks, speed up decision-making and problem-solving, and spark creativity and innovation at work.

Amazon Q abstracts away many of the operations in a typical RAG application. You do not have to burden yourself with choices such as the embedding model, the LLM orchestration library, the vector database, or the LLM foundation model, and the list goes on.

Getting started with Amazon Q is a simple two-step process:

Step 1: Select your retriever

A retriever fetches documents stored in a search index. When a user asks a question, the retriever finds documents relevant to the query, which are then used to generate answers.

You can choose to build a new Amazon Q retriever using the Amazon Q infrastructure (a native retriever), or you can use an existing Amazon Kendra index as a retriever. If you choose the native retriever, an Amazon Q index is automatically created for you.

When building your own RAG system from scratch, you select and deploy your own vector store. Amazon Q instead has the concept of retrievers, which abstract away a serverless vector store, so you have less to worry about.
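
For illustration, and assuming the qbusiness boto3 client with placeholder IDs and names (the console walks you through the same choice), attaching either kind of retriever looks roughly like this:

```python
# Illustrative only: attach a retriever to an Amazon Q application.
# All IDs are placeholders.
import boto3

qbusiness = boto3.client("qbusiness")

# Option A: a native retriever backed by an Amazon Q index.
qbusiness.create_retriever(
    applicationId="YOUR_APPLICATION_ID",
    displayName="native-retriever",
    type="NATIVE_INDEX",
    configuration={"nativeIndexConfiguration": {"indexId": "YOUR_Q_INDEX_ID"}},
)

# Option B: reuse an existing Amazon Kendra index as the retriever.
qbusiness.create_retriever(
    applicationId="YOUR_APPLICATION_ID",
    displayName="kendra-retriever",
    type="KENDRA_INDEX",
    configuration={"kendraIndexConfiguration": {"indexId": "YOUR_KENDRA_INDEX_ID"}},
)
```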

Step 2: Select your data source

A data source is a data repository where your documents are stored. To help you implement a generative AI web experience for your end users, Amazon Q connects directly and securely to supported data sources. Up to five data sources can be configured per application.

The main data sources are Amazon S3, websites, and your own uploaded files; however, Amazon Q also offers a long list of data source connectors, such as Google Drive, Confluence, Jira, and Slack, to name a few.

Once your data sources are configured, Amazon Q provides a user-friendly web interface, empowering your users to ask questions directly against your organisation’s data.
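
The web interface is the primary way in, but the same application can also be queried programmatically. A hedged sketch, assuming the qbusiness boto3 client and placeholder identifiers:

```python
# Illustrative only: ask a configured Amazon Q application a question.
# The application ID and user ID are placeholders.
import boto3

qbusiness = boto3.client("qbusiness")

response = qbusiness.chat_sync(
    applicationId="YOUR_APPLICATION_ID",  # placeholder
    userId="someone@example.com",         # placeholder
    userMessage="What is our travel expense policy?",
)

# The assistant's answer, grounded in your configured data sources.
print(response["systemMessage"])
```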

Advantages: 

  • An easy no-code solution for an LLM-powered chatbot over your organisation’s data. This improves the accessibility of the application and opens it up to more users, even those without programming knowledge.
  • Many data source options to choose from, ensuring integration and connectivity with diverse sources of data and enabling better insights.
  • Rapid time to market, which is essential for staying ahead of the competition, meeting customer expectations, reducing costs, and maximising revenue potential in the fast-paced world of app development.
  • Because Amazon Q is a managed solution, it scales as your user base grows.

Disadvantages: 

  • Less flexibility in the type of applications you can build; currently, Amazon Q only supports Q&A applications.
  • The pricing model is per user, so the more users the application has, the more expensive it gets.
  • The web experience has limited customisability. If you require a custom web or mobile interface, Amazon Q may not be what you need.
  • You cannot choose the underlying LLM, as this choice has been abstracted away.

RAG Options in AWS

To summarise, the RAG options in AWS are Build Your Own, Knowledge Bases for Amazon Bedrock, and Amazon Q. Please see the table below to understand the pros and cons of each:

(Note that the details below are accurate as of the time of writing; AWS features can change at any time.)

|  | Build Your Own | Knowledge Bases for Amazon Bedrock | Amazon Q (Preview) |
| --- | --- | --- | --- |
| Application type | Any | Any | Q&A chat over text |
| Pricing | On-demand, pay-as-you-go | On-demand, pay-as-you-go | Per user |
| Embedding model | Any | A few options | Built-in |
| Data chunking | Any | Limited | Built-in |
| Flexibility | High | Mid | Fixed |
| LLM choice | Any | No | No |
| Managed | No | Yes | Yes |
| Multi-modality | Yes | Text only | Text only |
| Retrieval method | DIY | Built-in | Built-in |
| Scalability | DIY | Yes | Yes |
| Security | DIY | Yes | Yes |
| Similarity method | DIY | Built-in | Built-in |
| User experience | DIY | DIY | Built-in |
| Vector database | DIY | A few options | Built-in |

Conclusion

In summary, as technology keeps advancing, it is crucial for companies to work smarter, not harder. AWS has made it easier for businesses to build LLM applications without needing to build everything from scratch. They are clearing away the technical “muck” so that companies can focus on what makes them special.

Nowadays, with tools like Knowledge Bases for Amazon Bedrock and Amazon Q (Preview), it is much simpler to build smart chatbots that work for your organisation. These tools connect easily with your data, making it easier to get useful insights without getting bogged down in complicated details.

While AWS certainly enables building applications from the ground up, it also offers these higher-level services that handle many of the minute details. Throughout the years, AWS’s mission has remained unchanged: to help businesses navigate the ever-changing tech landscape without drowning in technical complexities. Ultimately, it’s about leveraging technology to simplify processes and keep businesses ahead of the curve.
