Virtual assistants have become integral to many applications, offering seamless interactions and enhancing the experience between user and product or company. Building a virtual assistant is an exciting project that involves integrating various technologies to create a system capable of understanding and responding to user commands and queries. In this post, I will cover how to create a virtual assistant with AWS Bedrock and AWS OpenSearch, integrated with AWS Amplify Gen 2.
Architecture Overview
Here are the resources and steps necessary to bring a custom virtual assistant to life.
- User is authenticated to the web application through AWS Cognito.
- After authentication, the user is invited to submit their questions to the virtual assistant.
- Front-end triggers a GraphQL API query to a Lambda handler.
- Lambda submits the query to the knowledge base via the AWS Bedrock API.
- AWS Bedrock converts the query into a vector embedding using the knowledge base's embeddings model.
- Using the query embedding, AWS Bedrock searches the OpenSearch vector database and retrieves the documents from its Data Source (an S3 bucket) that are most semantically similar to the query.
- AWS Bedrock then generates a detailed response for the user based on the retrieved documents.
Understanding the process
Before diving into the construction of the virtual assistant, let’s take two steps back to understand some concepts and how Bedrock works with OpenSearch.
What is AWS Bedrock?
AWS Bedrock is a fully managed service that helps developers and data engineers build and deploy generative AI applications with ease. It provides access to a suite of Large Language Models (LLMs) and Generative AI tools, allowing developers to build intelligent applications without having to train their own models from scratch. Typical use cases include text generation and summarization, text and image search, image generation and, the one I will cover today, virtual assistants.
If you want to learn more about AWS Bedrock, please check its official website: AWS_Bedrock
What are AWS Bedrock Knowledge Bases?
An AWS Bedrock Knowledge Base helps you store and access a collection of information from a designated Data Source. When you ask a question, it uses AI to find the best answers in this Data Source, combining retrieval of relevant data with smart response generation to provide accurate and helpful replies.
What is Retrieval-Augmented Generation (RAG)?
RAG is a technique, supported by AWS Bedrock, that combines information retrieval with text generation using large language models (LLMs). First, it retrieves relevant information from a knowledge base or data source; then it uses an LLM to generate a coherent and contextually accurate response grounded in the retrieved information. This is the technique we will use to build the custom virtual assistant.
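To make the two phases concrete, here is a minimal sketch of the retrieve-then-generate pattern using the AWS SDK for JavaScript v3. It is illustrative only: later in the post we let Bedrock run both phases in a single RetrieveAndGenerateCommand call, and the knowledge base ID, model ID and prompt wording below are placeholders of my own.

import {
  BedrockAgentRuntimeClient,
  RetrieveCommand
} from '@aws-sdk/client-bedrock-agent-runtime';
import { BedrockRuntimeClient, InvokeModelCommand } from '@aws-sdk/client-bedrock-runtime';

const region = 'us-east-1';
const agentClient = new BedrockAgentRuntimeClient({ region });
const runtimeClient = new BedrockRuntimeClient({ region });

export const answerWithRag = async (question: string) => {
  // 1. Retrieve: fetch the document chunks most relevant to the question.
  const retrieval = await agentClient.send(
    new RetrieveCommand({
      knowledgeBaseId: 'YOUR_KNOWLEDGE_BASE_ID', // placeholder
      retrievalQuery: { text: question },
      retrievalConfiguration: { vectorSearchConfiguration: { numberOfResults: 3 } }
    })
  );
  const context = (retrieval.retrievalResults ?? [])
    .map(result => result.content?.text)
    .filter(Boolean)
    .join('\n');

  // 2. Generate: ask the LLM to answer the question using only the retrieved context.
  const response = await runtimeClient.send(
    new InvokeModelCommand({
      modelId: 'anthropic.claude-v2',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        prompt: `\n\nHuman: Use the following context to answer the question.\n\nContext:\n${context}\n\nQuestion: ${question}\n\nAssistant:`,
        max_tokens_to_sample: 500
      })
    })
  );
  return JSON.parse(new TextDecoder().decode(response.body)).completion;
};

The key point is that the model never sees the whole data source, only the chunks the retrieval step selected for this particular question.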
How does AWS Bedrock work with Vector Database and AWS OpenSearch?
Integrating the RAG capability with an OpenSearch vector database significantly enhances both the relevance of the information retrieved and the quality of the response generated for the end user. Here is how the integration works.
- The Knowledge Base fetches the data from its Data Source (in our case an S3 bucket).
- AWS Bedrock splits the source documents into blocks of text and generates a vector embedding for each block. Bedrock stores and manages the embeddings and keeps them in sync with the vector store.
- These embeddings are stored in a vector database provided by AWS OpenSearch. This setup allows data to be queried and retrieved efficiently based on semantic similarity.
- Indexing: the original documents are indexed alongside their vector embeddings in OpenSearch, and each document gets a unique identifier linking it to its embedding.
- At query time, OpenSearch returns the top K documents most similar to the query embedding. These are the documents most contextually relevant to the user's query.
- Delivering the response: AWS Bedrock uses a foundation model to generate a comprehensive, relevant answer from the documents retrieved by the OpenSearch queries.
That’s what happens under the hood, although AWS Bedrock streamlines most of this process, as I demonstrate in the next steps.
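To picture what that hidden retrieval step roughly looks like, below is a simplified sketch of a k-NN query against an OpenSearch vector index using the opensearch-js client. It is not part of the solution we build (Bedrock issues the equivalent query against the OpenSearch Serverless collection for you), and the endpoint, index name, field name and embedding are placeholders; a real OpenSearch Serverless collection would also require SigV4 authentication.

import { Client } from '@opensearch-project/opensearch';

// Placeholder endpoint: a real Serverless collection also needs SigV4 signing.
const client = new Client({ node: 'https://your-collection-endpoint' });

export const findSimilarChunks = async (queryEmbedding: number[]) => {
  const result = await client.search({
    index: 'bedrock-knowledge-base-index', // placeholder index name
    body: {
      size: 5,
      query: {
        knn: {
          // the vector field configured in the knowledge base field mapping
          'bedrock-knowledge-base-vector': {
            vector: queryEmbedding,
            k: 5 // top K most similar chunks
          }
        }
      }
    }
  });
  // Each hit carries the original text chunk and metadata alongside its similarity score.
  return result.body.hits.hits.map((hit: any) => hit._source);
};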
1 – AWS Bedrock configuration
In this step we configure the Bedrock Knowledge Base, its Data Source and the serverless vector database in OpenSearch that will store the generated embeddings.
Note: “Since the goal of this solution is to provide an end-to-end application for your virtual assistant, I decided to create the knowledge base directly in the AWS console. Below I show the steps required to do so, followed by a rough CDK sketch of the same resources. However, nothing stops you from creating these resources with CDK in your Amplify Gen 2 project; to learn more about creating custom resources with Amplify, check out this link: AmplifyGen2_CustomResources”
Note 2: “Bedrock offers several embeddings models, and some are only available in a few AWS Regions so far, so for the purposes of this post I will be creating the resources in the us-east-1 Region”
DataSource & AWS Bedrock configuration on console
The first step of the configuration is to store the custom data that will serve as the Data Source and link it to a Knowledge Base.
- Create an S3 bucket in your account
- Upload the AWS documents that will feed the assistant to the bucket (this example uses AWS whitepapers)
- Create a Knowledge Base on Bedrock
- Assign a name and description for the knowledge base
- For IAM permissions, select Create and use a new service role
- For the Data Source, choose Amazon S3 and click Next
- For the S3 URI, browse and select the S3 bucket created in step 1
- For Embeddings Model, choose Titan Embeddings G1 – Text 1.2
- For Vector database, choose Quick create a new Vector store
- Create Knowledge Base
- After creation, save the Knowledge Base name and Knowledge Base ID
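For reference, here is the rough CDK sketch mentioned in the note above, using the Bedrock L1 constructs from aws-cdk-lib. It assumes an OpenSearch Serverless collection, a vector index and a service role already exist; every ARN, name and field mapping below is a placeholder to replace with your own values.

import { aws_bedrock as bedrock } from 'aws-cdk-lib';
import { Construct } from 'constructs';

// Rough sketch only: the role ARN, collection ARN, index and field names are
// placeholders and must match an existing OpenSearch Serverless vector index.
export const createKnowledgeBase = (scope: Construct, region: string) => {
  const knowledgeBase = new bedrock.CfnKnowledgeBase(scope, 'AssistantKnowledgeBase', {
    name: 'assistant-knowledge-base',
    roleArn: 'arn:aws:iam::123456789012:role/BedrockKnowledgeBaseRole',
    knowledgeBaseConfiguration: {
      type: 'VECTOR',
      vectorKnowledgeBaseConfiguration: {
        // Titan Embeddings G1 - Text
        embeddingModelArn: `arn:aws:bedrock:${region}::foundation-model/amazon.titan-embed-text-v1`
      }
    },
    storageConfiguration: {
      type: 'OPENSEARCH_SERVERLESS',
      opensearchServerlessConfiguration: {
        collectionArn: 'arn:aws:aoss:us-east-1:123456789012:collection/your-collection-id',
        vectorIndexName: 'bedrock-knowledge-base-index',
        fieldMapping: {
          vectorField: 'bedrock-knowledge-base-vector',
          textField: 'AMAZON_BEDROCK_TEXT_CHUNK',
          metadataField: 'AMAZON_BEDROCK_METADATA'
        }
      }
    }
  });

  // The S3 bucket from the first step becomes the data source of the knowledge base.
  new bedrock.CfnDataSource(scope, 'AssistantDataSource', {
    name: 'assistant-data-source',
    knowledgeBaseId: knowledgeBase.attrKnowledgeBaseId,
    dataSourceConfiguration: {
      type: 'S3',
      s3Configuration: {
        bucketArn: 'arn:aws:s3:::your-assistant-documents-bucket'
      }
    }
  });

  return knowledgeBase;
};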
2 - A serverless virtual assistant
By using serverless resources, we can focus on the functionality of the virtual assistant and, more importantly, leverage serverless benefits such as automatic scaling and high availability. To create such a solution, I decided to use Amplify Gen 2, a relatively new offering with a TypeScript-first developer experience that allows developers to build full-stack cloud applications using just TypeScript. I recently wrote another blog about Amplify Gen 2, in which I list its benefits and show how to deploy a full-stack application with custom resources. You can check it out at CevoPost_FullStack_AmplifyGen2
To speed things up, let’s use the Amplify-Quickstart to set up the repository, initial code and deployment. The template provides a TODO list application with authentication and database integration. For our tutorial, we will customise this template into the virtual assistant using React and Material UI. After following the steps in the Quickstart example, you should be able to run the application on localhost and see the provided TODO list.
Back-end Setup
Data and GraphQL query customisation
With Amplify Data, you can build a secure, real-time GraphQL API integrated with a Lambda function using TypeScript. Amplify deploys the API powered by AWS AppSync and connects it to your Lambda function. We can also enforce API authentication rules with Cognito.
import { type ClientSchema, a, defineData } from '@aws-amplify/backend';
import { submitPromptFunction } from '../functions/submit-prompt/resource';
export const schema = a.schema({
RetrievalResultLocation: a.customType({
s3Location: a.customType({
uri: a.string()
}),
type: a.string(),
webLocation: a.customType({
url: a.string()
})
}),
RetrievedReferencesResponse: a.customType({
content: a.customType({
text: a.string()
}),
location: a.ref('RetrievalResultLocation'),
metadata: a.string()
}),
GeneratedResponsePart: a.customType({
textResponsePart: a.customType({
span: a.customType({
end: a.integer(),
start: a.integer()
}),
text: a.string()
})
}),
CitationResponse: a.customType({
generatedResponsePart: a.ref('GeneratedResponsePart'),
retrievedReferences: a.ref('RetrievedReferencesResponse').array()
}),
PromptResponse: a.customType({
type: a.string(),
sessionId: a.string(),
systemMessageId: a.string(),
systemMessage: a.string(),
sourceAttributions: a.ref('CitationResponse').array()
}),
submitPrompt: a
.query()
.arguments({
userId: a.string().required(),
prompt: a.string(),
messageId: a.string(),
sessionId: a.string()
})
.returns(a.ref('PromptResponse'))
.handler(a.handler.function(submitPromptFunction))
.authorization(allow => [allow.authenticated()])
});
export type Schema = ClientSchema<typeof schema>;
export const data = defineData({
schema,
authorizationModes: {
// This tells the data client in your app (generateClient())
// to sign API requests with the user authentication token.
defaultAuthorizationMode: 'userPool',
// API Key is used for a.allow.public() rules
apiKeyAuthorizationMode: {
expiresInDays: 30
}
}
});
The piece of code that specifies the submitPrompt query also defines the Lambda function handler associated with the GraphQL query, so naturally the next step is to create the Submit Prompt function.
Function and logic
Amplify Functions are powered by AWS Lambda and can respond to events from other resources, in our case a GraphQL query. In the code below we pass the user's prompt to AWS Bedrock, which performs the augmented search and generates a custom response to the prompt.
First, we create a service that interacts with the Bedrock client via the AWS SDK.
amplify/custom/src/lambda/prompt/service.ts
import { isEmpty } from 'lodash-es';
import { CitationResponse, PromptReponse, PromptRequest } from './model';
import {
BedrockAgentRuntimeClient,
RetrieveAndGenerateCommand,
RetrieveAndGenerateCommandInput,
RetrieveAndGenerateCommandOutput
} from '@aws-sdk/client-bedrock-agent-runtime';
interface PromptServiceProps {
bedrockKnowledgeBaseId?: string;
bedrockModelName?: string;
region: string;
}
export const PromptService = (props: PromptServiceProps) => {
const bedrockKnowledgeBaseId = props?.bedrockKnowledgeBaseId;
if (!bedrockKnowledgeBaseId)
throw new Error('No bedrockKnowledgeBaseId name given or missing BEDROCK_KNOWLEDGE_BASE_ID value');
const bedrockModelName = props?.bedrockModelName;
if (!bedrockModelName) throw new Error('No bedrockModelName name given or missing BEDROCK_MODEL_NAME value');
const { region } = props;
const bedrockClient = new BedrockAgentRuntimeClient({ region });
const modelArn = `arn:aws:bedrock:${region}::foundation-model/${bedrockModelName}`;
const sourceType = 'BEDROCK_KNOWLEDGEBASE';
const submitPrompt = async (request: PromptRequest) => {
console.log('Submitting prompt to Bedrock', JSON.stringify(request));
let input: RetrieveAndGenerateCommandInput = {
input: {
text: request.prompt
},
retrieveAndGenerateConfiguration: {
type: 'KNOWLEDGE_BASE',
knowledgeBaseConfiguration: {
knowledgeBaseId: bedrockKnowledgeBaseId,
modelArn,
retrievalConfiguration: {
vectorSearchConfiguration: {
overrideSearchType: 'HYBRID'
}
}
}
}
};
if (request.sessionId) {
input = {
...input,
sessionId: request.sessionId
};
}
const command = new RetrieveAndGenerateCommand(input);
const response: RetrieveAndGenerateCommandOutput = await bedrockClient.send(command);
let serviceResponse: PromptReponse = {
type: sourceType
};
console.log('Response Bedrock', JSON.stringify(response));
if (response) {
const { citations, output, sessionId } = response;
let sourceAttributions: Array<CitationResponse> = [];
if (!isEmpty(response.citations)) {
sourceAttributions =
citations?.map(item => {
return {
generatedResponsePart: {
textResponsePart: { ...item.generatedResponsePart?.textResponsePart }
},
retrievedReferences: item.retrievedReferences?.map(rr => {
return {
content: { ...rr.content },
location: {
s3Location: { ...rr.location?.s3Location },
type: rr.location?.type,
webLocation: { ...rr.location?.webLocation }
},
metadata: rr.metadata
};
})
};
}) || [];
}
serviceResponse = {
type: sourceType,
sessionId: sessionId,
systemMessageId: sessionId,
systemMessage: output?.text,
sourceAttributions
};
}
return serviceResponse;
};
return { submitPrompt };
};
Now, the Lambda handler that calls the service. Remember the Knowledge Base ID we saved earlier.
amplify/functions/submit-prompt/handler.ts
import { PromptService } from '../../custom/src/lambda/prompt/service';
import { Schema } from '../../data/resource';
import { env } from '$amplify/env/devSubmitPrompt'; // the import path is '$amplify/env/<function-name>'
type Handler = Schema['submitPrompt']['functionHandler'];
export const handler: Handler = async event => {
const { userId, prompt, messageId, sessionId } = event.arguments;
if (!userId) {
console.log('userId not found');
throw new Error('User id not found');
}
if (!env.BEDROCK_KNOWLEDGE_BASE_ID || !env.BEDROCK_MODEL_NAME) {
console.log('envVars not found');
throw new Error('BEDROCK environment vars not found');
}
const promptService = PromptService({
bedrockKnowledgeBaseId: env.BEDROCK_KNOWLEDGE_BASE_ID,
bedrockModelName: env.BEDROCK_MODEL_NAME,
region: env.AWS_REGION || 'us-east-1'
});
const response = await promptService.submitPrompt({
userId,
prompt: prompt || undefined,
messageId: messageId || undefined,
sessionId: sessionId || undefined
});
console.log(`response: ${JSON.stringify(response)}`);
return response;
};
Some important information to note here: to create a Bedrock client you need to specify the Knowledge Base ID, the Region, and the foundation model name that will be used to generate the custom response from the retrieved embeddings. The Region and Knowledge Base ID were defined earlier when the Knowledge Base was created; for the foundation model, we will use anthropic.claude-v2. As you can see below, these values are passed to the Lambda via environment variables.
amplify/functions/submit-prompt/resource.ts
import { defineFunction } from '@aws-amplify/backend';
import { BACKEND_CONFIG } from '../../constants';
const lambdaEntryPrefix = 'amplify/functions';
const submitPromptName: string = 'devSubmitPrompt';
export const submitPromptFunction = defineFunction({
name: submitPromptName,
environment: {
BEDROCK_KNOWLEDGE_BASE_ID: BACKEND_CONFIG.BEDROCK_KNOWLEDGE_BASE_ID,
BEDROCK_MODEL_NAME: BACKEND_CONFIG.BEDROCK_FOUNDATION_MODEL_NAME
},
entry: './handler.ts',
runtime: 18,
timeoutSeconds: 900
});
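The BACKEND_CONFIG object imported above is not generated by Amplify; it is a small constants file you create yourself (the import paths suggest amplify/constants.ts). A minimal version could look like the following, with placeholder values to replace with the Knowledge Base ID, account ID and Region from your own setup.

// amplify/constants.ts - hypothetical helper holding deployment-specific values.
// Replace the placeholders with the values saved when the knowledge base was created.
export const BACKEND_CONFIG = {
  REGION: 'us-east-1',
  AWS_ACCOUNT_ID: '123456789012',
  BEDROCK_KNOWLEDGE_BASE_ID: 'YOUR_KNOWLEDGE_BASE_ID',
  BEDROCK_FOUNDATION_MODEL_NAME: 'anthropic.claude-v2'
};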
Amplify Backend
The last piece of the backend configuration is to add the function to the Amplify backend. And since the Lambda function will interact with Bedrock, we need to grant the appropriate permissions through an IAM policy so the function can work properly.
import { defineBackend } from '@aws-amplify/backend';
import { auth } from './auth/resource';
import { data } from './data/resource';
import { submitPromptFunction } from './functions/submit-prompt/resource';
import { Effect, PolicyStatement } from 'aws-cdk-lib/aws-iam';
import { BACKEND_CONFIG } from './constants';
const backend = defineBackend({
auth,
data,
submitPromptFunction
});
// Create a new API stack
// const apiStack = backend.createStack('api-stack');
// Create the policy for Lambda to access Bedrock
const bedrockPolicyStatement = new PolicyStatement({
effect: Effect.ALLOW,
actions: [
'bedrock:DescribeKnowledgeBase',
'bedrock:InvokeModel',
'bedrock:ListKnowledgeBases',
'bedrock:Retrieve',
'bedrock:RetrieveAndGenerate',
'logs:CreateLogGroup',
'logs:CreateLogStream',
'logs:PutLogEvents'
],
resources: [
`arn:aws:bedrock:${BACKEND_CONFIG.REGION}:${BACKEND_CONFIG.AWS_ACCOUNT_ID}:knowledge-base/${BACKEND_CONFIG.BEDROCK_KNOWLEDGE_BASE_ID}`,
`arn:aws:bedrock:${BACKEND_CONFIG.REGION}::foundation-model/${BACKEND_CONFIG.BEDROCK_FOUNDATION_MODEL_NAME}`
]
});
// Add to amplify function the role that grants to invoke to Bedrock
const submitPromptLambda = backend.submitPromptFunction.resources.lambda;
submitPromptLambda.addToRolePolicy(bedrockPolicyStatement);
Front-end setup
With the back-end defined, the submitPrompt GraphQL query can now be consumed by the front-end. But first, you need to generate the schema client.
import { generateClient } from "aws-amplify/data";
import type { Schema } from "../amplify/data/resource";
export const clientSchema = generateClient<Schema>();
Chatbot Component
This is the main component that sends requests to the backend through the generated schema client and displays the AWS Bedrock responses.
import { ChangeEvent, SyntheticEvent, useEffect, useRef, useState } from 'react';
import {
ButtonDiv,
ChatIconMessage,
ChatMessageDiv,
ChatbotArea,
ChatbotTextArea,
InputAreaDiv,
LeftMessageDiv,
LoadingDiv,
RightMessageDiv,
StyledButton,
StyledInput,
UserMessageDiv,
Wrapper
} from './styled';
import { CircularProgress } from '@mui/material';
import ChatbotImage from '../chatbot-image';
import { getCurrentUser } from 'aws-amplify/auth';
import { clientSchema } from '../../utils';
import { v4 as uuidv4 } from 'uuid';
const Chatbot: React.FC = () => {
const inputRef = useRef<HTMLInputElement>(null);
const [chatAreaState, setChatAreaState] = useState<Array<JSX.Element>>([]);
const [inputValue, setInputValue] = useState('');
const [loading, setLoading] = useState(false);
const [chatLastMessageId, setChatLastMessageId] = useState('');
const [chatSessionId, setChatSessionId] = useState('');
const initialAssistantMessage = 'How can I help you today?';
const assistantRowComponent = (assistMessage: string | undefined, displayIcon?: boolean) => {
const newUuid = uuidv4();
// Assistant message row, built from the styled components imported above
  return (
    <LeftMessageDiv key={newUuid}>
      {displayIcon && (
        <ChatIconMessage>
          <ChatbotImage />
        </ChatIconMessage>
      )}
      <ChatMessageDiv>
        Assistant:{' '}
        {assistMessage || 'Sorry, I could not get an answer for you. Please try again with a different question.'}
      </ChatMessageDiv>
    </LeftMessageDiv>
  );
};
const userRowComponent = (userMessage: string) => {
const newUuid = uuidv4();
// User message row
  return (
    <RightMessageDiv key={newUuid}>
      <UserMessageDiv>User: {userMessage}</UserMessageDiv>
    </RightMessageDiv>
  );
};
useEffect(() => {
const initAreaState: Array<JSX.Element> = [assistantRowComponent(initialAssistantMessage, true)];
setChatAreaState(initAreaState);
if (inputRef.current) {
inputRef.current.focus();
}
}, []);
function handleChange(event: ChangeEvent<HTMLInputElement>) {
setInputValue(event.target.value);
}
async function submitQuery() {
const { username } = await getCurrentUser();
const query = {
prompt: inputValue,
userId: username,
messageId: chatLastMessageId || '',
sessionId: chatSessionId || ''
};
const response = await clientSchema.queries.submitPrompt(query);
return response.data;
}
async function handleSubmit(event: SyntheticEvent) {
if (inputValue !== '') {
// Display user message
const currentAreaState = chatAreaState;
currentAreaState.push(userRowComponent(inputValue));
setChatAreaState(currentAreaState);
setInputValue('');
event.preventDefault();
// Display Result
try {
setLoading(true);
const response = await submitQuery();
if (response) {
setChatLastMessageId(response?.systemMessageId || '');
setChatSessionId(response.sessionId || '');
currentAreaState.push(assistantRowComponent(response.systemMessage || ''));
setChatAreaState(currentAreaState);
}
} catch (e) {
console.error('Request failed: ', e);
currentAreaState.push(assistantRowComponent(undefined));
} finally {
setLoading(false);
setChatAreaState(currentAreaState);
if (inputRef.current) {
inputRef.current.focus();
}
}
}
}
// Chat area with the conversation history, the prompt input and the submit button
  return (
    <Wrapper>
      <ChatbotArea>
        <ChatbotTextArea>{chatAreaState}</ChatbotTextArea>
        <InputAreaDiv>
          <StyledInput ref={inputRef} value={inputValue} onChange={handleChange} />
          <ButtonDiv>
            {loading ? (
              <LoadingDiv>
                <CircularProgress />
              </LoadingDiv>
            ) : (
              <StyledButton onClick={event => handleSubmit(event)}>Submit</StyledButton>
            )}
          </ButtonDiv>
        </InputAreaDiv>
      </ChatbotArea>
    </Wrapper>
  );
};
export default Chatbot;
Testing the Virtual Assistant
After everything is built and deployed, we can finally test the application. Since our Data Source is limited, you will need questions relevant to its content. Here are some examples to start with, but feel free to explore the AWS whitepaper documents and put more questions to the virtual assistant.
- What is the AWS Well-Architected Review framework document about?
- What are the best practices of the AWS Well-Architected Review?
Conclusion
As you can see, in a few steps we were able to create a full-stack experience with an AI virtual assistant backed by a custom data source, while leveraging the AWS Amplify Gen 2 benefits of easy deployment and validation.
The rest of the application's code is split into different components to provide a better UI experience; I decided not to post all of it here, otherwise this post would be much longer. But you can check out the whole code in its GitHub repo at: