How AI Can Help You Create Actionable Items for Your Call Recordings

Puneet Punj

22 July, 2024

Recently, I had the chance to delve into Amazon Connect, AWS’s call centre solution, and its powerful AI capabilities through Contact Lens. Intrigued by the potential of AWS’s native AI/ML services, I set out to create my own solution that harnesses the power of Amazon Transcribe, Amazon Comprehend, and Amazon Bedrock.

The result? A tool that converts call recordings into actionable items, ensuring that valuable insights are not overlooked and that timely actions are taken.

Whether you are working with call centre recordings or general meeting audio, this blog will guide you through the process of transforming your raw recordings into meaningful insights that drive results. Join me as we explore how to use these AWS services to elevate your call analysis and enhance decision-making.

Solution Overview

Figure 1 depicts the architecture of the sample solution.

Let us dive into each step in detail.

Call Recording Storage

1. Storing Call Recordings in Amazon S3:

The process begins with storing call recordings in an Amazon S3 bucket. Amazon S3 provides scalable, secure and durable storage for large amounts of data, making it ideal for this purpose: it is designed for 99.999999999% durability and high availability. Amazon Connect integrates natively with S3, so if you use Connect, your call recordings can be delivered directly to a bucket without extra tooling.

Transcription with Amazon Transcribe

2. Triggering a Lambda Function:

To create an event-driven solution, an AWS Lambda function is triggered when a new call recording is uploaded to the S3 bucket. This can be achieved using S3 event notifications, which invoke the Lambda function in response to the new object creation event.
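As a minimal sketch of this step, the handler below pulls the bucket and object key out of an S3 event notification. The function names are illustrative rather than taken from the sample solution, and the handler only logs what it finds. Note that S3 URL-encodes object keys in the event payload, so they need decoding before use.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the event (for example, spaces as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    # Hypothetical entry point: log each new recording for the next step.
    found = extract_s3_objects(event)
    for bucket, key in found:
        print(f"New recording: s3://{bucket}/{key}")
    return {"processed": len(found)}
```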

3. Renaming the File:

The file must be renamed because Amazon Transcribe does not accept names containing special characters, and the file names generated by Connect may include them. The Lambda function therefore copies the recording to a new, sanitised name that is compatible with Transcribe's naming conventions.
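One way to derive a safe name is sketched below. It assumes the allowed character set is letters, digits, dots, underscores and hyphens (the set Transcribe accepts for job names), and the `renamed/` prefix is a hypothetical choice for keeping renamed files apart from the originals. S3 has no rename operation, so in practice the Lambda would copy the object to the new key and delete the old one.

```python
import re
from pathlib import PurePosixPath

# Assumption: anything outside this set is treated as a special character.
UNSAFE_CHARS = re.compile(r"[^0-9A-Za-z._-]+")


def safe_key(key, prefix="renamed/"):
    """Build a sanitised object key from the original recording key."""
    path = PurePosixPath(key)
    # Replace each run of unsafe characters with a single underscore.
    stem = UNSAFE_CHARS.sub("_", path.stem).strip("_") or "recording"
    return f"{prefix}{stem}{path.suffix.lower()}"
```

Writing the renamed file under a separate prefix (or to a separate bucket) also stops the rename function from being triggered again by its own output.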

4. Triggering Another Lambda for the Renamed File:

Once the file is renamed, another Lambda function is triggered for the newly renamed file. This Lambda function is responsible for initiating the transcription process.

5. Calling Amazon Transcribe:

This function calls Amazon Transcribe, a service that converts speech to text using advanced machine learning models. Amazon Transcribe supports multiple languages and dialects, ensuring accurate transcription of your call recordings. The Lambda function sends the audio file to Transcribe and handles the asynchronous nature of the transcription job by monitoring the job status until it completes.
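The start-and-monitor pattern described here could look roughly like the sketch below. The `client` argument is expected to behave like `boto3.client("transcribe")`; the job name, language code, polling interval and speaker settings are assumptions for illustration, not values from the sample solution.

```python
import time


def transcribe_recording(client, job_name, media_uri, output_bucket,
                         language_code="en-AU", poll_seconds=10, max_polls=60):
    """Start a transcription job and poll until it completes or fails."""
    client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        LanguageCode=language_code,
        OutputBucketName=output_bucket,
        # Speaker labels let a later step split agent and customer turns.
        Settings={"ShowSpeakerLabels": True, "MaxSpeakerLabels": 2},
    )
    for _ in range(max_polls):
        job = client.get_transcription_job(TranscriptionJobName=job_name)
        status = job["TranscriptionJob"]["TranscriptionJobStatus"]
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_name} still running after {max_polls} polls")
```

Because a Lambda invocation can run for at most 15 minutes, polling inside the function suits shorter recordings; for long calls, reacting to the transcript file landing in the output bucket may be the safer design.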

Text Storage

6. Storing Transcribed Text in S3:

The transcribed text is then stored back in another S3 bucket. This step ensures that the text is easily accessible for further processing. S3 provides efficient storage and retrieval of text data, allowing for integration with other AWS services for subsequent processing tasks.

Conversation Formatting

7. Invoking the Formatting Lambda Function:

A further Lambda function is invoked to process the transcribed text. This function separates the conversation into agent and customer parts, which makes analysis and the generation of actionable items easier. The separation is typically done using the speaker labels provided by Transcribe.

8. Processing the Conversation:

The Lambda function processes the conversation, organising it into a structured format that clearly distinguishes between the agent and customer interactions. This structured format is crucial for accurate sentiment analysis and actionable item generation.
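To illustrate the structured format, the sketch below groups the word-level items from a Transcribe result into speaker turns. It assumes each item carries a `speaker_label` field, as in diarised output, and that `spk_0` is the agent; both assumptions should be checked against your own transcripts.

```python
def format_conversation(transcript, roles=None):
    """Group Transcribe items into a list of {"speaker", "text"} turns."""
    # Assumed mapping from Transcribe speaker labels to call roles.
    roles = roles or {"spk_0": "Agent", "spk_1": "Customer"}
    turns = []
    for item in transcript["results"]["items"]:
        word = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation attaches to the previous word without a space.
            if turns:
                turns[-1]["text"] += word
            continue
        speaker = roles.get(item.get("speaker_label"), "Unknown")
        if turns and turns[-1]["speaker"] == speaker:
            turns[-1]["text"] += " " + word
        else:
            turns.append({"speaker": speaker, "text": word})
    return turns
```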

Sentiment Analysis with Amazon Comprehend

9a. Sending Formatted Conversation to Amazon Comprehend:

The formatted conversation is sent to Amazon Comprehend for sentiment analysis. Amazon Comprehend uses natural language processing (NLP) to determine the sentiment of each part of the conversation. This service can identify positive, negative, neutral, and mixed sentiments, providing insights into customer satisfaction and agent performance.

9b. Receiving Sentiment Analysis Results:

The sentiment analysis results from Amazon Comprehend are received and stored. These results offer valuable insights into the emotional tone of the conversation, which can be used to improve customer service strategies and agent training programs.
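A minimal sketch of the sentiment step follows, assuming the conversation is a list of dictionaries with `speaker` and `text` keys and that `client` behaves like `boto3.client("comprehend")`. The 5,000-byte budget is an assumed safety limit for the size of each request; confirm the current limit in the Comprehend documentation.

```python
def analyse_turns(client, turns, language_code="en", max_bytes=5000):
    """Attach a sentiment label and scores to each conversation turn."""
    results = []
    for turn in turns:
        # Truncate on a UTF-8 byte budget, dropping any split character.
        text = turn["text"].encode("utf-8")[:max_bytes].decode("utf-8", "ignore")
        response = client.detect_sentiment(Text=text, LanguageCode=language_code)
        results.append({
            **turn,
            "sentiment": response["Sentiment"],   # e.g. POSITIVE, NEGATIVE
            "scores": response["SentimentScore"],
        })
    return results
```

Scoring each turn separately, rather than the whole call at once, makes it possible to see where in the conversation the customer's tone changed.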

Actionable Items Generation with Amazon Bedrock

10. Using Amazon Bedrock for Actionable Items:

Amazon Bedrock is used to generate actionable items and a summary of the conversation. Bedrock leverages powerful AI models to identify key actions that need to be taken based on the conversation content. This involves extracting key details and recommendations from the transcribed text.

10a. Generating Summary and Actionable Items:

The AI models within Amazon Bedrock analyse the conversation to generate a concise summary and actionable items. These outputs help streamline follow-up actions and enhance the overall customer experience by addressing specific needs and issues discussed during the call.
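The Bedrock call could be sketched as follows. The prompt wording, the choice of the Titan Text Lite model ID and the generation settings are assumptions for illustration; `client` is expected to behave like `boto3.client("bedrock-runtime")`, and the request and response shapes follow the Titan text format.

```python
import json


def build_prompt(turns):
    """Turn a list of {"speaker", "text"} dicts into a summarisation prompt."""
    lines = [f'{t["speaker"]}: {t["text"]}' for t in turns]
    return (
        "Summarise the call below in three sentences, then list the "
        "follow-up actions as bullet points.\n\n" + "\n".join(lines)
    )


def summarise_call(client, turns, model_id="amazon.titan-text-lite-v1"):
    """Ask the model for a summary and action items; return the raw text."""
    body = {
        "inputText": build_prompt(turns),
        # Assumed settings: short output, low temperature for consistency.
        "textGenerationConfig": {"maxTokenCount": 512, "temperature": 0.2},
    }
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(body),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["results"][0]["outputText"]
```

Other Bedrock models expect different request bodies, so the `body` dictionary would need adapting if you swap the model.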

Output Storage

11. Storing Output in S3:

Finally, the actionable items and conversation summary are stored in an S3 bucket. This makes the output easily accessible for review and further action by your team. Storing the results in S3 ensures they are secure, durable, and readily available for integration with other systems or for manual review.

12. Optional Notification:

Optionally, the output can be sent via email or to a Slack channel based on the company’s use case. This can be achieved using AWS Lambda to integrate with Amazon Simple Email Service (SES) for emails or with Slack APIs for notifications, ensuring that the relevant stakeholders are promptly informed of the results.
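For the Slack option, a sketch using only the Python standard library is shown below. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field; the function names and message layout are hypothetical, and the webhook URL is something you would supply from your own Slack configuration.

```python
import json
import urllib.request


def build_slack_request(webhook_url, summary, actions):
    """Prepare a POST request carrying the summary and action items."""
    text = (
        "*Call summary*\n" + summary + "\n*Action items*\n"
        + "\n".join(f"- {action}" for action in actions)
    )
    return urllib.request.Request(
        webhook_url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify_slack(webhook_url, summary, actions):
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, summary, actions)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```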

Cost

This estimate is a rough calculation and can vary based on the actual usage patterns, optimisations, and specific pricing tiers for different AWS services. Additionally, free tier usage and pricing promotions could reduce these costs.

Costing is based on the Sydney region.

See the calculation details here – https://calculator.aws/#/estimate?id=af90b47a96a7eab6ce098a44882e5b6ac88260cd

Example Calculation

Assume the following scenario for one month:

100 hours of call recordings stored in S3.
Each call recording is 1 hour long (60 minutes) and generates approximately 1 GB of data.
Each hour of audio generates about 60,000 characters of transcribed text.
1,000 API requests to Amazon Bedrock for processing.

S3 Storage Costs

Data Stored: 100 hours * 1 GB = 100 GB
Storage Cost: 100 GB * $0.025/GB = $2.50 per month

Lambda Costs

Assume 10,000 Lambda requests per month (considering multiple Lambda triggers in the workflow).
Request Cost: 10,000 requests = $0.00 (within the free tier)

Amazon Transcribe Costs

6,000 minutes of standard streaming x 60 seconds per minute = 360,000.00 seconds of standard audio streaming
Tiered price for: 360,000.00 seconds of standard audio streaming
360,000 seconds of standard audio streaming x 0.0004 USD = 144.00 USD
Total tier cost = 144.00 USD for standard streaming transcription

Streaming pricing (monthly): 144.00 USD

Amazon Comprehend Costs

10,000 characters per document x 1,000 documents = 10,000,000 characters per month (asynchronous)
NLP requests are measured in units of 100 characters, with a 3-unit (300-character) minimum charge per request.
10,000,000 characters / 100 characters = 100,000 units
Tiered price for: 100,000 units
100,000 units x 0.0001 USD = 10.00 USD

Total tier cost = 10.00 USD for month

Amazon Bedrock Costs

Unit conversions
Number of Input tokens: 1000 thousand per month * 1000 multiplier = 1000000 per month
Number of output tokens: 100 thousand per month * 1000 multiplier = 100000 per month

Pricing calculations

1,000,000 input tokens / 1000 = 1,000.00 K input tokens
1,000.00 K input tokens x 0.0002 USD per K-tokens = 0.20 USD per Month for input tokens
100,000 output tokens / 1000 = 100.00 K output tokens
100.00 K output tokens x 0.00025 USD per K-tokens = 0.025 USD per Month for output tokens
0.20 USD + 0.025 USD = 0.225 USD per Month

Total on-demand cost for Titan Lite (monthly): 0.23 USD

Total Estimated Monthly Cost

S3 Storage: $2.50
Lambda: $0.00
Transcribe: $144
Comprehend: $10
Bedrock: $0.23

Total: $156.73 per month
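The arithmetic above can be reproduced with a short script, which is handy for re-running the estimate with your own volumes. The unit prices are the ones quoted in this post; treat them as a snapshot rather than current pricing. Decimal arithmetic is used so that the Bedrock figure of 0.225 USD rounds up to 0.23 USD as it does in the estimate.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def monthly_estimate():
    """Recompute the example monthly costs, rounded to the cent."""
    costs = {
        "S3 Storage": Decimal(100) * Decimal("0.025"),          # 100 GB stored
        "Lambda": Decimal("0"),                                 # within free tier
        "Transcribe": Decimal(6000 * 60) * Decimal("0.0004"),   # seconds of audio
        "Comprehend": Decimal(10_000_000) / 100 * Decimal("0.0001"),  # 100-char units
        "Bedrock": Decimal(1000) * Decimal("0.0002")            # K input tokens
                   + Decimal(100) * Decimal("0.00025"),         # K output tokens
    }
    rounded = {name: value.quantize(CENT, rounding=ROUND_HALF_UP)
               for name, value in costs.items()}
    rounded["Total"] = sum(rounded.values())
    return rounded


if __name__ == "__main__":
    for name, value in monthly_estimate().items():
        print(f"{name}: ${value}")
```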

Solution Output

A sample output file, generated by the above solution and stored in an S3 bucket, appears as shown below.

Conclusion

Integrating Amazon Transcribe, Amazon Comprehend, and Amazon Bedrock into your call centre workflow can revolutionise the way you handle customer interactions. This automated solution not only streamlines the process of transcribing and analysing call recordings but also ensures that critical insights are efficiently extracted and acted upon. By leveraging the power of AWS AI services, you can enhance both customer satisfaction and operational efficiency.

Implementing this system allows your call centre to move from a reactive to a proactive approach, continually improving based on actionable insights. The scalability and reliability of AWS services ensure that your solution grows with your needs, providing a robust foundation for long-term success. Embrace this technology to transform your call centre into a hub of actionable intelligence and stay competitive in an increasingly demanding market.