Happy New Year 2024 to everyone! At Cevo, we have been busy making a big impact in the AI space. In fact, we had the opportunity to engage with various customers who were looking to deploy an open-source vector database on AWS to power some of their critical business applications using large language models.
In this blog post, we discuss the various points considered for the implementation of the Weaviate vector database on EKS within a multi-account environment, explored during various workshops with our customers. A multi-account environment is important not only to improve governance but also to increase the security and control of resources that support any business operation. This strategy allows different internal teams to experiment, test and deploy faster, while keeping their core operations safe and available for their customers.
During our conversations, we covered the following items:
- What is Weaviate?
- Deployment Options.
- AWS Marketplace Deployment (SaaS).
- Own-Instance Deployment on AWS (PaaS).
- Multi-account deployment strategy.
1. What is Weaviate?
Weaviate is an open-source low-latency vector database with out-of-the-box support for multimodal media types (text, images, etc.). The database stores both objects and vectors, allowing for combining vector search with structured filtering and the fault tolerance of a cloud-native database. All are accessible through a wide variety of client-side programming languages.
Some key attributes of Weaviate are:
- End-to-end vector database for vector similarity search, hybrid search, and advanced filtered search.
- Optional integrations with SageMaker, Bedrock, Cohere, HuggingFace, and many others.
- Suited for vector search, retrieval augmented generation (RAG), and generative search.
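To make the idea of "combining vector search with structured filtering" concrete, here is a minimal sketch in plain Python (no Weaviate client involved). The record names and the toy 3-dimensional vectors are illustrative only; a real deployment would use Weaviate's query API and model-generated embeddings.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# As in a vector database, each record holds both the object
# (its properties) and its vector.
records = [
    {"title": "Refund policy", "category": "support", "vector": [0.9, 0.1, 0.0]},
    {"title": "Release notes", "category": "product", "vector": [0.1, 0.9, 0.0]},
    {"title": "Billing FAQ",   "category": "support", "vector": [0.8, 0.2, 0.1]},
]

def search(query_vector, top_k=2, where=None):
    # Structured filter first, then rank the survivors by vector similarity.
    candidates = [r for r in records if where is None or where(r)]
    ranked = sorted(candidates,
                    key=lambda r: cosine_similarity(query_vector, r["vector"]),
                    reverse=True)
    return [r["title"] for r in ranked[:top_k]]

# Find support documents closest to a query embedded near [1, 0, 0].
results = search([1.0, 0.0, 0.0], where=lambda r: r["category"] == "support")
print(results)  # ['Refund policy', 'Billing FAQ']
```

The same shape of query (a filter plus a nearest-vector ranking) is what Weaviate executes at scale, with the vectors produced by an embedding model rather than hand-written.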
2. Deployment Options
With AWS, developers can build complex large language model (LLM) pipelines on top of their own text databases, using state-of-the-art tools: from conversational AI to semantic search and summarisation. One of the most talked about architectures these days is RAG, which stands for retrieval-augmented generation. RAG pipelines combine the power of a generative LLM with the insights contained in your data, to create truly helpful user interfaces.
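The shape of such a pipeline can be sketched in a few lines of Python. This is a hedged, self-contained illustration: `embed` and `generate` below are deliberately crude stand-ins (a character-count "embedding" and an echo "generator"), whereas a production pipeline would call a real embedding model and an LLM (for example via Amazon Bedrock) and retrieve from a vector database such as Weaviate.

```python
def embed(text):
    # Stub embedding: character-frequency vector over a tiny alphabet.
    vocab = "abcdefghijklmnopqrstuvwxyz"
    t = text.lower()
    return [t.count(c) for c in vocab]

def similarity(a, b):
    # Raw dot product is enough for this toy ranking.
    return sum(x * y for x, y in zip(a, b))

documents = [
    "Weaviate stores objects together with their vectors.",
    "EKS is a managed Kubernetes service on AWS.",
]

def retrieve(question, top_k=1):
    # Step 1: embed the question and rank documents by similarity.
    q = embed(question)
    return sorted(documents, key=lambda d: similarity(q, embed(d)),
                  reverse=True)[:top_k]

def generate(prompt):
    # Stub generator: echoes the first retrieved context line.
    return prompt.split("\n")[1]

def answer(question):
    # Step 2: ground the LLM prompt in the retrieved context.
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

The two steps — retrieve, then generate from a grounded prompt — are the essence of RAG regardless of which models and database sit behind them.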
In the dynamic landscape of cloud computing, businesses face critical decisions when hosting their RAG system for Generative-AI applications on AWS – AWS Marketplace Software as a Service (SaaS) or Self-Hosted.
Let’s briefly define what they are:
- The AWS Marketplace solution, a digital repository of third-party software, offers a convenient gateway to a variety of applications with streamlined deployment processes to run on the AWS Cloud.
- The Own-Instance solution provides organisations with unparalleled control over their infrastructure, allowing for tailored configurations to meet specific needs.
Understanding the strengths and considerations of each approach is pivotal for businesses seeking the optimal strategy for their vector database deployment on the AWS cloud.
3. AWS Marketplace Deployment
AWS Marketplace is an online store that makes it easy for customers to find, buy, and deploy software solutions that run on Amazon Web Services (AWS). The AWS Marketplace offers a wide selection of commercial and free software, including applications, tools, and services that are pre-configured for AWS. These offerings come packaged so that a complete working solution can be deployed in one go. There is limited or no flexibility in reusing your custom deployed resources. This also means that they may not be completely compliant with the required security standards of your environment.
When evaluating the different pros and cons of this deployment option, we identified the points below:
4. Own-Instance Deployment on AWS
Below are the considerations for an own-instance deployment using your own resources, such as VPCs, security groups, IAM roles and an EKS cluster.
5. Multi-account deployment strategy
The following diagram shows a reference architecture of a target landing zone we proposed to a customer with various deployed components.
For context, we engaged with this customer to build a new landing zone following the AWS Well-Architected Framework. Below are the outlines of this architecture:
- The whole landing zone is set up using the AWS Landing Zone Accelerator (LZA).
- The Landing Zone Accelerator deploys the landing zone through a set of pipelines. In this architecture, the pipelines are run from the Pipeline OU/Account, using the external pipeline deployment feature of LZA.
- The architecture includes an IPsec VPN, which may be required if separate AWS accounts need to communicate with the Weaviate database running in our environment. This can happen when Weaviate runs in the new landing zone while older applications that depend on it have not yet been migrated.
With the goal of implementing best practices in their new environment, this solution implements two instances of Weaviate via EKS running on two account groups:
- Non-Production – This account is used by data engineers, data scientists and ML (Machine Learning) engineers to perform experimentation, development, validation, and deployment where the automated unit and integration tests are run.
- Production – Fully tested and approved models from the non-production accounts are deployed to the production account for both online and batch inference.
The table below outlines our decision-making approach:
(Management Skills in the above table focuses on the blueprint or source code template of the EKS cluster.)
6. Concluding Remarks
In this blog post, we navigated the nuances of deploying Weaviate vector database on EKS within a multi-account environment. Beyond governance, this approach enhances security and control, enabling internal teams to innovate swiftly while safeguarding core operations.
Our discussions covered Weaviate fundamentals, deployment options, and the pivotal role of a multi-account strategy.
In the evolving AI space, Cevo remains committed to staying at the forefront of innovation. Here's to a year of embracing transformative methodologies and thriving in the AI realm.
Cheers to progress and possibilities in 2024!