Transforming Data Engineering with DevOps on the Databricks Platform

The role of the Data Engineer is rapidly changing, from writing ETL scripts to engineering production-grade data products. On the Databricks Lakehouse Platform, this shift demands more than technical know-how; it requires a DevOps mindset. By embracing software engineering best practices, automated testing, and CI/CD pipelines, data teams can deliver scalable, reliable, and secure solutions. This blog explores how DevOps principles and tools like Git Folders and Databricks Asset Bundles are transforming data engineering into a discipline of continuous innovation and delivery.
Data Strategy Diagnostic: Building a Robust Data Strategy on AWS

Data is everywhere, but without a clear strategy it can hold your business back instead of moving it forward. In this blog, we share how the AWS Data Strategy Diagnostic helps organisations assess their data maturity, uncover gaps, and build a practical roadmap to turn data into real business value.
Prioritising Data Quality with dbt-expectations: A Practical Approach to Building Reliable Data Pipelines

Discover how dbt-expectations enhances data quality checks within dbt pipelines, ensuring reliable analytics and streamlined workflows.
Supercharge your Data Validation: Using CTEs and Pandas to seamlessly compare CSV data with Postgres RDS

Learn to use CTEs and Pandas to efficiently compare CSV data with a PostgreSQL database without loading all data into memory.
Serverless Solution for RDS Granular Database Backup – Part 1

This blog explores a serverless AWS solution designed to address the challenges of conducting efficient granular backups of RDS databases, offering insights into its architecture and benefits.
Retrieval Augmented Generation (RAG) Options in AWS

Cevo Consultant JO Reyes explores options for building Generative AI applications using Amazon Web Services (AWS).
Deploy ETL Data Pipelines in Amazon Web Services using Azure DevOps

Discover streamlined ETL pipeline deployment in AWS using Azure DevOps, ensuring reliability and efficiency for data-driven decision-making.
Multi-account Deployment of An Open-Source Vector Database on AWS

Learn how deploying Weaviate on AWS EKS across multiple accounts enhances security, control and innovation while safeguarding operations.
Deploying an Open-Source Vector Database on AWS – Part 2

In this lab, we will create a Kafka producer application using AWS Lambda, which sends fake taxi ride data into a Kafka topic on Amazon MSK. A configurable number of instances of the producer Lambda function are invoked by an Amazon EventBridge schedule rule, so test data can be generated concurrently to match the desired volume of messages.
Deploying an Open-Source Vector Database on AWS – Part 1
