Orchestrating Snowflake Data Transformations with DBT on Amazon ECS through Apache Airflow – Part 1
We explore how AWS CloudWatch math metrics provide a powerful way to derive insights and take automated actions based on custom calculations.
Real Time Streaming with Kafka and Flink – Introduction
This series updates a real-time analytics app from an AWS workshop that was originally based on Amazon Kinesis. Data is instead ingested from multiple sources into a Kafka cluster, and Flink (PyFlink) apps are used extensively for data ingestion and processing. As an introduction, this post compares the original architecture with the new one; the app will be implemented in subsequent posts.
Exploring The Power of Vector Databases (Part 2)
This blog explores how we can use vector databases to keep LLM knowledge up to date, minimise hallucinations and enhance user experiences.
Dynamic Table Usage in Snowflake: Implementing Type 2 Slowly Changing Dimensions (SCD) with Flexibility and Efficiency
Vector Databases: The What, The How and The Why
Kafka, Flink and DynamoDB for Real Time Fraud Detection – Part 2 Deployment via AWS Managed Flink
This series re-implements a simple fraud detection application that is discussed in an AWS workshop titled AWS Kafka and DynamoDB for real time fraud detection. Part 1 demonstrated how to develop the application locally; in this post, the app is deployed via Amazon Managed Service for Apache Flink.
The transformative power of modern data solutions
In this blog, we explore the transformative power of modern data solutions, and the benefits they can provide organisations.
Snowflake Dynamic Data Masking: Enhancing Data Security and Compliance
Building Serverless PySpark Jobs with EMR-Serverless and MWAA
In this blog, Jayaananth Jayaram highlights how PySpark jobs built with EMR Serverless and MWAA can revolutionise big data processing and analysis.
Kafka, Flink and DynamoDB for Real Time Fraud Detection – Part 1 Local Development
Apache Flink is widely used for building real-time stream processing applications. On AWS, Kinesis Data Analytics (KDA) is the easiest option for developing a Flink app, as it provides the underlying infrastructure. Re-implementing a solution from an AWS workshop, this series of posts discusses how to develop and deploy a fraud detection app using Kafka, Flink and DynamoDB. Part 1 covers local development using Docker, while deployment via KDA is discussed in part 2.