Planning account migrations between AWS organisations

As more companies move to the cloud with AWS, many are finding success in adopting a multi-account architecture to house their various workloads. This article is a non-exhaustive guide to some of the steps to take before you begin your account migration journey.

Develop and Test Apache Spark Apps for EMR Remotely Using Visual Studio Code

We will discuss how to set up a remote development environment on an EMR cluster deployed in a private subnet, using a VPN and the VS Code remote SSH extension. Typical Spark development examples are illustrated while the cluster is shared among multiple users. Overall, this provides an effective way of developing Spark apps on EMR and can significantly improve the developer experience.

Manage EMR on EKS with Terraform

We’ll discuss how to provision and manage Spark jobs on EMR on EKS with Terraform. Amazon EKS Blueprints for Terraform is used to provision EKS, an EMR virtual cluster and related resources. Spark job autoscaling is managed by Karpenter, and two Spark jobs, one with and one without Dynamic Resource Allocation (DRA), are compared.

Private Host React App in AWS

This post takes you through the steps required to privately host a React application in AWS, including the architecture design, installing dependencies, a link to the Git repository, creating the React app, the Dockerfile and Nginx configuration, creating the necessary AWS services, logging in to AWS and creating the Makefile.

Revisit AWS Lambda Invoke Function Operator of Apache Airflow

We’ll discuss the limitations of the Lambda invoke function operator of Apache Airflow and create a custom Lambda operator. The custom operator extends the existing one so that it reports the invocation result of a function correctly and records the exact error message on failure.
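
As a rough illustration of the idea (not the article's actual operator), the sketch below shows how an invocation result can be checked. It relies on the fact that a Lambda invoke response can carry a successful status code while still including a `FunctionError` field, with the error details in the payload. The names `LambdaInvocationError` and `check_invoke_response` are hypothetical, and the response is assumed to have the dictionary shape returned by the boto3 Lambda `invoke` call.

```python
import io
import json


class LambdaInvocationError(Exception):
    """Hypothetical exception raised when the invoked function reports an error."""


def check_invoke_response(response):
    """Return the decoded payload, or raise if the function itself failed.

    `response` is assumed to look like the dict returned by the boto3
    Lambda client's invoke call: a readable "Payload" stream and, only
    when the function errored, a "FunctionError" key.
    """
    raw = response["Payload"].read()
    payload = json.loads(raw) if raw else None

    if "FunctionError" in response:
        # The payload of a failed invocation normally carries "errorMessage".
        if isinstance(payload, dict):
            message = payload.get("errorMessage", "unknown error")
        else:
            message = str(payload)
        raise LambdaInvocationError(f"{response['FunctionError']}: {message}")

    return payload


# Simulated responses (no AWS call is made here).
ok_response = {
    "StatusCode": 200,
    "Payload": io.BytesIO(json.dumps({"rows": 3}).encode()),
}
failed_response = {
    "StatusCode": 200,
    "FunctionError": "Unhandled",
    "Payload": io.BytesIO(
        json.dumps({"errorMessage": "division by zero"}).encode()
    ),
}
```

A custom operator could call a helper like this on the response and let the exception fail the task, so the exact error message ends up in the task log.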

Serverless Application Model (SAM) for Data Professionals

We’ll discuss how to build a serverless data processing application using the Serverless Application Model (SAM). A Lambda function is developed that is triggered whenever an object is created in an S3 bucket. Third-party packages required for data processing are made available through Lambda layers.
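
As a minimal sketch of what such a handler might look like (not the article's own code), the function below reads the bucket name and object key from each record of an S3 event notification. The helper name `extract_objects` and the sample event are assumptions for illustration; the actual data processing step is left out.

```python
from urllib.parse import unquote_plus


def extract_objects(event):
    """Collect (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Entry point; a real function would process each object here."""
    objects = extract_objects(event)
    for bucket, key in objects:
        print(f"new object: s3://{bucket}/{key}")
    return {"processed": len(objects)}


# Hypothetical sample event with a single created object.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "data/my+file.csv"},
            }
        }
    ]
}
```

Keeping the event parsing in its own function makes it easy to test locally before deploying.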