Automating DMS Endpoint S3 Target with a CloudFormation Custom Resource

BLOG ARTICLE

Whether you are establishing an AWS Data Lake or migrating services to the cloud, the AWS Database Migration Service (DMS) should be a tool in your toolkit. DMS is a relatively straightforward service to configure and is supported by AWS CloudFormation; however, there is a specific pattern that requires a customised process. Enter CloudFormation Lambda-backed Custom Resources, which we can leverage to automate these processes and continue to reap the benefits of deploying resources via CloudFormation.

Before we dive into the solution, let’s quickly cover CloudFormation Custom Resources and DMS.

About Custom Resources

Custom resources are a CloudFormation resource type that enables us to write our own provisioning logic. Lambda-backed custom resources are associated with a Lambda function, which CloudFormation invokes whenever the custom resource is created, updated or deleted. CloudFormation passes all the request information configured in the custom resource's properties block to the function, enabling the automation of actions that CloudFormation does not natively support.

Find more on AWS Lambda-backed Custom Resources here.

About DMS

AWS Database Migration Service (DMS) provides the ability to migrate databases to AWS quickly and securely. DMS connects to your source database systems to continuously replicate your data to target datastores while your source systems continue to function. DMS supports both homogeneous migrations, such as Oracle to Oracle, and heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora. Additionally, DMS supports the streaming of data into Amazon S3 and the petabyte-scale data warehouse service, Amazon Redshift.

For more on AWS DMS, see here.

Using DMS

At a high level, Data Lake solutions consist of three key components: an Ingestion layer, a Data Lake layer, and an Access (or Data Lake consumption) layer. The Ingestion layer consists of the services whose primary purpose is source system integration. These services facilitate data ingestion and delivery of source aligned data to the Landing or Raw datastores in the Data Lake layer. The Data Lake layer services are responsible for the load and transform of data into curated or staged sets of source aligned data. An example is using AWS Glue crawlers and transforms to convert datasets from CSV to Parquet format, providing greater compression, better performance and reduced consumption costs when using a service like Amazon Athena. The Access layer includes services used to consume source aligned data for exploration, analytics and modelling activities.

In the context of a DaaS platform, the AWS Database Migration Service (DMS) can be used in the Ingestion layer to connect to on-premises source database systems and deliver bulk file loads and/or change data capture (CDC) payloads to an S3 Landing bucket in the Data Lake layer. The DMS service consists of DMS instances, source and target endpoints, and DMS tasks which are responsible for the ingestion activity. Tasks group instances and endpoints to define the data ingestion path between source and destination endpoints, and the datasets or tables to be consumed.

Challenge with DMS

While most of the DMS service is fully supported by CloudFormation, it has one key restriction that can cause high operator fatigue. CloudFormation natively supports DMS S3 target endpoints where the S3 buckets exist in the same account as the configured DMS service. This is not the case where the target endpoint is a cross account S3 bucket. At the time of writing, cross account access must be requested by raising an AWS support case specifying the DMS and target accounts to be whitelisted. Once whitelisted, the following steps must be followed in order for each target endpoint created.

 

  1. Create an S3 bucket in the Data Lake (target) Account.
  2. Create the following IAM policy in the Data Lake Account specifying the bucket name from step 1 above.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:PutObjectTagging"
      ],
      "Resource": [
        "arn:aws:s3:::<bucketname>*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<bucketname>*"
      ]
    }
  ]
}
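When this procedure is automated later in the post, the same policy document can be generated in code rather than typed out by hand. A minimal sketch in Python (the helper name is my own):

```python
import json

def build_s3_target_policy(bucket_name):
    """Build the IAM policy document DMS needs to write to the target bucket."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:DeleteObject", "s3:PutObjectTagging"],
                "Resource": [f"arn:aws:s3:::{bucket_name}*"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}*"],
            },
        ],
    })
```

The returned JSON string can be passed straight to iam.create_policy as the PolicyDocument argument.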

  3. Create an IAM Role in the Data Lake account, assign it the policy created in step 2 and configure it with a trust relationship for the DMS service. The IAM role's trust policy should look like the example below.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "dms.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

  4. Run the aws dms create-endpoint CLI command in the DMS account, specifying the role in the Data Lake account, similar to the example below:

aws dms create-endpoint --endpoint-identifier <endpoint-identifier> \
--endpoint-type target \
--engine-name s3 \
--s3-settings ServiceAccessRoleArn=arn:aws:iam::<DataLakeAccountId>:role/<iamRoleDataLakeAccount>,BucketName=<bucketname>,CompressionType=NONE
  5. This command will provide an external ID in the response, similar to the following.

"ExternalId": "5f36b878-4c4c-4f15-94dc-ee2a92332570"

  6. Modify the IAM Role trust relationship in the Data Lake account by adding the external ID from the previous command's response. Your IAM role's trust policy should now be similar to the one below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "dms.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "5f36b878-4c4c-4f15-94dc-ee2a92332570"
        }
      }
    }
  ]
}

  7. Rerun the aws dms create-endpoint CLI command in the DMS account, which should now successfully create the target endpoint pointing to the Data Lake S3 bucket created in step 1.
  8. Test endpoint connectivity via the DMS console in the DMS account and create your task on top of the target endpoint.

 

Note: The same bucket may be used for multiple target endpoints; however, each endpoint requires its own IAM role in the Data Lake account. This process assumes that the Data Lake S3 buckets have been deployed and KMS keys configured.
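Scripted, steps 4-7 reduce to a small create, update and retry flow. A hedged sketch (the function names are my own, and the boto3 create_endpoint and update_assume_role_policy calls are wrapped as callables so the flow can be shown independently of AWS):

```python
import json

def trust_policy(external_id=None):
    """Trust policy for the Data Lake role; the ExternalId condition is
    added once the first create-endpoint attempt returns it (steps 3 and 6)."""
    statement = {
        "Effect": "Allow",
        "Principal": {"Service": "dms.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}

def retry_with_external_id(create_endpoint, update_trust_policy):
    """Steps 4-7 as a flow: attempt the create, feed the returned ExternalId
    back into the role's trust policy, then create again."""
    first_attempt = create_endpoint()
    external_id = first_attempt["ExternalId"]  # location as shown in step 5 above
    update_trust_policy(json.dumps(trust_policy(external_id)))
    return create_endpoint()
```

In practice, create_endpoint would wrap dms.create_endpoint and update_trust_policy would wrap iam.update_assume_role_policy against the Data Lake account.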

While creating a small number of endpoints via this manual procedure will not pose a significant operational burden, scaling this process across a dozen endpoints and multiple environments introduces challenges. These include compounding administrative overhead, susceptibility to manual configuration errors, barriers to quick environment deployments and refreshes, and a hindered ability to automate deployment from Development through to Production environments. Concretely, the requirement to create a dozen target endpoints across Dev, Test and Production would mean manually running the procedure above 36 times.

Solved through a Custom Resource

Luckily, CloudFormation supports custom resources for exactly this scenario. In this case we can develop a custom resource to automate the creation of the cross account S3 target endpoint. At a high level, we define a serverless function in CloudFormation (in this case we have chosen a Lambda-backed function), together with the role and policies that the function will use to run the custom resource Lambda. We then write the custom resource function itself, which is referenced in the CloudFormation serverless resource block and which I have written in Python.

Once these two components are in place, we need to run the CloudFormation package command to upload the custom resource code to an S3 bucket and prepare the CloudFormation template to create and invoke the custom resource.

aws cloudformation package --template-file <template-file> --s3-bucket <bucket-name>

During stack creation, the custom Lambda function is created and will be invoked every time it is referenced in the CloudFormation template. A side benefit of using Lambda-backed custom resources is the ability to take advantage of Lambda monitoring and logging via CloudWatch, which comes in handy when troubleshooting your custom resource.

CloudFormation Template

The CloudFormation snippet below defines the role and attached policy which will be assumed by the Lambda function, the AWS::Serverless::Function resource "CreateDMSS3Endpoint", and a DMS target endpoint pointing to a cross account S3 bucket.

A few things to highlight in the CloudFormation template:

  • The role policy permissions include the ARNs (Amazon Resource Names) of the cross account roles which will be assumed to create resources in the Data Lake account. In this situation I have chosen to assume a role in both accounts to scope down the Lambda function's role.
  • The serverless resource block references the location of the Lambda function code and names the custom resource handler in the Handler property.
  • The DMS target endpoint resource "S3Endpoint1" has a type of Custom::CreateDMSS3Endpoint, which includes the serverless function resource name.
  • The custom resource type's ServiceToken property references the ARN of the serverless resource.
  • The KMS key ID associated with the Data Lake S3 bucket is passed in as the value of the endpoint's S3 settings property "SSEKMSKEYID".

#Define the role and role policy to be used by the custom resource's Lambda function. This role will assume existing roles in the Data Lake and DMS accounts to create the required resources. The assume role permissions below specify six roles, corresponding to two accounts across three environments in this case.

 

CreateDMSEndpointRole:
   Type: AWS::IAM::Role
   Properties:
     AssumeRolePolicyDocument:
       Version: 2012-10-17
       Statement:
         - Effect: Allow
           Principal:
             Service:
               - lambda.amazonaws.com
           Action:
             - sts:AssumeRole
     Path: /
     Policies:
       - PolicyName: root
         PolicyDocument:
           Version: 2012-10-17
           Statement:
             - Effect: Allow
               Action:
                 - logs:CreateLogGroup
                 - logs:CreateLogStream
                 - logs:PutLogEvents
               Resource: 'arn:aws:logs:*:*:*'
             - Effect: Allow
               Action:
                 - iam:PassRole
                 - iam:GetRole
                 - iam:CreatePolicy
                 - iam:CreateRole
               Resource: '*'
             - Effect: Allow
               Action: sts:AssumeRole
               Resource:
                 - 'arn:aws:iam::<accountid>:role/<Data Lake-dev-role-to-be-assumed>'
                 - 'arn:aws:iam::<accountid>:role/<dms-dev-role-to-be-assumed>'
                 - 'arn:aws:iam::<accountid>:role/<Data Lake-test-role-to-be-assumed>'
                 - 'arn:aws:iam::<accountid>:role/<dms-test-role-to-be-assumed>'
                 - 'arn:aws:iam::<accountid>:role/<Data Lake-prod-role-to-be-assumed>'
                 - 'arn:aws:iam::<accountid>:role/<dms-prod-role-to-be-assumed>'

 

#Here we define the serverless function which points to the code for our lambda backed custom resource.

 CreateDMSS3Endpoint:

   Type: AWS::Serverless::Function

   Properties:

     Handler: create-dms-endpoint.lambda_handler

     Runtime: python3.6

     CodeUri: create-dms-endpoint/

     Timeout: 300

     Role: !GetAtt CreateDMSEndpointRole.Arn

 

#Define the parameters that will be passed to our lambda function for dms endpoint creation.

  S3Endpoint1:
   Type: Custom::CreateDMSS3Endpoint
   Properties:
     ServiceToken: !GetAtt CreateDMSS3Endpoint.Arn
     POLICYNAME: "<policy-name>"
     ROLENAME: "<role-name>"
     ENDPOINTIDENTIFIER: "<endpoint-identifier>"
     S3SETTINGS:
       BUCKETNAME: <ingestion-bucket-name>
       BUCKETFOLDER: <ingestion-bucket-folder>
       COMPRESSIONTYPE: <gzip | none>
       TIMESTAMPCOLUMNNAME: <TimeStampColumnName>
       DATAFORMAT: "parquet"
       ENCRYPTIONMODE: <sse-s3 | sse-kms>
       SSEKMSKEYID: <kms-key-arn>
     EXTRACONNECTIONATTRIBUTES: 'addColumnName=true;compressionType=NONE;csvDelimiter=,;csvRowDelimiter=\n;dataFormat=parquet;timestampColumnName=TIMESTAMP'
     ENVIRONMENT: <dev | test | prod>
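On invocation, CloudFormation delivers these properties to the Lambda function inside the event payload under ResourceProperties. A minimal sketch of pulling them out, assuming the property names used in the template above (the helper name is my own):

```python
def parse_endpoint_properties(event):
    """Extract the custom resource properties (named as in the template
    above) from the CloudFormation event payload."""
    props = event["ResourceProperties"]
    return {
        "policy_name": props["POLICYNAME"],
        "role_name": props["ROLENAME"],
        "endpoint_identifier": props["ENDPOINTIDENTIFIER"],
        "s3_settings": props["S3SETTINGS"],
        "environment": props["ENVIRONMENT"],
    }
```

Note that CloudFormation passes scalar property values through as strings, so anything numeric or boolean needs converting inside the function.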

Custom Resource Code

There are a few key components to highlight in the custom resource code.

The skeleton of the lambda_handler function, provided below, handles the three request types sent in the CloudFormation event (Create, Update and Delete), runs the respective code and returns a 'SUCCESS' response status to CloudFormation on successful execution.

import json
import requests

def lambda_handler(event, context):
  print("Received event: " + json.dumps(event, indent=2))
  responseData = {}
  <some declared variables>
  try:
      #Code to run when receiving a "Delete" request type from CloudFormation.
      if event['RequestType'] == 'Delete':
          <delete custom endpoint>
      #Code to run when receiving a "Create" request type from CloudFormation.
      elif event['RequestType'] == 'Create':
          <create cross account custom endpoint>
      #Code to run when receiving an "Update" request type from CloudFormation.
      #The modify endpoint API does not support this type of endpoint; the
      #endpoint must be deleted and then re-created.
      elif event['RequestType'] == 'Update':
          <modify cross account custom endpoint>
      responseStatus = 'SUCCESS'
  except Exception as e:
      print('Failed to process:', e)
      responseStatus = 'FAILED'
      responseData = {'Failure': 'Something bad happened.'}
  send(event, context, responseStatus, responseData)

 

def send(event, context, responseStatus, responseData, physicalResourceId=None, noEcho=False):
   responseUrl = event['ResponseURL']
   print(responseUrl)
   responseBody = {'Status': responseStatus,
                   'Reason': 'See the details in CloudWatch Log Stream: ' + context.log_stream_name,
                   'PhysicalResourceId': physicalResourceId or context.log_stream_name,
                   'StackId': event['StackId'],
                   'RequestId': event['RequestId'],
                   'LogicalResourceId': event['LogicalResourceId'],
                   'NoEcho': noEcho,
                   'Data': responseData}
   json_responseBody = json.dumps(responseBody)
   headers = {
       'content-type': '',
       'content-length': str(len(json_responseBody))
   }
   try:
       response = requests.put(responseUrl,
                               data=json_responseBody,
                               headers=headers)
       print("Status code: " + response.reason)
   except Exception as e:
       print("send(..) failed executing requests.put(..): " + str(e))

In order to assume cross account roles, the following assume_role function takes a role ARN and a session name and returns a session object, which can be used by other functions to bind the session credentials when calling AWS APIs.

import boto3
from boto3.session import Session

def assume_role(arn, session_name):
   client = boto3.client('sts')
   response = client.assume_role(RoleArn=arn, RoleSessionName=session_name)
   session = Session(aws_access_key_id=response['Credentials']['AccessKeyId'],
                     aws_secret_access_key=response['Credentials']['SecretAccessKey'],
                     aws_session_token=response['Credentials']['SessionToken'])
   return session

To ensure that the same custom resource is usable across multiple environments (Dev, Test and Prod), the CloudFormation DMS endpoint resource is configured to pass an 'ENVIRONMENT' variable to the custom resource function, which is used to determine which role to assume, as shown in the code below.

Additionally, the 'ACCOUNT' argument passed to the get_env_arn function is supplied by the calling function.

def get_env_arn(ENVIRONMENT, ACCOUNT):
  if ENVIRONMENT == "dev":
      if ACCOUNT == "dlake":
          arn = "arn:aws:iam::<accountid>:role/<Data Lake-dev-role-to-be-assumed>" #dlake_dev
          return arn
      else:
          arn = "arn:aws:iam::<accountid>:role/<dms-dev-role-to-be-assumed>" #dmsingest_dev
          return arn
  elif ENVIRONMENT == "test":
      if ACCOUNT == "dlake":
          arn = "arn:aws:iam::<accountid>:role/<Data Lake-test-role-to-be-assumed>" #dlake_test
          return arn
      else:
          arn = "arn:aws:iam::<accountid>:role/<dms-test-role-to-be-assumed>" #dmsingest_test
          return arn
  elif ENVIRONMENT == "prod":
      if ACCOUNT == "dlake":
          arn = "arn:aws:iam::<accountid>:role/<Data Lake-prod-role-to-be-assumed>" #dlake_prod
          return arn
      else:
          arn = "arn:aws:iam::<accountid>:role/<dms-prod-role-to-be-assumed>" #dmsingest_prod
          return arn
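The same environment-to-role mapping can be expressed more compactly as a dictionary lookup. A sketch (the account IDs and role names are illustrative placeholders, not values from this post):

```python
# Role ARNs keyed by (environment, account) - placeholder values only.
ROLE_ARNS = {
    ("dev", "dlake"): "arn:aws:iam::111111111111:role/dlake-dev-role",
    ("dev", "dms"): "arn:aws:iam::222222222222:role/dms-dev-role",
    ("test", "dlake"): "arn:aws:iam::111111111111:role/dlake-test-role",
    ("test", "dms"): "arn:aws:iam::222222222222:role/dms-test-role",
    ("prod", "dlake"): "arn:aws:iam::111111111111:role/dlake-prod-role",
    ("prod", "dms"): "arn:aws:iam::222222222222:role/dms-prod-role",
}

def get_env_arn(environment, account):
    """Return the role ARN for an environment/account pair; as in the
    original function, any account other than 'dlake' maps to the DMS role."""
    key = (environment, account if account == "dlake" else "dms")
    return ROLE_ARNS[key]
```

Adding a new environment then becomes two dictionary entries rather than another if/elif branch.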

Putting this all together, here is an example of a function which calls the 'get_env_arn' and 'assume_role' functions to create an IAM role in the Data Lake account.

The 'create_iam_role' function takes three arguments and invokes the get_env_arn function, passing the 'ENVIRONMENT' variable value and an 'ACCOUNT' value of "dlake", to return the respective assume-role ARN. The returned ARN is then passed into the 'assume_role' function, which returns temporary credentials in an object named "session". That session is used to run the subsequent IAM actions with the assumed credentials and create an IAM role in the Data Lake account. Finally, 'create_iam_role' returns the newly created IAM role's ARN to the calling function.

def create_iam_role(ROLENAME, ASSUMEROLEPOLICYDOCUMENT, ENVIRONMENT):
   print("Creating IAM Role")
   arn = get_env_arn(ENVIRONMENT, "dlake")
   session_name = ENVIRONMENT + "_dlake_deploy"
   session = assume_role(arn, session_name)
   iam = session.client('iam')
   create_role = iam.create_role(RoleName=ROLENAME, AssumeRolePolicyDocument=ASSUMEROLEPOLICYDOCUMENT)
   IamRoleArn = create_role['Role']['Arn']
   return IamRoleArn

In this way, we can define a number of functions to create, modify and delete resources, specifying the environment and account type to assume the appropriate role, allowing us to perform the required steps for creating DMS S3 target endpoints pointing to cross account S3 buckets.

Note: A gotcha to be aware of is that the DMS modify-endpoint API does not work for cross account S3 bucket targets. Our solution was to delete and re-create the endpoint whenever CloudFormation sent an "Update" request type.
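That workaround can be sketched as a small helper for the Update branch of the handler; the delete and create functions are injected so the pattern is shown independently of the AWS calls (the names are my own):

```python
def handle_update(delete_endpoint, create_endpoint, old_endpoint_arn, new_properties):
    """Emulate an Update for a cross account S3 target endpoint: since the
    DMS modify-endpoint API cannot change it, delete the old endpoint and
    create a replacement with the new properties."""
    delete_endpoint(old_endpoint_arn)
    return create_endpoint(new_properties)
```

In the real handler, delete_endpoint and create_endpoint would wrap the same boto3 DMS calls used by the Delete and Create branches.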

CONCLUSION

Creating custom resources can be time consuming and is not recommended where resource creation is natively supported by AWS CloudFormation, or in situations where the effort outweighs the benefit. When weighing up the benefits, it is important to understand that custom resources enable cloud engineers to automate processes, removing significant administrative burden, promoting consistency across environments and ultimately improving the availability and stability of customer services.

In our scenario we created a custom resource to automate a cross account multi-step process, reducing DMS endpoint creation time from hours to minutes. This enabled us to continue deploying DMS Endpoints via CloudFormation, which in turn we incorporated into a CI/CD deployment pipeline for controlled release across Development, Test and Production environments.