The AWS (Amazon Web Services) Landing Zone Accelerator (LZA) – Part 2

This post is part of a multi-part series on the AWS Landing Zone Accelerator (LZA) solution and my experience deploying it at scale: what is possible, what is not, and the benefits and challenges of using it. In part 1 of the series, we covered:

  • An overview of what LZA is 
  • Benefits of using LZA 
  • Where LZA Makes Sense to Deploy 
  • How much does LZA cost to implement
 

The source code can be found here:  

https://github.com/awslabs/landing-zone-accelerator-on-aws 

The LZA solution is a fully customisable and comprehensive platform solution built using the AWS Cloud Development Kit (CDK), and it extends an existing AWS Control Tower deployment. On its own, AWS Control Tower is a great initial way of establishing a multi-account AWS Organizations structure, complete with consolidated billing (taking advantage of economies of scale for service usage charges) and a foundational IAM (Identity and Access Management) Identity Center (formerly AWS Single Sign-On, SSO) deployment. The LZA extends Control Tower and builds a compliant, Well-Architected platform as a foundational Landing Zone.

Deployment Time

The initial deployment takes several hours, as it includes a Control Tower deployment as a prerequisite, which alone takes a couple of hours to complete. Deployment of the foundational pipeline, code repository, roles, policies, and custom resources takes a further 1-2 hours.

The pipeline takes around 50 mins per run, regardless of the size of the change. As more resources are added and the configs grow, the runtime increases in proportion to the number of resources the pipeline needs to synthesise and evaluate. In a complex deployment with more than 8 workload accounts, VPC’s, a centralised egress network VPC and a complex Transit Gateway configuration, the pipeline runtime extended to around 1 hr 30 mins. Once the platform matures and stabilises, the pipeline needs to run less often, making runtime less of an issue.

Differentiating LZA From Other Solutions

Diving into LZA, I was initially impressed by the level of detail given to sample code and configs, and by the general structure of the repository, which makes navigation easy. It was immediately obvious this was a powerful solution that makes usually complex tasks super easy through extensive abstraction and heavy use of AWS Organizations and Resource Access Manager (RAM) to share resources across multi-account patterns. A link to the example code samples (broken down by industry) can be found below:

https://github.com/awslabs/landing-zone-accelerator-on-aws/tree/main/reference/sample-configurations 

The configuration management is simplified through a set of six primary config files written in YAML. There is a natural order of deployment that makes sense as highlighted below: 

| Config File | Description |
| --- | --- |
| organization-config | Defines Organizational Units (OU’s), Service Control Policies (SCP’s), tagging and backup policies |
| accounts-config | Defines accounts, root accounts and assigns accounts to OU’s |
| iam-config | Defines policies and permissions |
| security-config | Defines security services and configurations of security services |
| network-config | Defines all network components (routing, DNS, query, and flow logs) |
| global-config | Defines logs, tags, budgets, reports, and log retention policies |
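As a quick pre-flight check before pushing changes to the config repository, a sketch like the one below can confirm that the six primary files are present. The `.yaml` file names are assumed to match those used in the LZA sample configurations (note the singular `organization-config.yaml`); the function name and the idea of pointing it at a local checkout are illustrative assumptions, not part of LZA itself.

```python
from pathlib import Path

# The six primary LZA config files, in the order described above.
# Assumption: file names match the sample configurations in the LZA repository.
PRIMARY_CONFIG_FILES = [
    "organization-config.yaml",
    "accounts-config.yaml",
    "iam-config.yaml",
    "security-config.yaml",
    "network-config.yaml",
    "global-config.yaml",
]


def missing_config_files(config_dir):
    """Return the primary config files not found in config_dir, in table order."""
    root = Path(config_dir)
    return [name for name in PRIMARY_CONFIG_FILES if not (root / name).is_file()]
```

Calling `missing_config_files("path/to/config-repo")` returns an empty list when all six files exist, which makes it easy to wire into a pre-commit hook or a CI step.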

The LZA Does Not Do Everything

When I got right into the detail, I found myself needing to carry out some manual tasks not covered by the example configurations. Initially I thought tasks such as DNS query logs, VPC flow logs and Route53 DNS resolver rules were just a bit of a blind spot. I then stumbled on the “all-enabled” configuration samples and was simply blown away at just how much was possible. From AWS DNS Firewall Rule Groups to DNS Firewall Rulesets and more, there is a lot the LZA does and does well. I found the examples I needed for Route53 Private Hosted Zone sharing, DNS Query Logs and VPC Flow Logs, in addition to the managed firewall rules I had been painstakingly adding to each VPC association. The “all-enabled” sample for each configuration file can be found here:

https://github.com/awslabs/landing-zone-accelerator-on-aws/tree/main/source/packages/%40aws-accelerator/accelerator/test/configs/all-enabled 

This is where you will find examples for SSM documents, backup policies, advanced networking, deploying custom CloudFormation templates and more. For items and services not natively supported by the LZA solution, it also supports custom CloudFormation templates declared inside a “customizations-config.yaml” file. An example is here: https://github.com/awslabs/landing-zone-accelerator-on-aws/blob/main/source/packages/%40aws-accelerator/accelerator/test/configs/all-enabled/customizations-config.yaml

In our recent customer use case – this was used to deploy CDK bootstrap templates for workload accounts as well as roles and certificates for IAM Roles Anywhere integration with Azure DevOps. This allowed for Azure DevOps to be used as CDK deployment pipelines for deploying workload resources into workload accounts using temporary credentials – preventing long-lived access and secret keys from being stored outside of AWS.  

This thing is a beast.

Limitations of LZA

The same limitations of CloudFormation apply to the deployment of resources. CloudFormation’s logic is to “build before destroy”. Some resources, such as transit gateway attachments, have a 1:1 relationship, so to update them you must first delete the existing resource. In practice you need to “destroy before build”: remove the existing resources in one pipeline release, then build the new connections in a second release. CloudFormation rules also still apply to rollbacks: you will need to resolve any rollback-failed states before re-running the pipeline or retrying failed components in CodePipeline.
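The two-release pattern can be sketched as a small planning helper. This is an illustration only, under the assumption that each resource is represented as a name mapped to its settings; the function name and the data shape are hypothetical and are not LZA config syntax.

```python
def plan_releases(current, desired):
    """Split a change to 1:1 resources into two pipeline releases.

    current and desired map a resource name (for example a Transit Gateway
    attachment) to its settings. Resources whose settings changed are removed
    in release 1 and re-created with their new settings in release 2.
    """
    changed = {
        name
        for name in current
        if name in desired and current[name] != desired[name]
    }
    # Release 1: keep only resources that stay exactly as they are.
    # Changed resources, and resources dropped from desired, are removed here.
    release_1 = {
        name: settings
        for name, settings in current.items()
        if name in desired and name not in changed
    }
    # Release 2: the full desired state, which re-creates the changed
    # resources and adds any brand-new ones.
    release_2 = dict(desired)
    return release_1, release_2
```

With the run times quoted earlier (around 50 mins, up to around 1 hr 30 mins in a complex deployment), a two-release change like this costs roughly 1 hr 40 mins to 3 hrs of pipeline time, so it is worth batching such changes together.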

A couple of key services cannot be deployed through LZA, including Amazon Inspector, AWS Config Conformance Packs, creation of Route53 Private Hosted Zones, and the Route53 Private Hosted Zone sharing invitations used with centralised Route53 DNS inbound and outbound resolvers. The role of Amazon Inspector is to look for vulnerabilities in running workloads, reported as Common Vulnerabilities and Exposures (CVE’s), with findings sent into Security Hub. This is hardly a chore, however, as Inspector can be enabled as a once-off setup through the AWS Console in the Management account, where you can turn it on for all new accounts created under AWS Organizations (automate where it makes sense to).

For Route53 Private Hosted Zones, I just used the AWS Console to create them. To enable Private Hosted Zone sharing, I had to create and accept the invitations manually using the AWS CLI as discussed here:

https://docs.aws.amazon.com/cli/latest/reference/route53/create-vpc-association-authorization.html 
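For anyone who would rather script this step than run the CLI by hand, the same flow can be sketched with the equivalent boto3 Route 53 calls. This swaps the CLI for boto3; the function name is hypothetical, and the two clients are assumed to be Route 53 clients created with credentials for the account that owns the hosted zone and the account that owns the VPC respectively.

```python
def share_private_hosted_zone(zone_owner_route53, vpc_owner_route53,
                              hosted_zone_id, vpc_id, vpc_region):
    """Authorise and associate a VPC in another account with a Private Hosted Zone."""
    vpc = {"VPCRegion": vpc_region, "VPCId": vpc_id}

    # Step 1 (zone owner account): authorise the VPC to be associated.
    zone_owner_route53.create_vpc_association_authorization(
        HostedZoneId=hosted_zone_id, VPC=vpc
    )
    # Step 2 (VPC owner account): accept by associating the VPC with the zone.
    vpc_owner_route53.associate_vpc_with_hosted_zone(
        HostedZoneId=hosted_zone_id, VPC=vpc
    )
    # Step 3 (zone owner account): the authorisation is no longer needed once
    # the association exists, so remove it.
    zone_owner_route53.delete_vpc_association_authorization(
        HostedZoneId=hosted_zone_id, VPC=vpc
    )
```

Each client would typically come from something like `boto3.Session(profile_name=...).client("route53")`, with one profile per account. Passing the clients in keeps the function easy to exercise with stubs before pointing it at real accounts.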

The DNS Rules association to the VPC was handled by the LZA network-config. This made the whole thing semi-automated but super powerful given I was able to declare all outbound resolver rules as a block of code under each VPC configuration block – it was the same syntax, so it was a copy-paste job in the config code. Simples! 

The LZA does not (as of 18th April 2024) support the use of existing SSM Automation Documents for AWS Config Remediation Rules, and requires LZA users to create their own. This is due to a limitation of the expression applied to the document name pattern. The following pull request has been merged and will be released in the next LZA version (v1.7.0):

https://github.com/awslabs/landing-zone-accelerator-on-aws/pull/425 

Services That Can Be Configured With LZA

With LZA, you can enable a central Security Hub into a central Audit account for logs, compliance, and trail consolidation. In addition, LZA deploys and configures services such as:  

| Configurable Item | Description |
| --- | --- |
| KMS Keys | Used by encryption services |
| Macie | Used to discover Personally Identifiable Information (PII) in storage services such as S3 |
| GuardDuty | Used for anomaly detection across AWS accounts |
| AWS Config Rules | AWS Config is used to enable compliance rules against a set of enabled standards (which can be set by LZA) and reports findings into Security Hub |
| AWS Config Remediation Rules | Remediation rules enforce Config rules to ensure compliance; they are applied using Systems Manager Automation Documents via a Lambda function invoked by AWS Config rule alerts through EventBridge |
| Amazon Detective | Used for security investigations to determine who did what, and when |
| SSM Automation Documents | A set of automation documents, each describing an AWS API call that carries out a task on a user’s behalf without the user needing access to carry out the task |
| Audit Manager | Builds audit reports for security auditing purposes |
| SNS Topics and Subscriptions | Simple Notification Service alerts subscribers to new events captured under topics |
| IAM Access Analyzer | A visual tool to determine what a role or policy could carry out |
| IAM Password Policies | Used to enforce password complexity and rotation policies against IAM users |
| CloudWatch Log Groups, Metrics, Alarms | A collection of logs and log groups, metrics for insights and auto-scaling, and alarms used to trigger scaling events |
| CloudTrail | Used to find out who carried out an AWS API call and what was done |
| Log Retention Policies | Important for compliance purposes where evidence must be stored, or where you want records to expire after their useful life to save on storage costs |
| Resource Policies | Used for checking, enforcing and remediating resource policies such as permissions |
| SCP’s | Service Control Policies provide a list of approved services and their limits, which can be applied to an individual account or an Organizational Unit (OU) |
| Central Logging S3 Buckets | Used for centrally locating audit logs for security investigations and auditing purposes |
| Trusted Advisor | Used to provide AWS guidance on the usage of services and resources, using historic metric and log data to determine advice |
| Cost and Usage Reports | Used in financial reporting for cost allocation and visibility |
| Budgets | Used to provide alerts when cost budget limits are breached or in jeopardy of being breached |
| SSM Parameters | Used by applications for configuration and environment parameters |
| SSM Inventory | Enables visibility into packages and software running on containers, Lambda functions and EC2 instances running the Systems Manager Agent |
| Policy Sets, Role Sets and Permission Sets | Used by IAM Identity Center for determining the permissions a user has when accessing accounts and services |
| IAM Users and Secrets Sets | Used in organisations where other options like federated access and/or existing directory services do not exist or are not yet integrated with AWS |
| Managed AD (Active Directory) | A fully managed Microsoft Active Directory service managed through AWS and LZA |
| Transit Gateways | Used to allow transitive access between VPC’s and on-premises networks; acts as a centralised cloud routing virtual device in each AWS region |
| Direct Connect Gateways | Used to consolidate multiple Direct Connect connections as transit Virtual Interfaces (VIF’s) and to allow connection to a Transit Gateway |
| Transit VIF’s | Describe a connection between a Direct Connect Gateway and an on-premises network over a dedicated connection instead of the Internet |
| VPN’s | Used to connect remote locations to the AWS cloud using encrypted tunnels over the Internet |
| Virtual Private Gateways | Otherwise known as a VGW; used as an aggregation point for direct connectivity between a VPC and external networks |
| VPC’s | A Virtual Private Cloud is a network CIDR range used to describe a network segment |
| Subnets | Subnets are deployed into VPC’s as a subnet of the VPC CIDR range |
| Transit Gateway Attachments | Used to attach a VPC to a transit gateway; used in place of a VGW |
| Internet Gateways | Used to provide Internet access into or from resources residing within a VPC |
| NAT Gateways | Used for outbound connectivity to the Internet, converting private IP addresses in a VPC subnet to a public IP address routable on the Internet |
| VPC Gateway and Interface Endpoints | VPC endpoints are used to provide direct access to AWS services, using the AWS backbone instead of the Internet to reach service API endpoints |
| AWS Firewalls, Policies and Firewall Rules (Network and DNS) | AWS Firewall is a managed application and network firewall service with policies and rules that can be managed through IaC (Infrastructure as Code) using AWS API’s |
| AWS Firewall Manager | A premium service used to manage many AWS Firewalls and their rules, and to provide insights into how rules are being used |
| Route53 DNS Inbound and Outbound Resolvers | Used as IP targets to route DNS queries between AWS and on-premises resources; outbound for AWS to on-premises and inbound for on-premises to AWS |
| Route53 DNS Firewall Rules | Used for internal rule lookups for private internal domain names and for external DNS queries to the Internet |
| NACL’s | Stateless network firewall applied to an entire subnet; can contain deny rules, with an implicit deny rule at the end |
| Security Groups | A stateful network firewall applied to individual network interfaces attached to a subnet; supports only permit rules, with an implicit deny at the end |
| Transit Gateway and VPC Route Tables | Route tables determine the “next hop” for network traffic and so influence the traffic path |
| DNS Query Logs | Used to show what resources are querying for DNS |
| VPC Flow Logs | Used to show where traffic in a VPC is flowing for troubleshooting |
| DHCP Options | Used to determine and influence DNS servers, domain names, or Network Time Protocol (NTP) servers used by the devices in your VPC |
| IPAM Pools | Allows for IP Address Management using the concept of IP address pools and CIDR notation (/xx) |
| Gateway Load Balancers | Used where there is a need to use third-party appliances for network traffic inspection (AWS Network Firewall deploys and manages this on your behalf) |
| Prefix Lists | Used to define lists of network IP addresses and/or subnets for re-use where the same access patterns would otherwise need duplicated configuration; a change to the prefix list applies everywhere it is already used, which eases change management |
| Endpoint Policies | Used to determine what resources can consume a given service through an endpoint |
| VPC Peering | Used to join VPC’s directly together; however, peering is non-transitive (which is where Transit Gateway fits in) |
| Transit Gateway Peering | Used to join transit gateways across AWS regions |
| Customer Gateways | Used to predefine the configuration of on-premises devices and used with VGW connectivity for Direct Connect and VPN’s |
| Certificates with AWS Certificate Manager | Used for hosting privately signed certificates to allow services communicating with TLS to trust a certificate chain for authentication prior to authorisation occurring |

Recommendations

As with any development process, it is super important to be able to test and evaluate configurations ahead of deployment into production. To assist with this, it is recommended to create a development LZA deployment using a dedicated set of accounts, and to use these test accounts to evaluate the impact of applying SCP’s, roles, and policies. Most configuration can be fully evaluated this way; network configuration, however, is difficult to test due to physical network dependencies and the potential for overlapping CIDR ranges.
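Overlapping CIDR ranges, at least, can be caught before anything is deployed. The sketch below uses Python's standard `ipaddress` module to flag overlaps across a set of planned ranges; the function name, the network names and the example ranges are made up for illustration, and it assumes all ranges are the same IP version.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs_by_name):
    """Return sorted pairs of network names whose CIDR ranges overlap."""
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in cidrs_by_name.items()
    }
    return sorted(
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    )


# Hypothetical ranges: the on-premises range sits inside the dev VPC range.
planned = {
    "dev-vpc": "10.10.0.0/16",
    "egress-vpc": "10.20.0.0/16",
    "on-prem": "10.10.128.0/17",
}
print(find_overlaps(planned))  # [('dev-vpc', 'on-prem')]
```

Running a check like this against the ranges declared for the test accounts and the production accounts gives an early warning before the two environments ever share a Transit Gateway or on-premises link.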

For network configuration – begin configuring any central Internet egress VPC components first to evaluate configuration logic and scale, before configuring development environments and expanding to any UAT, staging, and production environments. When implementing small-scale or decentralised Internet egress patterns, start with development environments and expand from there to grow confidence with processes, configurations, and release pipeline. 

Spend some time evaluating your specific requirements and the natural deployment dependencies for your deployment. Reviewing what is possible through the “all-enabled” config file examples goes a long way towards understanding how each task is configured. You will need to build iteratively, and it is recommended to keep configuration changes small with a view to continuous development.

Conclusion

As a cloud consultancy, Cevo works with many different industry verticals and organisations across the full spectrum of their cloud adoption journey. Organisations embarking on the “Cloud 2.0” journey and using older platform accelerators will benefit from LZA in simplifying the operating model while achieving and maintaining compliance at scale.

When you consider cloud security posture through the lens of the AWS Well-Architected Framework, it is easy to see how LZA positions itself well against all six pillars. It has a deep focus on the Security pillar, but also touches on the remaining five: Operational Excellence, Reliability, Performance Efficiency, Cost Optimisation and Sustainability.

If you or your organisation are considering an uplift to an existing platform or are new to AWS, consider reaching out to Cevo or contacting me for a confidential discussion on how AWS LZA, together with Cevo’s Launch platform capability and experience, could benefit you.
