This blog is part of a multi-part series on the AWS Landing Zone Accelerator (LZA) solution and my experience deploying it at scale: what is possible, what is not, and some of the benefits and challenges of using the solution. In part 1 of the series, we covered:
- An overview of what LZA is
- Benefits of using LZA
- Where LZA Makes Sense to Deploy
- How much does LZA cost to implement
The source code can be found here:
https://github.com/awslabs/landing-zone-accelerator-on-aws
The LZA solution is a fully customisable and comprehensive platform solution built using AWS Cloud Development Kit (CDK) and is an extension to an existing AWS Control Tower deployment. The AWS Control Tower solution on its own is a great initial way of establishing a multi-account, AWS Organizations structure complete with consolidated billing (taking advantage of economies of scale for services usage charges) and a foundational IAM (Identity and Access Management) Identity Centre (formerly AWS Single Sign-On – SSO) deployment. The LZA extends Control Tower and builds a compliant, Well-Architected platform as a foundational Landing Zone.
Deployment Time
The initial deployment takes several hours because it includes a Control Tower deployment as a prerequisite, which alone takes a couple of hours to complete. Deployment of the foundational pipeline, code repository, roles, policies, and custom resources takes a further 1-2 hours.
Each pipeline run takes around 50 mins, regardless of the size of the individual change. As more resources are added and configs get larger, the pipeline runtime increases in proportion to the number of resources it needs to synthesise and evaluate. In a complex deployment with over 8 workload accounts, VPCs, a centralised egress network VPC and a complex Transit Gateway configuration, pipeline runtime extended to around 1hr 30mins. Once the platform matures and stabilises, the pipeline will need to run less often, making runtime less of an issue.
Differentiating LZA From Other Solutions
Diving into LZA I was initially impressed by the level of detail given to sample code and configs and the general structure of the repository making navigation easy. It was immediately obvious this was a powerful solution that made usually complex tasks super easy through extensive abstraction and heavy use of AWS Organizations and Resource Access Manager (RAM) to share resources across multi-account patterns easily. A link to the example code samples (broken down by industry) can be found below:
https://github.com/awslabs/landing-zone-accelerator-on-aws/tree/main/reference/sample-configurations
The configuration management is simplified through a set of six primary config files written in YAML. There is a natural order of deployment that makes sense as highlighted below:
Config File | Description |
organization-config | Defines Organizational Units (OUs), Service Control Policies (SCPs), tagging and backup policies |
accounts-config | Defines accounts, root accounts and assigns accounts to OUs |
iam-config | Defines policies and permissions |
security-config | Defines security services and configurations of security services |
network-config | Defines all network components (routing, DNS, query, and flow logs) |
global-config | Defines logs, tags, budgets, reports, and log retention policies |
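As a minimal sketch of how the six-file layout could be sanity-checked before a pipeline run, the Python below looks for each primary config file in a directory. The `.yaml` file names are assumptions based on the sample configurations in the LZA repository, and `missing_config_files` is a hypothetical helper for illustration, not part of the LZA solution.

```python
from pathlib import Path

# Assumed file names, based on the LZA sample configurations.
# Listed in the same order as the config file table.
EXPECTED_CONFIG_FILES = [
    "organization-config.yaml",
    "accounts-config.yaml",
    "iam-config.yaml",
    "security-config.yaml",
    "network-config.yaml",
    "global-config.yaml",
]


def missing_config_files(config_dir):
    """Return the expected config files that are absent from config_dir.

    The result keeps the order of EXPECTED_CONFIG_FILES, so the first
    entry is the earliest file in the deployment order that is missing.
    """
    root = Path(config_dir)
    return [name for name in EXPECTED_CONFIG_FILES if not (root / name).is_file()]
```

Calling `missing_config_files("path/to/config-repo")` returns an empty list when all six files are present, which makes it easy to drop into a pre-commit check.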
The LZA Does Not Do Everything
When I got right into the detail, I found myself needing to carry out some manual tasks not captured under example configurations. Initially I thought these tasks, such as DNS query and VPC flow logs along with Route53 DNS resolver rules, were just a bit of a blind spot. I then stumbled on the “all-enabled” configuration samples and was simply blown away at just how much was possible. From AWS DNS Firewall Rule Groups to DNS Firewall Rulesets and more, there is a lot the LZA does and does well. I found the example I needed for Route53 Private Hosted Zone sharing, DNS Query Logs and VPC Flow Logs in addition to the managed firewall rules I was painstakingly adding to each VPC association. The “all-enabled” sample for each configuration file can be found here:
It is here where you find examples for SSM documents, backup policies, advanced networking, using and deploying custom CloudFormation templates and more. For deploying items and services not natively supported by the LZA solution, it also supports the use of custom CloudFormation templates declared inside a “customizations-config.yaml” file. An example is here: https://github.com/awslabs/landing-zone-accelerator-on-aws/blob/main/source/packages/%40aws-accelerator/accelerator/test/configs/all-enabled/customizations-config.yaml
In our recent customer use case, this was used to deploy CDK bootstrap templates for workload accounts as well as roles and certificates for IAM Roles Anywhere integration with Azure DevOps. This allowed Azure DevOps to be used as the CDK deployment pipeline for deploying workload resources into workload accounts using temporary credentials, preventing long-lived access and secret keys from being stored outside of AWS.
This thing is a beast.
Limitations of LZA
The same limitations of CloudFormation apply to the deployment of resources. CloudFormation’s logic is to “build before destroy”: when a resource must be replaced, the new one is created before the old one is removed. Some resources, like transit gateway attachments, are a 1:1 relationship, so the replacement cannot be built while the original still exists. To update those resources you need to “destroy before build”: remove the existing resources in one pipeline release, then build the new connections in a second pipeline release. CloudFormation rules still apply with regard to rollbacks, so you will need to resolve any rollback-failed states before re-running the pipeline or retrying failed components in CodePipeline.
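The “destroy before build” sequence can be sketched as a two-release plan. This is only an illustration under simple assumptions: resources are modelled as plain dictionaries keyed by a name such as a VPC, not as the real LZA network-config schema, and `plan_releases` is a hypothetical helper.

```python
def plan_releases(current, desired):
    """Split a change to 1:1 resources into two pipeline releases.

    current and desired map a resource key (for example a VPC name) to
    its settings. Any key whose settings differ must be destroyed in the
    first release and rebuilt in the second.
    """
    # Keys that exist today and need different settings after the change.
    changed = {k for k in current if k in desired and current[k] != desired[k]}

    # Release one: the desired state with every changed resource left out,
    # so the pipeline deletes it. New and unchanged resources are kept.
    release_one = {k: v for k, v in desired.items() if k not in changed}

    # Release two: the full desired state, which rebuilds what was removed.
    release_two = dict(desired)
    return release_one, release_two
```

If nothing has changed in place, both releases are identical and a single pipeline run is enough.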
A couple of key services cannot be deployed through LZA, including Amazon Inspector, AWS Config Conformance Packs, creation of Route53 Private Hosted Zones and the Route53 Private Hosted Zone sharing invitations when using centralised Route53 DNS inbound and outbound resolvers. The role of Amazon Inspector is to look for vulnerabilities catalogued as Common Vulnerabilities and Exposures (CVEs) in running workloads, with findings reported into Security Hub. This is hardly a chore, however, as Inspector can be enabled as a once-off setup through the AWS Console in the Management account, where you can turn it on for all new accounts created under AWS Organizations (automate where it makes sense to).
For Route53 Private Hosted Zones, I just used the AWS Console to create them. To enable Private Hosted Zone sharing, I had to create and accept the invitations manually using the AWS CLI as discussed here:
https://docs.aws.amazon.com/cli/latest/reference/route53/create-vpc-association-authorization.html
The DNS Rules association to the VPC was handled by the LZA network-config. This made the whole thing semi-automated but super powerful given I was able to declare all outbound resolver rules as a block of code under each VPC configuration block. It was the same syntax, so it was a copy-paste job in the config code. Simples!
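The manual Private Hosted Zone sharing steps could also be scripted. The sketch below is one possible shape rather than the method used in this deployment: `share_private_hosted_zone` is a hypothetical helper, and it assumes both arguments are boto3 Route 53 clients, one created with credentials for the account that owns the hosted zone and one for the account that owns the VPC. The final clean-up call is an optional addition that is not part of the steps described above.

```python
def share_private_hosted_zone(zone_owner_client, vpc_owner_client,
                              hosted_zone_id, vpc_id, vpc_region,
                              remove_authorization=True):
    """Associate a VPC in one account with a private hosted zone in another.

    Both clients are assumed to be boto3 Route 53 clients for the
    respective accounts.
    """
    vpc = {"VPCRegion": vpc_region, "VPCId": vpc_id}

    # Step 1, zone-owning account: create the "invitation".
    zone_owner_client.create_vpc_association_authorization(
        HostedZoneId=hosted_zone_id, VPC=vpc
    )

    # Step 2, VPC-owning account: accept it by associating the VPC.
    vpc_owner_client.associate_vpc_with_hosted_zone(
        HostedZoneId=hosted_zone_id, VPC=vpc
    )

    # Step 3 (optional), zone-owning account: remove the authorisation.
    # The association created in step 2 stays in place.
    if remove_authorization:
        zone_owner_client.delete_vpc_association_authorization(
            HostedZoneId=hosted_zone_id, VPC=vpc
        )
```

Passing the clients in, rather than creating them inside the function, keeps the credential handling for each account outside the helper and makes it straightforward to test with stand-in objects.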
As of 18th April 2024, the LZA does not support the use of existing SSM Automation Documents for AWS Config Remediation Rules and requires LZA users to create their own. This is due to a limitation of the expression being applied to the document name pattern. The following pull request has been merged and is due for release in the next LZA version (v1.7.0):
https://github.com/awslabs/landing-zone-accelerator-on-aws/pull/425
Services Capable of Configuring With LZA
With LZA, you can enable Security Hub centrally in a central Audit account for logs, compliance, and trail consolidation. In addition, LZA deploys and configures services such as:
Configurable Item | Description |
KMS Keys | Used by encryption services |
Macie | Used to discover Personally Identifiable Information (PII) in storage services such as S3 |
GuardDuty | Used for anomaly detection across AWS accounts |
AWS Config Rules | AWS Config is used to enable compliance rules against a set of enabled standards (which can be set by LZA) and reports findings into Security Hub |
AWS Config Remediation Rules | Config remediation rules enforce config rules to maintain compliance. They are applied using Systems Manager Automation Documents via a Lambda function invoked by AWS Config rule alerts through EventBridge |
Amazon Detective | Used for security investigations to determine who, when and what was done as part of an investigation |
SSM Automation Documents | A set of automation documents describing an AWS API call to carry out a task on a user’s behalf without them needing access to carry out the task |
Audit Manager | Audit Manager builds audit reports for security auditing purposes |
SNS Topics and Subscriptions | Simple Notification Service alerts subscribers to new events captured under topics |
IAM Access Analyzer | Used to identify resources shared with external entities and to validate what access a role or policy grants |
IAM Password Policies | Used to enforce password complexity and rotation policies against IAM users |
CloudWatch Log Groups, Metrics, Alarms | A collection of logs and log groups, metrics for insights and auto-scaling and alarms used to trigger scaling events |
CloudTrail | Used to find out who carried out an AWS API call and what was done |
Log Retention Policies | Important for compliance purposes where evidence must be stored or where you want records to expire to save on storage costs after their useful life |
Resource Policies | Used for checking, enforcing and remediating resource policies such as permissions |
SCPs | Service Control Policies provide a list of approved services and their limits which can be applied to an individual account or an Organizational Unit (OU) |
Central Logging S3 Buckets | Used for centrally locating audit logs for security investigations and auditing purposes |
Trusted Advisor | Used to provide AWS guidance on the usage of services and resources using historic metric and log data to determine advice |
Cost and Usage Reports | Used in financial reporting for cost allocation and visibility |
Budgets | Used to provide alerts when cost budget limits are in jeopardy of being breached and/or breached |
SSM Parameters | Used by applications for configuration and environment parameters |
SSM Inventory | Enables visibility into packages and software running on containers, Lambda functions and EC2 instances running the Systems Manager Agent |
Policy Sets, Role Sets and Permission Sets | Used by IAM Identity Centre for determining permissions a user has when accessing accounts and services |
IAM Users and Secrets Sets | Used in organisations where other options like federated access and/or existing directory services do not exist or are not yet integrated with AWS |
Managed AD (Active Directory) | A fully managed Microsoft Active Directory service managed through AWS and LZA |
Transit Gateways | Used to allow transitive access between VPCs and on-premises networks; acts as a centralised cloud routing virtual device in each AWS region |
Direct Connect Gateways | Used to consolidate multiple Direct Connect connections as transit Virtual Interfaces (VIFs) and allows connection to a Transit Gateway |
Transit VIFs | Transit VIFs are used to describe a connection between a Direct Connect Gateway and an on-premises connection over a dedicated connection instead of the Internet |
VPNs | Used to connect remote locations to the AWS cloud using encrypted tunnels over the Internet |
Virtual Private Gateways | Otherwise known as a VGW; used as an aggregation point for direct connectivity between a VPC and external networks |
VPCs | A Virtual Private Cloud is a network CIDR range used to describe a network segment |
Subnets | Subnets are deployed into VPCs as a subnet of the VPC CIDR range |
Transit Gateway Attachments | Used to attach a VPC to a transit gateway; used in place of a VGW |
Internet Gateways | Used to provide Internet access into or from resources residing within a VPC |
NAT Gateways | Used for outbound connectivity to the Internet, converting private IP addresses in a VPC subnet to a public IP address routable on the Internet |
VPC Gateway and Interface Endpoints | VPC endpoints are used to provide direct access to AWS services using the AWS backbone instead of the Internet to reach service API endpoints |
AWS Firewalls, Policies and Firewall Rules (Network and DNS) | AWS Firewall is a managed service application and network firewall service with policies and rules that can be managed through IaC (Infrastructure as Code) using AWS APIs |
AWS Firewall Manager | The Firewall Manager is a premium service used to manage many AWS Firewalls, their rules and provide insights into how rules are being used |
Route53 DNS Inbound and Outbound Resolvers | Used as IP targets to route DNS queries between AWS and on-premises resources: outbound for AWS to on-premises and inbound for on-premises to AWS |
Route53 DNS Firewall Rules | Used for internal rule lookups for private internal domain names and for external DNS queries to the Internet |
NACLs | Stateless network firewall applied to an entire subnet; can contain deny rules, with an implicit deny rule at the end |
Security Groups | A stateful network firewall applied to individual network interfaces attached to a subnet; can only support permit rules, with an implicit deny at the end |
Transit Gateway and VPC Route Tables | Route tables determine where network traffic flows as the “next-hop” and influence the network traffic path |
DNS Query Logs | Used to show what resources are querying for DNS |
VPC Flow Logs | Used to show where traffic in a VPC is flowing for troubleshooting |
DHCP Options | Used to determine and influence DNS servers, domain names, or Network Time Protocol (NTP) servers used by the devices in your VPC |
IPAM Pools | Allows for IP Address Management using the concept of IP address pools and CIDR notation (/xx) |
Gateway Load Balancers | Used where there is a need to use third-party appliances for network traffic inspection (AWS Network Firewall deploys and manages this on your behalf) |
Prefix Lists | Used to define lists of network IP addresses and/or subnets for re-use where there are patterns of access with the need to duplicate configuration; easier change management by applying the same change to anywhere already using the prefix list |
Endpoint Policies | Used to determine what resources can consume a given service through an endpoint |
VPC Peering | Used to join VPCs directly together; however, it is non-transitive (which is where Transit Gateway fits in) |
Transit Gateway Peering | Used to join transit gateways across AWS regions |
Customer Gateways | Used to predefine the configuration of on-premises devices for configuration purposes and used with VGW connectivity for Direct Connect and VPNs |
Certificates with AWS Certificate Manager | Used for hosting private signed certificates to allow services communicating with TLS to trust a certificate chain for authentication prior to authorisation occurring |
Recommendations
As with any development process, it is super important to be able to test and evaluate configurations ahead of deployment into production. To assist with this, it is recommended to create a development LZA deployment using a dedicated set of accounts and use these test accounts to evaluate the impact of applying SCPs, roles, and policies. Most configuration can be fully evaluated; however, network configuration is difficult to test due to physical network dependencies and the potential for overlapping CIDR ranges.
For network configuration, begin by configuring any central Internet egress VPC components first to evaluate configuration logic and scale, before configuring development environments and expanding to any UAT, staging, and production environments. When implementing small-scale or decentralised Internet egress patterns, start with development environments and expand from there to grow confidence in processes, configurations, and the release pipeline.
Spend some time evaluating specific requirements and the natural deployment dependencies for your deployment. Evaluating what is possible through the “all-enabled” config file examples goes a long way towards understanding how tasks are configured. You will need to build iteratively, and it is recommended to keep configuration changes small with a view to continuous development.
Conclusion
As a cloud consultancy, Cevo works with many different industry verticals and organisations across the full spectrum of their cloud adoption journey. Organisations embarking on the “Cloud 2.0” journey and using older platform accelerators will benefit from LZA in simplifying the operating model while maintaining compliance at scale.
When you consider cloud security posture through the lens of the AWS Well-Architected Framework, it is easy to see how LZA positions itself well against all six pillars, with a deep focus on the Security pillar while also touching on the remaining five: Operational Excellence, Reliability, Performance Efficiency, Cost Optimisation and Sustainability.
If you or your organisation are considering an uplift to an existing platform or are new to AWS, consider reaching out to Cevo or contacting me for a confidential discussion on how AWS LZA and the experience of Cevo’s Launch platform capability practices could benefit you.