Sliding Windows to the Cloud in Containers: IRESS

INDUSTRY

Financial and Insurance Services

TIMEFRAME

August 2019 – October 2020

CUSTOMER OVERVIEW

IRESS (ASX:IRE) is an Australian software company that provides market-leading financial software for financial services and wealth markets. Headquartered in Melbourne, they have offices in Sydney, Johannesburg, London, and Toronto. With an operating revenue above AU$500m in 2019, they are a significant and trusted player in the financial services software market.
Case Study

BUSINESS CHALLENGE

A flagship component of IRESS’ software offerings, IPS, is a Windows-based application created in-house and running on Windows operating systems. The cost and effort of running an on-premises fleet of infrastructure were becoming significant; combined with a lack of automation, this made it very difficult to deliver the features customers demanded in a timely manner and at a reasonable cost.
 
In addition, IRESS had found that, due to the extensive lead times involved in procuring compute capacity on-premises, customers who had requested user acceptance testing environments were loath to relinquish them. This led to IRESS carrying an ever-growing burden of hardware that they couldn’t decommission or repurpose.
 
In 2019, IRESS decided to embark on a migration from on-premises systems to the cloud, specifically to AWS, and were facing a few challenges in the process. Although they are an organisation with deep experience in building, delivering and operating high-end financial systems, they had less consistent experience with cloud environments.
 
IRESS wanted assistance in four main areas:
  • reduce the cost associated with hosting the Windows-based application;
  • build a DevOps mindset and capability in their software delivery teams;
  • reduce the time and effort required to provision and maintain user acceptance testing environments; and
  • increase the pace of cloud migrations.

At the same time, IRESS didn’t want to sacrifice quality; software had to be delivered at least as fast as usual, but in a way that reduced the amount of re-work.

SOLUTION

In August 2019, IRESS began leveraging Cevo’s core competencies in the above-mentioned areas. The two organisations collaborated to develop a two-part approach to assist IRESS with their needs:
 
First, to embed experienced engineers with the team undertaking the migration effort, so that knowledge transfer would happen naturally through the course of the project while developing tooling, processes and practices. Cevo also assisted in the rotation of developers through the migration team, to ensure that knowledge was distributed evenly and broadly.
 
Second, to assist in the design and implementation of reusable migration patterns which IRESS could then apply to additional workloads later on.
 
Cevo’s approach targeted several of our core DevOps transformation themes:
  • Everything As Code
  • Automated Software Delivery
  • Platform as a Service
  • Security
  • Cost Optimisation

The IRESS delivery pipeline for the IPS product, developed and implemented with Cevo’s assistance, comprises a mix of cloud-native technologies from AWS:

  • AWS Elastic Container Service (ECS) clusters, with Windows EC2 worker nodes and Windows containers
  • AWS Step Functions, for orchestrating complex workflows around backup, upgrade, migrate, and rename of deployed IPS application instances
  • AWS Lambda, for implementing work required by the Step Functions and performing additional “glue” roles in the infrastructure
  • AWS Secrets Manager, for credential security and automated rotation
  • AWS Systems Manager, for Parameter Store, Documents (for automation), and Session Manager (for login access)
  • AWS RDS (hosting SQL Server)
  • AWS S3 (for SQL Server native backups, transferring data from on-premises, and hosting large configuration objects)
  • AWS CloudWatch Logs and AWS CloudWatch Metrics (for observability, monitoring, and alerting)
  • AWS CloudWatch Events (for triggering scheduled tasks and responding to API events in the environments)
  • AWS CloudFormation (for deploying Infrastructure as Code)
  • AWS Backup (for taking and keeping secure backups of the RDS Database instances)
  • VPCs with subnets and security groups (for network configuration and security)
  • Human and programmatic access control via IAM

and third-party technologies that were already in place at IRESS:
  • GitHub code repositories
  • Buildkite pipelines for build, test, and deploy
  • Artifactory, for hosting and security scanning deployable artifacts
  • Terraform for infrastructure as code orchestration

IRESS began hosting production workloads on the new pipeline and platform in August 2020.
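
To illustrate how a Step Functions workflow of the kind listed above might be expressed, the following is a minimal sketch of an Amazon States Language definition built in Python. It is not the IRESS implementation: the state names, the Lambda function names, the account ID and the region are all hypothetical placeholders, and the rename flow shown (back up, rename, verify) is an assumed ordering chosen for illustration only.

```python
import json

# Hypothetical Lambda ARN prefix: account ID, region and function names are placeholders.
LAMBDA_PREFIX = "arn:aws:lambda:ap-southeast-2:123456789012:function:"


def build_rename_workflow() -> dict:
    """Return an Amazon States Language definition for an illustrative rename workflow."""
    return {
        "Comment": "Illustrative sketch: back up, rename, then verify an application instance",
        "StartAt": "BackupDatabase",
        "States": {
            "BackupDatabase": {
                "Type": "Task",
                "Resource": LAMBDA_PREFIX + "backup-database",
                "Next": "RenameInstance",
            },
            "RenameInstance": {
                "Type": "Task",
                "Resource": LAMBDA_PREFIX + "rename-instance",
                # Retry transient task failures before giving up.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 30,
                        "MaxAttempts": 2,
                        "BackoffRate": 2.0,
                    }
                ],
                # Any remaining error is routed to a notification step.
                "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
                "Next": "VerifyInstance",
            },
            "VerifyInstance": {
                "Type": "Task",
                "Resource": LAMBDA_PREFIX + "verify-instance",
                "End": True,
            },
            "NotifyFailure": {
                "Type": "Task",
                "Resource": LAMBDA_PREFIX + "notify-failure",
                "Next": "WorkflowFailed",
            },
            "WorkflowFailed": {
                "Type": "Fail",
                "Error": "RenameFailed",
                "Cause": "The rename step did not complete",
            },
        },
    }


def validate_transitions(definition: dict) -> list:
    """Return the names of any transition targets that are not defined states."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    for state in states.values():
        if "Next" in state:
            targets.append(state["Next"])
        for catcher in state.get("Catch", []):
            targets.append(catcher["Next"])
    return [target for target in targets if target not in states]


if __name__ == "__main__":
    workflow = build_rename_workflow()
    assert validate_transitions(workflow) == []
    print(json.dumps(workflow, indent=2))
```

Keeping the definition as data makes it straightforward to review in a pull request and to check locally (as validate_transitions does for dangling transitions) before it is deployed through whichever Infrastructure-as-Code tool is in use.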

BENEFITS

IRESS had modelled the cost of migrating from multi-tenant applications on Windows servers on-premises, and anticipated an increased spend as a result of single-tenanting IPS on Windows EC2.
 
Cevo’s involvement has enabled IRESS to multi-tenant on EC2 through the use of Windows Containers, which has brought forward the avoidance of increased Windows licensing costs by at least 6 months, reducing the projected expenditure per customer instance by an order of magnitude.
 
Prior to the engagement, the IPS development and delivery practices operated on physical servers hosted on-premises. Frequent human interactions were required, and the lead times to deliver a new customer instance were measured in weeks. Through the use of a DevOps workflow, from GitHub through Buildkite to Artifactory, and deployment of infrastructure components via an Infrastructure-as-Code approach through Terraform, provisioning times for new infrastructure have been reduced from in excess of 2 months to less than 1 hour (including human review and approval processes).
 
After three months of Cevo’s engagement in the technical and DevOps uplift effort, IRESS has put half a dozen software engineers through the process. A set of robust, repeatable pipelines driven by code deploys the application and its dependencies, and manages its configuration.
 
The time to provision an instance of the application has fallen from hours of intensive human-focussed work to less than 15 minutes with zero manual interaction beyond approval gates. Recovery times for application component failures have reduced from hours to seconds, thanks to the automated restart capability provided by ECS, and the time to perform standard operations on application instances, like renaming, copying, and upgrading, has been reduced from hours to minutes with minimal downtime.
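
The automated restart behaviour mentioned above follows a desired-state pattern: a service declares how many tasks should be running, and a scheduler replaces any that stop. The sketch below is a toy model of that idea in Python, written only to illustrate the concept. It does not call AWS and does not represent how ECS is implemented internally; the Service class and its task IDs are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Service:
    """Toy model of a service that keeps a desired number of tasks running."""

    desired_count: int
    running: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def reconcile(self) -> list:
        """Start replacement tasks until the running set matches desired_count."""
        started = []
        while len(self.running) < self.desired_count:
            task_id = f"task-{next(self._ids)}"
            self.running.add(task_id)
            started.append(task_id)
        return started

    def fail(self, task_id: str) -> None:
        """Simulate a task stopping unexpectedly."""
        self.running.discard(task_id)


if __name__ == "__main__":
    service = Service(desired_count=3)
    print(service.reconcile())  # initial placement of three tasks
    service.fail("task-2")      # one task stops unexpectedly
    print(service.reconcile())  # a single replacement task is started
```

Because recovery is a matter of the scheduler noticing a gap and starting a new task, no human needs to be involved, which is consistent with recovery times dropping from hours to seconds.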
 
Standard tasks such as patching and upgrading infrastructure have been fully automated – by replacing long-lived “pet” compute environments with disposable ECS worker nodes, and implementing automated scheduled builds of the application Docker images, Cevo ensured that the deployed workload will always be running in secure, patched and compliant environments.
 
Ultimately, IRESS will be able to serve their customers better by providing higher quality code, more rapidly; exploring options in a safer and more repeatable way; and reducing the turnaround time to provision user acceptance environments for key clients.