CASE STUDY
DEVOPS

Communication & Collaboration: improving speed of deployment, quality and team efficiency

Industry

Property valuation services


Customer Overview

Our customer is an industry leader in independent property valuation & advisory services covering residential, commercial and agribusiness.


Business Challenge

Our customer has two major software platforms for performing valuations: one for handling one-off valuations for banks, the other for mass valuations for council and government.

The platforms have evolved over time into large monoliths, resulting in a number of technical challenges for the teams working on them.

The organisation had observed that their speed of delivering new features and the resulting quality had been slowly dropping over time. They recognised the need to bring in external expertise to assist in identifying a solution to resolve these issues.


Solution


Analysis

The first step in understanding what had led to the perceived lack of delivery was to look at the way work flowed through the team end to end. This included planning, requirements gathering, software development, testing and releasing. The aim was to see what could be contributing to decreased delivery and quality.

In doing so, some major pain points were identified:

1. Definition of work

It became clear there were inconsistent definitions of work between parts of the team:

  • The project managers, BAs and subject matter experts were using Smartsheets and a shared drive for storing documents and communicating with clients.
  • The development team was using TFS for storing the tasks they were working on.
  • Often there was no clear mapping between the two systems, and most members of the team didn’t have access across all the systems where information was stored.

In addition, there was no defined workflow for tasks to move from planning into development, testing and deployment.


2. Communication silos

The inconsistent definition of work was a symptom of another issue: communication silos in the team.

A map of the typical touch points between people in the team looked like this:

[Figure: communication network]

The lead developer was responsible for reading requirements and then breaking them down into tasks for the rest of the development team, as well as communicating progress and prioritising work. Because work was spread across several places (Smartsheets, documents and then TFS), there was often misunderstanding of exactly what was being worked on.

The developers who were working on features were several communication points away from the source of the requirements (the client). In many instances they didn’t have context as to why they were being asked to build a particular feature, or what success for that feature looked like.

In addition, the project manager and BA lacked visibility of the number of help desk items coming into the team, leading to frustration when progress on project items wasn’t as fast as expected.


3. Branching and deployment strategy

The development workflow relied on all testing occurring in the Test environment.

The typical branching workflow:

[Figure: branching]


Environment usage:

[Figure: environment usage]

The approach had the hallmarks of a short-lived branching strategy, which is often seen as the holy grail in continuous delivery. However, many of the technical practices required to make this successful, such as automated testing, were not present.

This led to the test environment having a range of features deployed to it, some almost finished and some in very early stages of development. To perform a production release, the code from the test branch was merged to master and then deployed. This meant that all features currently under development would be deployed, ready or not.

Strategies such as feature toggling can reduce the impact of this kind of release strategy, but this wasn’t being practised. This led to identifying the next issue.
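To illustrate what feature toggling means in practice, here is a minimal sketch in Python. It is not taken from the customer's platforms; the toggle names and the search function are hypothetical, and a real system would normally read toggles from configuration rather than hard-coding them.

```python
# Hypothetical toggle names, hard-coded here only to keep the sketch short.
FEATURE_TOGGLES = {
    "bulk_valuation_export": False,  # merged but unfinished: hidden from users
    "new_address_search": True,      # finished and tested: visible to users
}


def is_enabled(feature: str, toggles: dict[str, bool] = FEATURE_TOGGLES) -> bool:
    """Return True only for features explicitly switched on.

    Unknown feature names are treated as off, so a typo cannot
    accidentally expose unfinished work.
    """
    return toggles.get(feature, False)


def search_addresses(query: str) -> str:
    """Route to the new code path only when its toggle is on."""
    if is_enabled("new_address_search"):
        return f"new search: {query}"
    return f"legacy search: {query}"
```

With a guard like this, unfinished code can ship to production while staying switched off, so a release does not have to wait for every feature on the branch to be complete.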


4. Release cadence

Due to the way features worked their way through the development and testing pipeline, production deployments were seen as relatively high risk. Even when a small feature was ready for production deployment, it often needed to be held up until the rest of the test branch was in a sufficiently good state.

As a result, the time between releases was starting to stretch out to four months or more.

How frequently an organisation pushes out production releases can be used as a general measure of its health (as has been researched and documented in the book Accelerate).



Implementation


1. Defining work and process

Based on the above analysis phase, there were two decisions to be made:

  1. Where to record work for the team?
  2. What process will the work follow?

Jira and Confluence were the organisation’s preferred systems of record, but a lack of experience in their use had prevented them from being implemented. Cevo have used these products extensively across different clients, so assisting with the set-up and training was one of our first tasks.

A trap we often find with Jira is trying to extensively customise projects, issue types and workflows. Jira next-gen projects have stripped back many of these options to make for a more streamlined experience that supports a team’s way of working.

For process, a simple agile workflow was introduced. Agile can differ greatly from the way of working some organisations are used to, so an organisation’s capacity for change must be taken into account when moving to agile. For this reason, the aspects we typically introduce first are:

  1. Sprint planning
  2. User stories
  3. Simple workflow
  4. Daily stand ups.

2. Promoting team collaboration

To remove the communication silos, we worked on promoting collaboration and self organisation amongst the team.

This was done by first trying to foster a safe environment for people to discuss ideas, solutions, problems and thoughts. Often a lack of communication from team members comes from fear of failure. Having a ‘manager/lead’ removes responsibility from a team member and places it back onto the leader.

Agile ceremonies are designed with team collaboration and participation in mind. However, when a team exhibits hierarchical behaviour, as seen on the communication network diagram, breaking those habits must come first.

For example, the whole team (not just developers) was invited to stand ups. Additionally, stand ups were changed from each person giving an update about what they worked on, to talking about each of the stories that were in progress or review. This switches the dynamic from focusing on each person, to focusing on what the team needs to do to get the work done. If there was something unclear or unknown about a story, the typical reaction would be to ask the lead dev to find out the answer from the BA. Instead, we encouraged the developer to ask the BA directly.

The goal was to build a more robust communication network amongst the team, while keeping a consistent interface with external parties such as clients and help desk.

[Figure: communication network target]


3. Feature environments

A total of six feature environments were created. The environments were not tied to specific code branches. Work was done on the release pipeline and application to parameterise any environment-specific configuration. This meant that any branch could be deployed to any environment quickly and painlessly.
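As a rough illustration of what parameterising environment-specific configuration can look like, the sketch below reads every environment-dependent value from variables supplied at deploy time. The case study does not describe the actual implementation; the variable names (`APP_ENV_NAME`, `APP_BASE_URL`, `APP_DATABASE_URL`) and the `EnvironmentConfig` type are assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvironmentConfig:
    """Everything that differs between environments, and nothing else."""
    name: str
    base_url: str
    database_url: str


def load_config(env: Optional[Mapping[str, str]] = None) -> EnvironmentConfig:
    """Build the config from environment variables (hypothetical names).

    The release pipeline is assumed to set these per target environment,
    so the same build of any branch can run anywhere. A missing variable
    raises KeyError, which fails the deployment early instead of letting
    the application start with a wrong default.
    """
    source = os.environ if env is None else env
    return EnvironmentConfig(
        name=source["APP_ENV_NAME"],
        base_url=source["APP_BASE_URL"],
        database_url=source["APP_DATABASE_URL"],
    )
```

Because nothing in the application refers to a specific environment by name, deploying a branch to a different environment only requires the pipeline to supply a different set of values.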

Environment usage was coordinated through Confluence. Each environment had the following information recorded against it:

  • Environment name
  • URL for front end access
  • Jira issue numbers of the features that were being developed and tested
  • Who was using the environment.
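The register itself lived on a Confluence page, not in code, but the record it held can be modelled as a small data structure. The sketch below is only an illustration of the four fields listed above; the class name, the `claim_environment` helper, and the rule that an environment is free when nobody is recorded against it are assumptions for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FeatureEnvironment:
    """One row of the environment register."""
    name: str                                             # environment name
    url: str                                              # front end URL
    jira_issues: list[str] = field(default_factory=list)  # features under test
    used_by: Optional[str] = None                         # current user, if any

    @property
    def is_free(self) -> bool:
        # Assumed rule: free when no user and no issues are recorded.
        return self.used_by is None and not self.jira_issues


def claim_environment(
    environments: list[FeatureEnvironment], user: str, issues: list[str]
) -> Optional[FeatureEnvironment]:
    """Record `user` and `issues` against the first free environment.

    Returns the claimed environment, or None if all are in use.
    """
    for env in environments:
        if env.is_free:
            env.used_by = user
            env.jira_issues = list(issues)
            return env
    return None
```

Keeping the Jira issue numbers on the record means anyone looking at an environment can tell which features are deployed there and who to ask about them.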

The features were kept small: typically a feature branch would exist for one or two sprints.

This allowed internal testing to occur on features before merging them into the critical release path (dev/test/master branches). The test environment now reflected much more closely what a production release would look like at that point in time. If there were defects found in the testing environment, the team knew they needed to be addressed quickly.

The new branching workflow looked very similar, but there was now a quality check before a branch was allowed to merge.

[Figure: branching]


Testing now happened much earlier in the progression of a feature through environments.

[Figure: environment usage]


4. Aiming for regular releases

To pull together the work done in other uplift areas, releases were aligned with the sprint length of two weeks. The definition of ‘done’ for stories was updated to include deploying to production.

A new column was added to the Jira wall labelled ready to deploy, highlighting the amount of work the team had finished developing and testing that wasn’t yet providing value to the customer.

Releasing more often also means the team practises it more often. They become more comfortable with the process and better at it. The amount of change in each release is smaller, so tracking down problems becomes easier.

As a result, the number of stories per release fell from 28 in the first release tracked in Jira to 16 in the most recent one.



Benefits

There has been a noted improvement in communication across the team. At three consecutive retros there were items in the ‘what’s been good’ column mentioning team communication and transparency of work being done. The improved communication also led to a lower workload on the lead developer, as they are no longer the conduit between the developers and the rest of the project team.

The clear view of work in progress provided by Jira means that when new work comes into the team, assessing its priority against the existing projects and identifying which developer is best placed to work on it is much easier. This led to less context switching and to informed calls on when current work in progress needs to stop.

Testing new features is now also easier, because they are isolated in different environments. Coupled with the improved team communication, this allows for a much faster feedback cycle between the developers, testers and analysts.

The test environment can be used for UAT by customers on a near permanent basis, as the team have already tested features before merging and deploying them to the test environment. This gives customers a clear view of what is coming in the next release, and builds their confidence in the release process.