Today starts a new series of articles I’ll be writing where I walk through each of the questions that make up the AWS Well Architected Framework. The purpose of this series is to do a deep dive into what each question is looking to validate and some actions that organisations can take to uplift their current state.
For those who might be new to the idea of the AWS Well Architected Framework, you may find some of our previous posts useful (“Are you well architected” and “Are you well operated” are both great articles by Steve Mactaggart).
Throughout each of these articles, I aim to take a look at:
- What is the scope of the question? Who’s responsible for answering it?
- The key calls to action the question aims to define
- The possible answers to the question and what each one is looking for
- What good looks like, and
- Where you can go for more information and guidance in order to improve and implement best practices.
As of February 2021, there are 56 questions in a Well Architected Review (we’ll exclude the lenses for the moment) across five pillars. Today we will start working through the “Operational Excellence” pillar and focus on OPS1.
How do you determine what your priorities are?
“Everyone needs to understand their part in enabling business success. Have shared goals in order to set priorities for resources. This will maximise the benefits of your efforts.” [AWS Well Architected Framework OPS1]
On the face of it, this may sound like a pretty easy question to answer, and one that might even come across as a little silly. “Of course everybody knows what our priorities are” and “our priorities are …” are common responses, but those answers don’t get to the crux of what the question is asking. Taking a deeper look, we see that it’s focused on how priorities are determined, and less on if and how they are communicated. The possible answers to this question make this even clearer:
- Evaluate external customer needs
- Evaluate internal customer needs
- Evaluate governance requirements
- Evaluate compliance requirements
- Evaluate threat landscape
- Evaluate tradeoffs
- Manage benefits and risks
From this we can establish a couple of key themes:
Identify, Test and Validate customer needs and wants
One of the most important things that an organisation needs to do is to evaluate the needs of their internal and external stakeholders (including business, development and operations teams) in order to determine where to focus effort. By working with your key stakeholders, you ensure that you have a thorough understanding of the support that is required to achieve the desired outcomes.
As needs and wants are identified, they need to be tested and validated to ensure any assumptions are valid and correct. The needs of your customers today will be different from those of tomorrow, which is why continuous realignment and the ability to experiment and iterate are critical to the success of an organisation.
When running Well-Architected Reviews with customers, we like to ask organisations:
- Do you regularly investigate your customers’ needs? If so, through what mediums? How frequently? Not identifying your customers’ needs will result in wasted time developing things they don’t need or want.
- Are any of your findings available for review by the wider team? Do team members know where to go to get them? Information that isn’t available to the wider team can’t be acted on, and results in teams not aligning to the defined goals.
- Are your findings used during activity planning sessions such as sprint kickoffs or feature roadmaps? Not regularly referring back to these discoveries will result in feature drift, misalignment and money/time being invested in things not directly related to addressing your customers’ needs. Make sure you continuously refer back to your findings and ensure that all activities are moving you in that direction.
- Are you tracking trends in long term requests and requirements? Tracking needs/wants over time will help identify longer term trends in what your customers are wanting to achieve. Do they want Machine Learning in your product? Or are they simply wanting greater insight from your data?
- Are you closing the loop by validating that actions taken by the organisation are addressing the requirements in the expected way? Just because you think you addressed their needs doesn’t mean you actually did. Loop back to your customers and seek feedback to identify when/if further enhancements are required in order to truly gain maximum return from your investment.
Understand the competitive, governmental and compliance landscapes
Governance and compliance are complicated topics and quite often overlooked by organisations and IT teams, especially those operating in less regulated environments. It’s important to remember that all organisations are subject to some form of compliance requirement. Whether it’s as mundane as PCI-DSS requirements for processing of credit card data, or adhering to the Privacy Act, everybody will have some level of compliance they must adhere to. The same holds true for your competitors and other external threats. Actions and changes by other parties can significantly influence the actions your organisation needs to take, and must be identified and tracked accordingly.
- Does your organisation have a clear understanding of what regulations and guidelines you must adhere to? Ignorance of requirements isn’t a defence and building a product/service with these requirements in mind is cheaper/quicker/easier than having to retroactively go back and change something after the fact.
- Do you regularly review the competitive landscape? If there is one thing we can count on it’s that things change. And not aligning to the latest requirements can cost you sales, customer sentiment or worse.
- Are your regulatory and compliance requirements documented somewhere and available to all the responsible parties? You need to ensure that these requirements are available to and understood by your teams, otherwise you will wind up with products that aren’t aligned and require additional investment to rebase. Empower your teams from the beginning to ensure the best outcome.
- Have you identified which workloads and systems are in-scope for which policies? Without defined boundaries and clarity around what’s in-scope and out of scope, the easy path will always win and some systems will miss key compliance tasks. Ensure every workload and role is clearly tagged with what its compliance requirements are to remove ambiguity.
- Is that information represented in your IT systems, documentation and training? Teams experience churn over time and you can’t rely on muscle memory alone. Requirements, processes and procedures MUST be documented and available to all involved otherwise the effort you invest today will be lost tomorrow… and uplifting is time consuming, expensive and comes at the expense of more valuable work.
- Do you have systems in place to ensure you continuously review changes to policies, regulations and laws and play these back to the necessary teams? Policy and procedural bloat is what makes larger companies less agile over time. Adding deployment steps to support the ECS cluster because it’s got the payroll system on it is fine. But if you outsourced your payroll system six months ago, then those additional steps are just slowing you down now and resulting in unnecessary work. You need to continuously review your processes to make sure they are current and reflect the current IT and business landscape.
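The point above about tagging every workload with its compliance requirements lends itself to a simple automated check. The following is a minimal Python sketch, not a definitive implementation: the tag key `compliance-scope`, the space-separated value format and the workload names are all assumptions made for illustration, and the inventory is hard-coded rather than pulled from a real AWS account.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical tag key; use whatever convention your organisation has agreed on.
COMPLIANCE_TAG = "compliance-scope"


@dataclass
class Workload:
    name: str
    tags: dict[str, str] = field(default_factory=dict)


def missing_compliance_scope(workloads: list[Workload]) -> list[str]:
    """Return the names of workloads with no (or an empty) compliance tag."""
    return [w.name for w in workloads if not w.tags.get(COMPLIANCE_TAG, "").strip()]


def workloads_in_scope(workloads: list[Workload], policy: str) -> list[str]:
    """Return the names of workloads whose compliance tag lists the given policy."""
    wanted = policy.lower()
    return [
        w.name
        for w in workloads
        # Assumed format: policies separated by spaces, e.g. "pci-dss privacy-act".
        if wanted in w.tags.get(COMPLIANCE_TAG, "").lower().split()
    ]


# Illustrative inventory only; these are not real systems.
inventory = [
    Workload("payments-api", {COMPLIANCE_TAG: "pci-dss privacy-act"}),
    Workload("marketing-site", {COMPLIANCE_TAG: "none"}),  # explicitly out of scope
    Workload("payroll-batch", {}),  # never classified
]

print("Not yet classified:", missing_compliance_scope(inventory))
print("In scope for PCI-DSS:", workloads_in_scope(inventory, "pci-dss"))
```

Note that an explicit value such as `none` is treated differently from a missing tag: the first records a decision that the workload is out of scope, while the second means nobody has looked yet.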
Managing tradeoffs, their benefits and risks
The third key theme in this question is identifying the tradeoffs made by the organisation, and managing the associated risks and benefits.
As an example, technical tradeoffs appear all the time when choosing AWS services. Traditional workloads (such as EC2 instances) require little upfront investment but are expensive to run over time. Other technologies (such as serverless and containerisation) require large upfront investment to refactor an application, but are much cheaper to run long term. Do you have the controls in place to decide when to trade ongoing Opex costs for upfront Capex expenses?
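One way to put numbers behind that Opex versus Capex decision is a simple break-even calculation. The sketch below is a rough illustration in Python; the function name and all of the dollar figures are hypothetical, and it deliberately ignores factors such as discounting, team capacity and risk.

```python
import math
from typing import Optional


def break_even_months(
    upfront_cost: float, monthly_before: float, monthly_after: float
) -> Optional[int]:
    """Whole months until cumulative run-cost savings cover the upfront spend.

    Returns None when the change does not reduce the monthly cost at all.
    """
    monthly_saving = monthly_before - monthly_after
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront_cost / monthly_saving)


# Hypothetical figures: a 60,000 refactor that cuts the monthly bill
# from 8,000 to 3,000 pays for itself after 12 months.
print(break_even_months(60_000, 8_000, 3_000))  # 12
```

If the break-even point sits well beyond the expected life of the workload, staying on the more expensive platform may well be the right tradeoff, provided the decision is recorded and revisited.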
Most decisions made by organisations involve tradeoffs in one form or another – but is your organisation considering them, tracking them and re-evaluating them on a regular cadence?
- Are you tracking all your tradeoffs and design decisions in a central register? If you’re not capturing when you make tradeoffs and design decisions, you lose the ability to re-assess them later. This results in “oh, we’ve always just done it that way” and processes that are slow, costly and overly complex.
- Are people empowered to provide input and feedback into the process? Nobody has a complete picture of the world and you need to gain insights from all levels of the organisation. Not enabling your teams to speak up and be heard will result in you setting them up for failure which wastes time, money and destroys employee morale.
- Are you regularly reviewing the risks & benefits associated with these decisions? Taking the example from earlier, prioritising features over cost optimisation might be fine for the short term, but what impact is it having on cash-flow? Incurring technical debt is a fact of life, but what is not paying it down doing to your availability targets and customer satisfaction? Make sure you are reviewing the impacts your choices are having and course correct as required.
- Do you have the tools, expertise and data to understand the tradeoffs being made and their potential impact to operational activities? An uninformed decision is no better than a guess, and success can’t be obtained without metrics to measure it against. Without the right systems in place you won’t have the visibility needed to make informed decisions, which will result in wasted effort, misguided actions and a poor outcome for your customers.
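A central register of tradeoffs doesn’t need to be elaborate to be useful. As a minimal sketch (the field names, the example entries and the 180-day review cadence are all assumptions for illustration), it could be as simple as a list of records plus a check for entries that are overdue for review:

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Decision:
    title: str
    tradeoff: str  # what was given up, and what was gained
    owner: str
    last_reviewed: date


def due_for_review(
    register: list[Decision], today: date, max_age_days: int = 180
) -> list[Decision]:
    """Return decisions that have not been reviewed within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [d for d in register if d.last_reviewed < cutoff]


# Illustrative entries only.
register = [
    Decision(
        title="Keep payroll on EC2",
        tradeoff="Higher run cost in exchange for no refactoring effort",
        owner="platform-team",
        last_reviewed=date(2020, 3, 15),
    ),
    Decision(
        title="Prioritise features over cost optimisation",
        tradeoff="Faster delivery in exchange for a larger monthly bill",
        owner="product-team",
        last_reviewed=date(2021, 1, 10),
    ),
]

for decision in due_for_review(register, today=date(2021, 2, 1)):
    print(f"Review overdue: {decision.title} (owner: {decision.owner})")
```

Passing `today` in explicitly, rather than reading the system clock inside the function, keeps the check easy to test and to run against historical dates.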
Conclusion
As outlined above, this overarching question focuses on the operational priorities of the organisation and extends far beyond just the world of the workloads running in AWS. In order to ensure you are addressing this in a Well Architected manner you should be:
- Working with internal and external stakeholders to clearly determine key areas of focus. Repeat this regularly to ensure continued alignment/re-prioritisation.
- Continually reviewing industry and governmental compliance requirements, with a process in place to review and address any changes. Ensure mechanisms are in place to identify external changes in obligations.
- Evaluating threats to your business (business, competitive, operational, security) on an ongoing basis and maintaining this information in a risk register. The cloud makes it easy to iterate and move quickly, so take advantage of that fact.
- Fostering a culture of regularly re-evaluating tradeoffs that have been made over time. This is as much for business tradeoffs (one feature/product over another) as it is for technical ones (containers vs EC2 vs serverless).
So, what does good look like?
Now that we have a strong understanding of what goes into this question, and the key takeaways that you need to think about… what actions can you take to improve your organisation’s position? AWS has identified 5 key design principles that define operational excellence and two of these principles are particularly relevant to this question:
- Make frequent, small, reversible changes. Given the constant need to innovate and respond to customer, governmental and competitive factors, organisations need to be able to make small incremental changes that can be easily tested, validated and reversed (if necessary) with minimal adverse impact to customers.
- Refine operations procedures frequently. With constant changes to your workloads, you need to be sure your procedures, policies and playbooks are also kept in line and up to date. Regular game-days and scenario testing will highlight gaps and provide opportunities to uplift any shortfalls.
A great reference for the Operational Excellence pillar of the AWS Well Architected Review can be found on the AWS website. Keep an eye out for future articles as I continue to take a look at each of the questions that make up the Well Architected Framework.
In the meantime, if you’d like help aligning to the Well Architected Framework within your organisation don’t hesitate to reach out to the team at Cevo – as Advanced AWS Partners we have the expertise to guide your team through a Well Architected assessment (click here for more on the Cevo Well Architected Review).