DevOps Metrics that Matter: Measuring Speed, Security, and Success
Balancing speed and security in DevOps depends on tracking the right metrics. Here we look at which metrics you should be tracking to inform continuous improvement.
The importance of DevOps
From banking to hospitality, and from insurance to healthcare, DevOps is gaining in popularity and importance as a method of achieving greater efficiencies, faster deliveries, increased collaboration and enhanced agility. It’s no wonder, then, that around 80% of organisations in a variety of sectors are using DevOps to deploy software and prevent expensive downtime.
Downtime costs vary from industry to industry but a recent report estimated that it costs:
- IT industry – up to £345,000 per hour
- Healthcare – up to £488,823 per hour
- Retail – up to £845,487 per hour
- The energy industry – up to £1,906,189 per hour
- Brokerage services – up to £4,982,239 per hour.
DevOps, therefore, offers vital practices for enabling automation, continual monitoring and rapid response to incidents that can reduce the probability of a downtime event and shorten its duration, saving huge amounts of money.
Metric measurement
DevOps Research and Assessment (DORA) defines a set of standard metrics for evaluating process performance, covering response speed, average code deployment time, iteration frequency and failure insight. By setting goals based on existing performance indicators and measuring progress against them, DORA metrics can improve collaboration and enhance performance while driving velocity. They give organisations estimates of response times, improve planning, identify areas that need improvement and build consensus on where to invest.
DORA measures metrics such as:
Mean time to recovery (MTTR) – probably the most important metric, MTTR, also called time to restore service, measures how long it takes a team to identify and recover from a partial service interruption or a total failure. MTTR is measured from the precise time an incident occurs until it is resolved. The quicker a team can respond the better, but recovery can take anywhere from under an hour to more than a week, depending on what caused the interruption.
DevOps teams are benchmarked against four standards for MTTR:
Elite – less than an hour
High – less than a day
Medium – between a day and a week
Low – over six months.
Teams must be able to identify if and when a failure has taken place and deploy a fix, or roll back the changes that caused it. Continuous monitoring of system health will alert staff to any failure and they will then require processes, tools and permissions to enable them to resolve it.
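As a minimal sketch, MTTR can be calculated from incident records. The incident data below is illustrative, and the banding function simply encodes the benchmark thresholds listed above:

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (detected_at, resolved_at) pairs.
incidents = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 9, 45)),   # 45 minutes
    (datetime(2024, 3, 8, 14, 0), datetime(2024, 3, 8, 17, 0)),  # 3 hours
]

def mean_time_to_recovery(incidents):
    """Average time from incident detection to resolution."""
    total = sum((resolved - detected for detected, resolved in incidents),
                timedelta())
    return total / len(incidents)

def dora_mttr_band(mttr):
    """Map an MTTR value to the benchmark bands listed above."""
    if mttr < timedelta(hours=1):
        return "Elite"
    if mttr < timedelta(days=1):
        return "High"
    if mttr < timedelta(weeks=1):
        return "Medium"
    return "Low"

mttr = mean_time_to_recovery(incidents)  # 1:52:30 -> "High"
```

In practice the detection and resolution timestamps would come from your monitoring and incident-management tooling rather than hand-entered records.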
Deployment frequency – simply the average number of code deployments or changes to an environment per day. It's a good indicator of DevOps efficiency, as it gauges the development team's speed, level of automation and capabilities, and enables continuous delivery.
By delivering smaller, more frequent software deployments that reduce the risk and scope of change in each cycle, teams can collect feedback more often and iterate faster. Frequent deployment also lets teams respond to rapidly changing customer requirements, ship bug fixes sooner, enhance existing features and reduce deployment risk.
DevOps is benchmarked against four standards in deployment frequency:
Elite – multiple daily deployments
High – between one deployment per week and one per month
Medium – between one deployment per month and one every six months
Low – fewer than one deployment every six months.
By making deployments smaller, such as a single change or feature, not only are they easier to deploy more frequently, but errors are minimised and consequently will have a smaller impact.
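A sketch of how deployment frequency might be derived from a deployment log, using illustrative dates and a hypothetical observation window:

```python
from datetime import date

# Hypothetical deployment log: one entry per deployment.
deployments = [
    date(2024, 3, 4), date(2024, 3, 4), date(2024, 3, 6),
    date(2024, 3, 11), date(2024, 3, 14), date(2024, 3, 17),
]

def deployments_per_week(deploys, period_start, period_end):
    """Average deployments per week over an observation window (inclusive)."""
    days = (period_end - period_start).days + 1
    return len(deploys) / (days / 7)

freq = deployments_per_week(deployments, date(2024, 3, 4), date(2024, 3, 17))
# Six deployments over two weeks: an average of 3.0 per week.
```

A real pipeline would pull these timestamps from your CI/CD system's deployment history rather than a hand-written list.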
Lead time for changes – the average time it takes DevOps teams from delivering code to the trunk branch to that code being production-ready and deployed, having passed all required pre-release tests. This metric is calculated from the time of the code commit to the time of the release and can be an indicator of early process issues. Measuring it helps teams identify what slows down software delivery, how complex the code is, the capacity of the team, and its ability to respond to environmental changes.
DevOps is benchmarked against four standards in lead time for changes:
Elite – less than one hour
High – between a day and a week
Medium – between one and six months
Low – over six months.
To reduce lead time for changes, trunk-based development, working in small batches, improved code review and increased automation can all be of assistance.
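Lead time for changes can be sketched from (commit, deploy) timestamp pairs; the data below is illustrative, and a median is used rather than a mean so a single slow change doesn't skew the result:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical (commit_time, deploy_time) pairs for recent changes.
changes = [
    (datetime(2024, 3, 1, 10, 0), datetime(2024, 3, 2, 10, 0)),   # 1 day
    (datetime(2024, 3, 3, 9, 0),  datetime(2024, 3, 6, 9, 0)),    # 3 days
    (datetime(2024, 3, 5, 12, 0), datetime(2024, 3, 10, 12, 0)),  # 5 days
]

def median_lead_time(changes):
    """Median time from code commit to production deployment."""
    return median(deploy - commit for commit, deploy in changes)

lead_time = median_lead_time(changes)  # 3 days
```

In a real setup the commit times would come from version control and the deploy times from your release pipeline.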
Conclusion
By measuring these metrics to gauge continuous improvement, organisations can track speed, security and success, enabling teams to increase quality, accelerate velocity, deliver high-quality code and resolve problems faster. DevOps teams are now gaining similar status to areas such as design, development and architecture, and because of the enhanced levels of trust and collaboration involved, transparency increases, communication improves and delivery gets better.
If you’re working at the cutting edge of DevOps and want to take your career forward, get in touch with 83zero.