Real-World Experiences Adopting the Accelerate Metrics
Dec 17, 2021
In this podcast Shane Hastie, Lead Editor for Culture & Methods, spoke to Nikolaus Huber of Reservix and Vladyslav Ukis of Siemens Healthineers about the application of the Accelerate metrics in their teams.
Shane Hastie: Good day folks. This is Shane Hastie for the InfoQ Engineering Culture Podcast. I'm sitting down today across the miles with Nikolaus Huber and Vladyslav Ukis. Niko and Vlad have both contributed to Q&As and articles on InfoQ about the application of the Accelerate metrics. So we wanted to get together today and explore that further. Nikolaus, Vladyslav, thank you very much for taking the time to join us today.
Nikolaus Huber: Thank you as well for inviting us.
Vladyslav Ukis: Thank you very much. Looking forward to talking about Accelerate today.
Shane Hastie: Vlad, maybe we can start with you. Your article was about improving the speed and stability of software delivery at Siemens Healthcare. How did the Accelerate metrics help with that?
Vladyslav Ukis: So Siemens Healthcare, which actually has a new name now, Siemens Healthineers, has a platform for cloud-based medical applications. That platform is called Teamplay and there are many applications on top of the platform. Inside the platform, we've been running a big transformation project in order to improve the speed and stability of software delivery of the platform itself.
In order to drive that project, we needed some metrics to guide us. To see whether we were improving, and specifically whether stability and speed were improving, based on the changes we had been making in the organization at different levels: the organizational level, the technical level, and the cultural level. Accelerate came along and suggested a couple of metrics, and we used them and found them very useful in measuring the effect of the transformation. Both at the project level, or program level if you wish, and also at the level of individual teams and individual deployment pipelines.
Shane Hastie: Niko, you spoke about applying these metrics in your own SaaS product.
Nikolaus Huber: That's right. Yes. I'm working for one of the largest ticketing providers in Germany, and we offer to our customers, the event organizers, a software-as-a-service solution for selling tickets to their customers. I got in contact with the Accelerate metrics before my job at Reservix. I had read the book before that. For us, it was a bit of a different story than for Vlad or for Siemens.
Nikolaus Huber: I remember that we had the discussion that we needed to improve our software delivery processes. But not directly improve in terms of more speed, but more quality. So the engineers said, we need to invest in quality. For example, we discussed using Gitflow to improve the quality. But I remember that the major advantage of our process was the high speed, the velocity with which we could release software to production. So what I did, I remembered the metrics from the book and I thought, "Hey, let's assess the actual metrics and try to convince the colleagues that we are on a good path, on a good journey." I wanted to show, with these metrics, that quality does not come at the expense of delivery speed, rather the other way around: if you have a good software delivery process, this improves your quality. This is what the book tells you. This is what I wanted to show the colleagues. So this was the start for me.
Shane Hastie: For both of you, which of the metrics did you pick and did you implement them all at once or was this a slower phased implementation?
Vladyslav Ukis: We went all in and implemented all the suggested metrics. So besides the Accelerate book, there is also another book that goes a little more in depth in terms of the implementation of the metrics. That's called Measuring Continuous Delivery by Steve Smith. So we read that one as well. Then basically we implemented the speed indicator and the stability indicator from that book.
But the interesting thing was that we were assessing the transformation, so we were accelerating the releases in the organization. Therefore, an individual pipeline for us can only provide measurements for team-level improvements. For the organizational improvements, we need the same set of metrics, but in aggregated form. What we ended up doing was implementing the metrics at basically two levels. The team level, where a team can assess its individual pipeline through all the deployment environments that it's got. And the organization level, where there are, for the time being, and especially back then, bigger releases, and where you can assess, in terms of the big releases that you are doing, whether you are still improving in terms of stability and speed. So basically we implemented all the metrics, but twofold: team level, pipeline level, and organizational level.
Nikolaus Huber: Yeah, for us, it was not directly that we implemented the metrics. It was more like observing how the software delivery process works, how it performs. I started to get a grip by looking at the data I had at hand. So for example, the Git history or the deployment job logs in Kibana. So at the beginning, it was quite easy starting with the deployment job log and I thought, "Ah, that looks good. It's a lot of fun."
But once you try to dig deeper and to really prove that your metrics are correct, and you have only the data from the Git history, it's a bit harder. I implemented my own scripts to derive the metrics from the Git history and from the deployment job log. And still we have some gaps. For example, for the mean time to repair, we write incident logs, and the metrics are derived from these incident logs that have been written manually. There might be some errors in them. It still gives us a good feeling, a good impression of where we are approximately. But it could be improved by measuring the metrics directly from the tools we use. So there's work left to do, I'd say.
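To make this concrete, here is a minimal Python sketch of the kind of script Niko describes: deriving a speed signal from the Git history and a mean time to repair from a manually maintained incident log. The function names, the incident-log shape, and the choice of the mainline commit interval as the speed signal are illustrative assumptions, not the actual Reservix scripts.

```python
# Minimal sketch (not the actual scripts): derive a speed signal from the Git
# history and a mean time to repair from a manually written incident log.
import subprocess
from datetime import datetime, timedelta

def mainline_commit_times(repo_path="."):
    """Commit timestamps on the mainline, newest first."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--first-parent", "--format=%cI"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    return [datetime.fromisoformat(ts) for ts in out]

def mean_commit_interval(times):
    """Average time between consecutive mainline commits (a speed signal)."""
    gaps = [newer - older for newer, older in zip(times, times[1:])]
    return sum(gaps, timedelta()) / len(gaps) if gaps else None

def mean_time_to_repair(incidents):
    """incidents: list of (detected_at, resolved_at) datetime pairs from the incident log."""
    durations = [resolved - detected for detected, resolved in incidents]
    return sum(durations, timedelta()) / len(durations) if durations else None
```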
Shane Hastie: For both of your organizations, how do we make gathering these metrics and presenting them safe and easy for the teams? I think there are two points there. One is safe and one is easy.
Vladyslav Ukis: Right. So I'd say at first glance it seems that the metrics are kind of easy to understand. Then once you implement the metrics and start showing them, especially to the engineers, at the team-level, pipeline-level granularity I mentioned, there are lots of questions popping up. What is this, how is that calculated? Why is that important? Oh, that one is actually showing something very actionable for me right now. That one, I'm not sure, et cetera.
If you actually want to dig deeper and really present it to the people whose work is assessed using those indicators, then it turns out that it's not that straightforward. It's not that easy. On the other hand, when we aggregated the metrics and presented them at the org level, there were fewer questions, probably because it's an aggregate and therefore by definition you see fewer details.
I would say, generally speaking, it would be great if those metrics were just implemented in the common application lifecycle management (ALM) tools and we actually wouldn't have to implement that stuff ourselves, because then, first of all, the understandability would be there, because it's just part of the standard package that everybody uses. Also, you wouldn't have that potential for mistakes and errors like Niko mentioned. So overall, it's something to strive for, I think, for us as an industry. Basically making those metrics just available out of the box, so that there's less room for interpretation and there is less of a learning curve that the engineers would need to take in order to understand, trust, and accept these metrics.
Nikolaus Huber: Our team, of course, the colleagues, they were proud that we have relatively good metrics compared to others. So this was good. This was quite a success. On the other hand, and here I would also like to hear Vlad's experience, the interest in these metrics was not as deep as I would expect. Maybe because some colleagues are not interested in improving the software delivery performance. They are trying to improve the software itself or something else. But I think on the management level, if you try to get good results, good output, new valuable features for the customers, I think this is important. But on the engineering level, I would expect more interest or more excitement for these metrics. But maybe it's just me who was so into this whole process and these metrics, et cetera.
Vladyslav Ukis: I would second your experience, Niko. So there are engineers who are interested in this, and there are engineers who are not that interested in this. I think that's just normal life, that some people are interested in one thing and other people in others. But generally speaking, what I found was that especially the failure rate indicators, like build failure rate and deployment failure rate, were very well received by the engineers, because they kind of work in a certain working mode, team by team. That's kind of how you work in a team. That's how you submit pull requests. That's how you review them usually. That's how you deploy to your particular environment, and you just do it. That's kind of your workflow.
Then suddenly a set of metrics comes and shows you an assessment of what you're doing, of something that you might have been doing for years. And you're like, actually, there's this environment and there is so much failure in terms of deployment there. The deployment failure rate there is so high. Why is that? And then they are kind of trying to understand why that might be. And it's like, "Oh, okay. Yeah, because there we've got something manual, and it's actually not difficult to fix. Therefore, we kind of haven't fully automated that." And I'm looking at the metric and seeing, "Yeah, the recovery time is very short there, but actually, if you automate that fully, then you don't even need to recover," right? So you don't even need to fail and to recover fast.
Basically it definitely opened up a new conversation and led to some improvements for sure. On the other hand, there are then some metrics that are less interesting for the engineers. Especially those intervals, for instance, the mainline commit interval, the deployment interval, and the build interval. The engineers reacted to these with, "Yeah. That's kind of interesting also, but it's not directly actionable." I think if a team really embraces the delivery improvements, in terms of stability and speed, then all the metrics would start making sense to them.
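As a rough illustration of the indicators discussed here, the following sketch computes a deployment failure rate and a deployment interval from a list of deployment records. The record shape and function names are assumptions made for the example; this is not the tool Vlad describes.

```python
# Illustrative calculation of two of the indicators mentioned above, assuming a
# simple list of deployment records; the record shape is made up for the example.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Deployment:
    environment: str
    finished_at: datetime
    succeeded: bool

def deployment_failure_rate(deployments, environment):
    """Share of deployments to one environment that failed (a stability indicator)."""
    runs = [d for d in deployments if d.environment == environment]
    if not runs:
        return None
    return sum(1 for d in runs if not d.succeeded) / len(runs)

def deployment_interval(deployments, environment):
    """Average time between consecutive deployments to one environment (a speed indicator)."""
    times = sorted(d.finished_at for d in deployments if d.environment == environment)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return sum(gaps, timedelta()) / len(gaps) if gaps else None
```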
Vladyslav Ukis: But initially, once you show it to them, it's like, "Oh, that much build failure rate or that much deployment failure rate, why is that? We were kind of not expecting that." That was the reaction that I saw. Another thing that was very useful to me: sometimes I'm pulled in to assess a team's delivery process. In the past it was like, okay, so let's look at your practices. Are you doing behavior-driven development? Is your deployment automated? What's your way of recovering from failures, and things like that.
Now I'm coming armed with a tool that can just assess their pipelines. That's definitely a head start compared to before, because now I'm just coming with those metrics. We call them indicators, that's important. We don't treat them as KPIs; they're called indicators, and the tool is called continuous delivery indicators. So then we immediately go, okay, which pipeline do you want me to assess? We type that in, and the metrics are there. The speed metrics, the stability metrics, and this is a huge conversation starter. Then from there, you just navigate wherever they want to go and explore. And that's really great.
Shane Hastie: Delving into that one. It's got to be safe to have those conversations, and you touched on the fact that these are not KPIs. How do we keep that safety and make the conversation about improvement of the process rather than assessment of the people?
Vladyslav Ukis: I think this is key, absolutely. It needs to be clear to everyone in the organization that this is absolutely not used for any people evaluations or any performance reviews. So basically it doesn't affect anyone's career. Also, I think that's fair, because the smallest thing that you can assess there is a deployment pipeline, and a deployment pipeline is usually owned by a team.
So therefore, you need to start with the granularity of a team. It's not at the level of each individual. So it needs to be clear that no people manager uses this for any kind of people-related stuff whatsoever. Once you've got that understanding, then that's a good starting point for people not being afraid of these indicators existing and really having open conversations about them. I think a good sign is if people are open to talk about the indicators that they're seeing. Another good sign is if those indicators are just public information inside the organization. So basically, you can look at anyone's pipeline and see how they're doing, and people are open and can talk about this without hiding or trying to sweep that conversation under the rug.
Shane Hastie: Niko, what's your experience been there?
Nikolaus Huber: In our context, we have four teams working on the same code base. So in my experience, when we assess these metrics, it's kind of an aggregation for the whole team. Of course, you could measure, for example, the lead time for a certain person. But if you don't do that, and if you do it at the product level, I'd say it's easy to talk about it. So for example, if you look at the mean time to repair, you can simply talk about how we can reduce the mean time to repair, and you don't have to point your finger at a person or someone who caused the actual failure. It helps to direct the conversation to the actual problem and not the people who caused problems. So it gives you an objective level to talk about things you can actually improve.
Vladyslav Ukis: Another thing that comes to my mind: the tool that we built sits fully on top of the ALM solution that we are using, which is Azure DevOps. So actually, all the data is freely available anyway in Azure DevOps. It's just that it's a different representation of the data, and that representation sparks new conversations. But in terms of data points, there's absolutely nothing new. So what's available through the ALM tooling is also available through the new tool, which just provides a new representation. There's no buffering in between; it pulls the data directly from the ALM.
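To illustrate what pulling data directly from the ALM can look like, here is a hedged sketch against the Azure DevOps Builds REST API, turning the builds of one pipeline into a build failure rate. The organization and project names, the use of a personal access token, and treating every non-succeeded result as a failure are assumptions for the example; this is not the Siemens Healthineers tool.

```python
# Hedged sketch of reading build results straight from Azure DevOps (the ALM),
# with no intermediate store, and turning them into a build failure rate.
# Organization, project, PAT handling, and counting every non-succeeded result
# as a failure are assumptions for this example.
import os
import requests

ORG = "my-org"          # assumed organization name
PROJECT = "my-project"  # assumed project name
PAT = os.environ["AZURE_DEVOPS_PAT"]  # assumed personal access token

def build_failure_rate(pipeline_definition_id):
    """Fraction of completed builds of one pipeline that did not succeed."""
    url = f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/build/builds"
    resp = requests.get(
        url,
        params={
            "definitions": pipeline_definition_id,
            "statusFilter": "completed",
            "api-version": "6.0",
        },
        auth=("", PAT),  # PAT passed via basic auth
    )
    resp.raise_for_status()
    builds = resp.json()["value"]
    if not builds:
        return None
    failed = sum(1 for b in builds if b.get("result") != "succeeded")
    return failed / len(builds)
```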
Shane Hastie: What are the biggest benefits that you would say your organizations have achieved from bringing in these metrics?
Nikolaus Huber: In our company, for our teams, I'd say the greatest benefit was trust and confidence that we are doing a good job, that the process itself is working. Of course, we have problems, or let's say room for improvement, and the journey, of course, is not over and we can still be better. We can deliver software faster and with higher quality. But overall I would say it gave us the confidence that we are on the right path, on the right track.
Vladyslav Ukis: Right. So it's kind of similar for us. As I mentioned at the beginning, we were assessing our transformation. So this enabled us to actually assess the transformation in terms of speed and stability. We can actually see over the years where the stability used to be and where it is now, what the trend was, and the same applies for speed. So that's number one.
Number two, the team assessments that I mentioned before are now at a more professional level, because we start from the real data that assesses the team's pipelines. So that's really great. We basically move away from "what are your practices?" to "what are your outcomes in terms of stability and speed of your software delivery?" Then if you see bottlenecks in the outcomes, you can actually also automatically detect the bottlenecks, which is great. So then you go to the practices and start talking about, okay, if you change that practice, then it's going to move that metric, and then it's going to have that outcome for the user, which is either stability or speed improving. So that's really important.
So basically, stability and speed are important for the users, for the customers. So you are then working on the real outcomes that move the dial for the customers, and not on some technical improvement because you don't like big requirements and therefore you break them down using BDD. The customer doesn't care about your BDD. Another important thing, I'd say, is that because it's not a maturity model with defined levels, level one, level two, level three, it supports continuous improvement. So for any pipeline that we look at, there are bottlenecks. There will always be bottlenecks in terms of stability and speed. It's just that those bottlenecks can be either big or small, depending on the maturity of the team. Therefore, regardless of where the team is, we can open up the tool two, three months later and see whether they improved compared to where they were. So that's really great, I think.
Shane Hastie: What are some of the gotchas, the mistakes that people should be looking carefully to avoid?
Nikolaus Huber: For me, the greatest gotcha, and that's also in the book, is that you don't have to trade speed for quality. Most people think that if you want to improve quality, you need to reduce speed to get the quality. But what the authors of the book say is, "No, you need to go fast. If you can deliver software fast, you can also have great quality or good quality."
Vladyslav Ukis: Yeah. So definitely that. That's also in the title of the article that we published: that you improve stability and speed simultaneously. I would say another thing is in the details of the implementation. The implementation of your tool needs to be trustworthy, in the sense that once you get the engineers to look at the tool, they will immediately want to go into their raw data. So you need to enable them to jump from the indicators presentation to the actual raw data in the ALM, because otherwise they'll have lots of questions that you'll not be able to answer on the spot, because the tool does lots of aggregations and so on in order to produce its presentations of the indicators. So that really needs to be watertight, and you need to be able to answer the questions about how the indicators are generated and why that is the beneficial way.
In the details, you'll need to come up with some conventions, some things that the tool just does a certain way, and you need to be able to justify why that choice was good. Also, be open to feedback in terms of, okay, why don't we change it in a certain way, because then it reflects the reality in our organization better.
So generally speaking, as I mentioned earlier, I'd like this to just be implemented in the ALM tool, because that would remove or reduce the amount of confusion, uncertainty, distrust, et cetera, in the tooling that you're building. You also need to be aware that there are lots of other tools, and actually lots of companies, that claim to measure software delivery performance, and you are now kind of running against them. So you'll also be confronted with questions like, "Why are we not using company X, Y, Z, and instead implementing something on our own, which might be buggy?" Then you need to justify, "Well, actually we are implementing Accelerate." Then why is Accelerate better than the other thing that has existed already for five years and that's maybe more bulletproof, and so on.
Nikolaus Huber: That's a valid point. You need to understand how the metrics work. You need to be able to justify your results. I thought it would be easier to implement and to assess the metrics. But once you go down the rabbit hole, it takes much more time than I thought at the beginning. But still, I learned a lot. So yes, that's also a good gotcha for me.
Shane Hastie: What's missing from the accelerate metrics, if anything?
Vladyslav Ukis: The entire SRE stuff is missing. So the whole reliability engineering is missing. Actually, in the latest DORA report, that's slowly making its way into those metrics as well. So they already started about a year ago or so with the addition of, I think it was, availability. Now in the latest one, they extended that to, I think, maybe reliability in general, but at least more than just availability. So that entire operational aspect, that entire SRE, error-budget-based decision making, that's missing. I think that's essential, and I think they also recognize that and have started adding it now.
Shane Hastie: Are you measuring that?
Vladyslav Ukis: Yeah, definitely. So we've got an entire SRE infrastructure to measure that.
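For readers unfamiliar with the error-budget-based decision making Vlad refers to, here is a small worked example with assumed numbers; the SLO, the window, and the observed downtime are purely illustrative.

```python
# Worked example of an error budget, using assumed numbers: with a 99.9%
# availability SLO over a 30-day window, the error budget is the 0.1% of the
# window you are allowed to be unavailable.
SLO = 0.999
WINDOW_MINUTES = 30 * 24 * 60                     # 30-day window
budget_minutes = (1 - SLO) * WINDOW_MINUTES       # about 43.2 minutes of allowed downtime

observed_downtime_minutes = 12.0                  # assumed monitoring result
budget_remaining = 1 - observed_downtime_minutes / budget_minutes
print(f"Error budget remaining: {budget_remaining:.0%}")   # roughly 72%
```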
Shane Hastie: Niko, any thoughts?
Nikolaus Huber: Of course we measure availability. But for me, this was covered by the mean time to repair, or basically the stability metrics. But it's a valid point. So of course, there's room for improvement and to assess more details. But I think we'll start first with improving or automating the assessment of the mean time to repair, or the stability metrics in general.
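Niko's point that availability is largely captured by the stability metrics can be made concrete with the standard steady-state relation availability = MTBF / (MTBF + MTTR); the numbers in this sketch are illustrative assumptions, not measurements from either organization.

```python
# Illustrative link between the stability metrics and availability, using the
# standard steady-state relation availability = MTBF / (MTBF + MTTR).
mtbf_hours = 720.0   # assumed: one failure every 30 days on average
mttr_hours = 2.0     # assumed: two hours mean time to repair

availability = mtbf_hours / (mtbf_hours + mttr_hours)
print(f"Availability: {availability:.3%}")   # about 99.723%
```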
Shane Hastie: Coming towards the end of our conversation. First thing, is there any advice you would like to give to listeners who are thinking about following in these footsteps and bringing in these metrics? We talked about the gotchas, but what should they do?
Vladyslav Ukis: For me, it would be to start measuring software delivery performance using Accelerate as soon as you can, because that was also our learning from the transformation. We actually started before Accelerate came out. Therefore, we were transforming for a couple of years without a good set of more or less objective indicators. Having those definitely helps, because when you are running a delivery transformation, then for a long time you are investing lots of effort, and consequently money, of course, without visible improvements on the outside. So basically you are rearchitecting or implementing deployment automation, et cetera, et cetera. But to the outside world, to the people not involved in the transformation, things are kind of the same. Until you hit a point where you've accumulated so much change that it becomes visible to the outside world that something is improving. Having those metrics can actually bring forward the point where people outside of the transformation activities can start seeing improvements.
Nikolaus Huber: I would also say start measuring the metrics, or maybe just estimate them by looking at your processes, how you work, how you deliver software. Don't make the mistake I made at the beginning and invest too much time in automating the whole thing. Start with learning and applying it the easy way, and then continue with continuous improvements. From that, I think you learn a lot. If you read the book, you get a very, very good sense of what great software delivery processes can look like. So that's my personal experience with the book.
Shane Hastie: Thank you very much. If people want to continue the conversation, where do they find you?
Vladyslav Ukis: LinkedIn is a good place to find me.
Nikolaus Huber: Yep. LinkedIn or Twitter.
Shane Hastie: We will include both of your LinkedIn and Twitter links in the show notes. Vlad and Niko, thank you very much for taking the time to talk to us today.
Nikolaus Huber: Thank you.
Vladyslav Ukis: Thank you.