Podcast: GitPrime’s CEO on measuring software engineering productivity

Feb 20, 2018 | Productivity

Earlier this month, Travis Kimmel, co-founder and CEO at GitPrime, joined Kishore Bhatia from SE-Radio to talk about measuring software engineering productivity.

They discussed:

  • How the software industry is transitioning to data-driven management
  • Why development metrics are important
  • Findings from the 2017 Software Developer Productivity Survey
  • How different team sizes and structures use metrics


Head over to SE-Radio to catch the full episode. And be sure to subscribe to stay up to date on all things software engineering — each episode is either a tutorial on a specific topic, or an interview with a well-known character from the software engineering world.

Here’s the full transcript of the conversation:


Transcript

Kishore:
Hello. This is Kishore Bhatia from Software Engineering Radio. Today’s episode is about engineering impact and measuring productivity with Travis Kimmel. Travis is the CEO of GitPrime, a SaaS company offering engineering productivity measurement as a service, with a mission to bring visibility into the software development process and bridge the communication gap between engineering and stakeholders. Travis is also involved with a lot of community work in the technology space. Welcome to Software Engineering Radio, Travis.

Travis:
Thanks. It’s great to be here.

Kishore:
Today we are going to talk about engineering productivity and metrics, and specifically learn from your experience, Travis, working with engineering teams before and now with GitPrime, and more generally about meaningful insights for teams and business stakeholders measuring value and productivity. So let’s start with some background to establish the theme. On your blog you talk a lot about engineering impact, and we’re going to dig into that common theme. Hopefully, you can share your experience and talks on metrics-driven engineering management with our listeners. But first:

What has really changed in IT, and specifically in the software engineering world, to have this renewed emphasis on measurement?

Travis:
That’s a great question. I think one of the things that we’ve seen over the last 10 or 15 years is that almost every industry has moved towards quantification. If you’ve been in a board meeting, you’ll see Sales coming in, and they’ve got this set of KPIs that they’re tracking: lead response time, that kind of thing. Marketing has customer acquisition cost. Every department has a set of management metrics that they use to evaluate how they’re doing and evaluate productivity at scale. That’s super powerful. Engineering has been responsible for wiring up a lot of other industries for metrics-driven management, and yet it’s still the last frontier itself. Instead, you walk into a boardroom as a CTO and you’re really in there telling a narrative account of some stuff that shipped and some stuff that didn’t.

There are really no benchmarks to compare this quarter to last quarter. So that’s the problem we’re really focused on.

Kishore:
Gotcha. That sounds interesting, and it’s really a problem that a lot of engineering managers are dealing with. Give us some more insight into the way you’re looking at it from your experience.

Travis:
As with many products, it was initially a product born out of personal need. I was an engineering manager at a startup and started with a small team where we were very much in that garage-band mode, where everybody kind of knows what everyone’s doing. It was super-fast, and we made decisions collaboratively. We scaled that team up to 10, 15, 20 engineers, where we really started to feel the pain of not having hard data. The specific thing that you run into as you start to scale is that everybody can’t know everything anymore. When you get to that point it’s hard to communicate with stakeholders. You get people who walk into engineering, whether that’s physically in the room or metaphorically because you’re working remote, and say, “How is this feature going?” Answering that question can be super-painful if you have a really big team because it’s just not obvious.

There’s no way to look into a person’s last two weeks of work elegantly and say, “Oh, it looks like they’re making great progress.” Or, “Geez, they’ve been stuck for three days, and they’re just grinding their gears on a problem.” That was the impetus for the product.

Kishore:
A lot of things have changed in the way of doing software development, the life cycle itself as it were. We keep hearing about sprints, getting teams more agile, delivering value to customers, and validating that feedback.

What were some of the ways engineering teams organized and stayed retrospective with feedback loops before, and how has that evolved now?

Travis:
There was this funny state where you’d ask engineering for … Write up a big spec, ask engineering to go deliver on that, and then you wouldn’t see them for three to six months if you were lucky, and possibly longer. Then oftentimes the product came out looking radically different, because no battle plan survives first contact with the enemy and that sort of thing. Agile, generally pretty awesome, was the idea of compressing those feedback loops. I think the fact that that’s become an industry standard shows how valuable that is. The thing that Agile does really well is a lot of this forecasting and making sure that stakeholders are read into the decision-making and socializing decisions as they happen. So there’s not this massive disconnect between engineering and stakeholders.

The thing it has not yet provided is a set of metrics to say, “Okay, it looks like our team is doing well.” Most of the Agile metrics are very much based on self-reporting. So the story point is this currency that evolves over time for the team, but it’s weird to say, “Let’s go increase the amount of story points we’re doing.” Because everyone knows that the easiest way to do that is just over-estimate your tasks. Every time stakeholders outside of engineering come in and say, “Okay, we’re going to standardize on 40 story points per week,” or something crazy, it just doesn’t really go well, because that’s not what story points are intended for. And yet there is still this pervasive need to have some tractable way to have that conversation and say, “Okay, we want to take measured steps to improve our team. How do we do that?”

We think that that’s the next natural evolution of engineering management and how engineering gets feedback. The challenge there has always been that most of the easily harvested metrics about engineering are not super-valuable. There’s this thing that everyone’s worried about, which is that someone’s going to roll in there and count lines of code and use that as a proxy for value, and it doesn’t work. We all kind of know it doesn’t work, because sometimes the hardest change is a two-line change.

So the challenge here is how do we develop a set of metrics that can give us some accurate signaling for an engineering team and help us make changes that matter at scale. How do we show that if you have a stakeholder who wakes up in the middle of the night with a brilliant idea, a change to a product that you’re two months into building, that causes a lot of waste and a lot of damage to engineering’s ability to deliver? The software industry would be well-served by a way to quantify that waste. If you can approach that problem and say, “Look, we’re happy to make these changes, but you have to understand we just invested $200,000 of product progress on this. If we take a right turn here, we’re going to be $200,000 and two months behind. And here’s a picture that shows you this is based on facts.”

That’s really our goal here, is to have feedback loops that are really good both for inside of engineering and then for communicating across these organizational lines and helping people understand what the work of doing software development is like.

Kishore:
Travis, you mentioned engineering teams organizing for agility and doing a lot of retrospectives and feedback loops before, and how different it is now. Isn’t everyone already measuring against a common company goal? Let me rephrase that.

Isn’t everyone measuring against their company’s goal? Why measure engineering differently?

Travis:
There’s value in doing both. You definitely want to have everybody rallied around common goals. That’s independent of engineering or anything else, that’s just a good practice.

In addition to that, it’s good to have a bunch of levers to look at engineering and say, “Okay, here are the areas where our team is super-strong and here are the areas where we can improve.” I think without being able to measure engineering-centric metrics it’s just really hard to figure out how to make things better. That ends up having the net effect of being a little disempowering to the engineering department as a whole. So other teams are going into a boardroom with metrics that say, “Look, we have improved in the following three ways: 20% in this area, 15% here, and 10% here. And now we want more headcount.”

What that means is that it’s super easy for them to advocate for the ROI of staffing up their department. When you get engineering coming in there with only narrative accounts, and no real tractable way to show the ROI of the engineering team, the unfortunate side effect is that you’re basically depending on engineering managers to be politically savvy, which is not really where engineers are at our best, right? We’re at our best when we come in there with a bunch of facts and data and say, “We shipped a bunch of stuff, we are slightly under-staffed here. If you give us three more headcount you can expect us to ship this much more stuff.” That’s a powerful place for an engineer to be, because we’re talking about data and hard facts.

Kishore:
I see. You mentioned ROI. Just for the audience can you clarify what that is?

Travis:
Yeah. Return on investment. So if you’re considering a request from engineering like, “Hey, we need to go and do a bunch of refactoring for a month here. If we do that refactoring we’re going to move a lot faster.” Oftentimes what people hear there is the engineering lead saying, “Hey, we want to do stuff that you don’t value for a month.” Because this technical debt is really only felt by engineering; the pain is only felt by engineering, and it’s felt in our ability to deliver stuff quickly. But if you can put up a series of reports and show that the value of paying down technical debt over a month is that engineering moves a lot faster, is a lot more lightweight, and can deliver stuff after we do that. And again, show that curve of productivity increasing after having, say, a couple sprints to address a bunch of technical issues.

That’s super-valuable and allows engineering to go and make these agreements with the stakeholders, and maybe it’s with the product team. Whoever is really wanting more new stuff and say, “Well what you buy by letting us pay down technical debt for a month here is the ability to get more new stuff faster in the future.” And then being able to back into that afterward and show that yeah we are actually shipping stuff 30% faster like we said. Very, very powerful.

Kishore:
I see: some baseline and effort estimation, to be able to get to a point where you can justify factually what the return on investment would be for a particular effort, whether it’s rewriting a particular piece of code or refactoring existing code.

Travis:
Yeah, historical metrics are super-important just in general. In all of this stuff, the very first thing to do is go in there and say, “How do we develop a baseline for how we think about productivity?”

Oftentimes the initial step there is just very kind of brush-your-teeth, eat-your-vegetables-type metrics. Engineers theoretically want to be coding, and the organization wants them to be coding. So how many days a week is an average engineer on this team able to code? You’d be surprised that it’s not five, generally, right?

Kishore:
Yeah.

Travis:
There’s a lot of overhead that tends to go into engineering. It’s more like two or three. So looking at something like that and having the rest of the org say, “Oh my gosh. When we call an all-hands meeting with the engineers that has a cost.” And the cost is that they’re unable to do the thing which they’re amazing at.

I mean it seems ridiculously simple, but it’s very difficult to look at that. So what happens instead is that people call these all-hands meetings, everybody kind of groans and knows that it’s not really great for their own personal productivity. They’d rather be coding. But there’s really no way to advocate for, “Hey, maybe only call the people who are necessary in here instead of having a $100,000 meeting.” A little bit of data in situations like that goes a long way.
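
To make that “coding days” baseline concrete, here is a minimal sketch that mines it from a local Git repository. It is an illustrative approximation, not GitPrime’s implementation: it simply treats any day with at least one commit as a coding day.

```python
# Sketch: average "coding days per week" per author, mined from Git.
# Illustrative only -- treats any day with a commit as a coding day.
import subprocess
from collections import defaultdict
from datetime import datetime

def coding_days_per_week(repo_path=".", since="3 months ago"):
    log = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--pretty=format:%ae|%ad", "--date=short"],
        capture_output=True, text=True, check=True,
    ).stdout
    dates_by_author = defaultdict(set)           # author -> {YYYY-MM-DD}
    for line in log.splitlines():
        author, date = line.split("|", 1)
        dates_by_author[author].add(date)
    averages = {}
    for author, dates in dates_by_author.items():
        weeks = defaultdict(int)                 # (iso year, iso week) -> day count
        for d in dates:
            year, week, _ = datetime.strptime(d, "%Y-%m-%d").isocalendar()
            weeks[(year, week)] += 1
        averages[author] = sum(weeks.values()) / len(weeks)
    return averages

if __name__ == "__main__":
    for author, avg in sorted(coding_days_per_week().items()):
        print(f"{author}: {avg:.1f} coding days/week")
```

Run over a few months of history, numbers closer to 2 or 3 than to 5 are exactly the kind of baseline conversation-starter described above.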

Kishore:
I think that’s another experiential point of view: people think that measuring productivity is an art. There is a lot of science behind it, but there are also emotions and communication involved. I see your point about how meetings take over a lot of the time in which you’d actually produce real value, real code.

Travis:
Yeah. If you walk into an engineering team and say, “All right, how many people went to CS school?” Everybody raises their hand; we all raise our hand and say, “Yeah, we all went.” What you learn there is how to create great code. There is a lot of artistry to that facet of it. And then if you ask that same group of people, “All right. How many of you were taught how to be effective in a work setting? How many of you in here were taught, ‘Here’s how you advocate for stuff, here’s how you work well with another group of engineers, here’s how you deal with an obnoxious set of requirements like put a mood button in the app’?” Basically nobody has ever been taught any of that stuff. It’s really not part of the curriculum. It’s something you have to learn on the job, and those are the kinds of things that we like to surface: these people-centric metrics.

Like, “one of the biggest problems in this org is just that engineers are in a lot of meetings and they don’t get the opportunity to do a lot of code.” Big, broad things like that.

Kishore:
Right, and you have to justify or at least create metrics that show how one thing adds up to value with another.

Travis:
Threading that stuff all the way through to “and therefore this creates value for the business” is really where a lot of the work is.

Kishore:
Right. I actually went through your 2017 Software Developer Productivity Survey.

What were your main findings in that survey, and how did they generate insight into the various things that engineering teams should measure?

Travis:
Definitely. A lot of this was initially prompted by product development. We started with some very basic stuff, like: does software engineering even want metrics? There was an overwhelming response in the affirmative. Everybody wants metrics. They just want ones that are good and relevant. And if there were good metrics, they’d be very interested in knowing them.

One of the really key questions here was, “What slows you down?” If you’re a developer working in the industry, what are the top things that slow you down? A lot of the responses there were waiting for other people to do stuff. So human thread blocking. The bigger the organization, the more you see this kind of stuff, where a JIRA ticket goes into this holding state where you’re waiting for approval from legal or a sign-off of some sort. And maybe you switch focus and work on something else, but that actually soaks up a lot of time. Task switching is super-expensive. You can only track so many open items at a time.

So that was the number one thing that people replied with. Then there are a couple of others in here, like meetings and inadequate tool sets. By and large we see that a lot of what’s slowing engineers down is stuff that is not necessarily happening inside engineering but in those boundary lines and the handoff places between engineering and somewhere else. Having data to communicate that that’s what’s happening is pretty powerful stuff.

Kishore:
Right. And the point that you mentioned about waiting on threads: just like in software, we’ve got a lot of concurrency and sequential stuff that we build on. Is there some insight into how tools or engineering metrics can help surface those bottlenecks?

Travis:
Yeah. When you have an engineering team whose time is being soaked up by non-engineering stuff there’s really … That’s not felt organization-wide. It’s felt by the individual engineers who have to deal with that and then also get all of their deliverables done. If you can quantify that stuff and say, “Look, our engineering team spent 30% of their time in meetings” or whatever, a bunch of other overhead and have that conversation in a more objective way that tends to be … The situation tends to resolve itself. It de-politicizes everything, it’s no longer about somebody’s opinion of how they want to work, it’s about waste happening that nobody really wants. So again and again what we see here is that data is engineering’s friend. No matter what that data is, it favors engineering because engineers tend to be facts-driven people.

We’ve all sat in the meeting where, as an engineer, we’re getting out-talked by the suits. We know for a fact that something this guy is saying is wrong and there’s just no way to communicate it. There’s a great YouTube video to this effect where they’re sitting in a room and this guy is saying, “We need you to make seven lines all perpendicular.” There’s some engineer saying, “That doesn’t even make any sense.” Over and over there’s this theme where facts are very friendly to engineers, because that’s what we like to play with. It’s the tools of the trade. Engineering is this very concrete thing to do. And so the more data we can bring into the equation when we get asked these difficult questions, like why is everything late all the time? Engineers get asked that all the time. And frequently, the answer to that question is outside of engineering.

It’s not like everybody’s sitting there playing foosball. That’s sort of what everyone wants to think, and it’s just not true. The answer’s a little more subtle than that. It’s like, “Well, because we keep shifting our target. We’re not even clear down here what we’re building. The specs here are so loose that we’re just trying to go off of the last thing we heard.” Surfacing those kinds of replies to “why is everything late” with hard data is very, very effective for helping engineering play the role they want to in the organization.

Kishore:
Right. I think the other point that comes in here is the communication-driven, size-driven impact on how teams organize and deliver: the size of the organization itself. I’ve heard a lot of good things about organizing companies into different product units, or having very horizontal, layered teams with product and sales all part of the same track, making decisions faster. Everyone being in the same group at the same time can mean quicker decisions.

Even in DevOps, the feeling is that there is a lot of value in making sure that teams have a shared goal, so that people aren’t throwing decisions over the wall and accountability is clear. But at the same time, certain tasks get done faster when the teams are organized in the right manner.

How do you see that in the size of a company? Let’s take an example of a startup versus an enterprise.

Travis:
Sure. Well, when it comes to metrics, typically small teams don’t really need them. I mean, we’re a metrics company, so maybe it sounds weird for us to say that, but a really small team doesn’t … You’re not dealing at a point where that stuff’s relevant at all. As a team grows, the reason data starts getting introduced into the equation is typically that a team evolves a little bit beyond, say, five or six and there are team leads. We take one of the engineers, we sacrifice them. They no longer get to code and now they have to talk to people.

Kishore:
Very true, yes.

Travis:
It happens over and over. That person needs to have some way to do that. They need to solve two problems immediately. One is how do we communicate that we’re bringing value? How do we show people that work is happening down here before a feature ships? Because sometimes that can take a while. Then the second is how do I, as the team lead here, focus my efforts on a day-to-day basis? Do I just go randomly pick some commits or PRs to dig through? Or can I be a little more elegant and focus on the stuff that really matters? That’s sort of the first level of where companies emerge into needing data: to have a touch of elegance with regards to managing a team. What are the important things to pay attention to today? I’ve got two hours here to focus on helping this team succeed. Where do I put that focus?

As teams grow even more into bigger and bigger, to an enterprise setting, really the role the data plays there from what we’re seeing is large-scale change. So if you’ve got … Let’s see, you’ve got a new process. You want to roll out Agile. So you do it across thousands of engineers and then people want to know how’d that go? We invested a bunch of resources in rolling out Agile. We don’t have anything to report. We sort of report on our feelings like everybody seems to be happier, I don’t know. Some people are sad, some people are happy. That’s just not … It ends up sapping the ability to do any sort of large-scale transformation and engineering only gets small stuff. So what we see is that large teams use this when we want to go advocate for say, “Hey, we need to introduce an entirely new tool chain here.

“And we’re going to do that work on a pilot project, and we’re going to take these metrics and see whether or not that’s making any improvement,” right?

Kishore:
Right, yep.

Travis:
And then we’re going to come back and we’re going to say, “Okay, well, we got about a 5% bump in our ability to ship features. That seems good. The product’s cost is less than that, so we’re going to roll it out broad.” So that kind of big-scale change, whether it’s new tools or new processes, needs a feedback loop. That’s where productivity metrics come in.

Kishore:
We’ve established that visibility into engineering is a lot better when we actually treat it as a data science, when we measure with a baseline and then iterate upon it. It’s also good for large-scale transformations, like you mentioned, of going Agile as a company. And then with size the importance increases.

What do you think is most important from an engineering manager’s point of view when his or her team grows, as you said, bigger than five: to slowly start investigating the more critical baselines and then building up those … I wouldn’t say metrics yet, but at least the processes to get that cultural alignment?

Travis:
There are a lot of different starting points and there are a lot of different ways to be successful. One thing that’s generally true is that KPI-spam is bad. So the last thing you want to do is go find a bunch of stuff you want to improve and introduce all that stuff at once.

Kishore:
If you can clarify what is a KPI?

Travis:
Oh yeah, sorry. KPI, Key performance indicators. People say, “Hey, we’re going to target these areas for improvement and we want to see everyone be better by the end of the quarter,” whatever. So these KPIs, key performance indicators, get rolled out and say, “Okay, this is what we’re targeting for improvement.” That’s good, that’s a good way to use data. What makes that successful is focusing on one thing at a time.

The important thing to remember when using data for any kind of management or change effort is that it’s something everyone has to pay attention to in addition to their job. Everyone has a full job; they’ve got deliverables every day, and they’re responsible for all that stuff. Any sort of global improvement is another layer of overhead on top of that. So you don’t want to layer in 15 things that we’re going to improve by the end of the quarter, because everyone will just experience that as a total beat-down.

So what you want to do instead, and what we see successful teams doing, is picking one thing that will be a high-value way to improve stuff. I keep coming back to this example because it’s a good one: let’s get everyone out of meetings and back into the code. Let’s try that for a month, see how that goes, and see if that creates a change that we like. That could be a very simple kind of KPI. We’re going to try to maximize the number of days that everyone was able to actually go and do something in the code. Independent of whatever that is, check in some code, because that’s what we’re all here for and that’s what we like doing. I guess the broad counsel there is, no matter what size team you are, if you’re starting to do this data stuff, start really small. Introduce some single interesting thing, ideally one that everybody …

That addresses a pain that the entire team feels from people outside of engineering to engineers themselves. And then target that one thing for improvement. Once that’s successful, move onto the next thing but don’t go in there and spam out a bunch of initiative stuff all at once where it’s just going to create a lot of resentment.

Kishore:
Right. So I think it’s starting small, measuring it, and then iterating on those indicators to create a set of small KPIs.

Travis:
Yep. It’s just like going to the gym. Anybody can go to the gym but the reason you hire a personal trainer is so that they can go in there and say, “We’re going to work on your core first.” And not go in there and just have a million things to do. It’s really about breaking the stuff up into accessible chunks and making the problem, whatever the thing you’re working on something that is immediately addressable and tractable.

Kishore:
So Travis, we’ve talked about KPIs, and we also established how important it is to measure engineering productivity for small teams and large teams, and during transformations, to establish baselines so we can get a point of view on how we’ve improved over time. How do you measure engineering impact when we get into the data aspect of it?

Travis:
I think there are a lot of ways to go about that. There are a couple of interesting examples we can look at here. There’s this great Sufi story about how wisdom is not universal. If someone in the future came and they found a … What is that, the DSM-IV, that has all the prescriptions in it, and they looked at that and they were like, “This is all medicine,” and they didn’t have any context, they could basically start giving everybody medicine without understanding the disease. That wouldn’t work out really well.

A lot of what we advocate for is developing a set of metrics that really address the one or two things that your team is focused on. For example, we worked with one team that had this problem where it felt like all the engineers were stumbling over one another.

When we looked at their data, what we found is that there’s this pattern with a culture of shipping monoliths. So an engineer would go work on something locally for a while, they’d have it out for a week or two, and then they would check it in as one massive commit, which would basically create a merge conflict for every other engineer on the product. That was super-ineffective, because you get a bunch of operational overhead, people kind of stumbling over one another. We worked with those guys to say, “Look, you need to move more towards this taking-smaller-bites methodology: let’s break all our work into pieces, check in five, six, seven, eight commits a day, push that stuff so that other people can rebase on it, and have break points here that provide a reasonable way to back out if you run into a big conflict.”

Integrate your work more often. Programmers communicate at the level of code pretty frequently, and so if there’s this big gap between when you’re communicating “this is what I’m doing”, literally at the level of code, not with words, that can be super-problematic. So when we go in there we’ll look at: are people just checking in these whopper commits all the time? Is that something this team needs to improve, or are they doing pretty well with granular delivery? Granular delivery tends to be a really big theme: pushing stuff up to where other developers can see it and comment on it the way that people do code review. Are large bodies of work getting sufficient code review, or is that stuff getting rubber-stamped? Those are all metrics that we tend to advocate people look at, picking the one or two that can address something very immediate for the team.
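
Granular delivery is one of the easier themes to check from history alone. The sketch below, again an approximation and not GitPrime’s method, flags unusually large (“whopper”) commits from `git log --numstat`; the 500-line threshold is an arbitrary placeholder a team would tune against its own baseline.

```python
# Sketch: flag "whopper" commits by total lines changed.
# Illustrative only; the threshold is a placeholder, not a standard.
import subprocess

def large_commits(repo_path=".", since="1 month ago", threshold=500):
    """Yield (sha, author, lines_changed) for commits over the threshold."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--numstat", "--pretty=format:@%h|%ae"],
        capture_output=True, text=True, check=True,
    ).stdout
    sha = author = None
    changed = 0
    for line in log.splitlines():
        if line.startswith("@"):                 # commit header line
            if sha is not None and changed > threshold:
                yield sha, author, changed
            sha, author = line[1:].split("|", 1)
            changed = 0
        elif line.strip():                       # numstat: added<TAB>deleted<TAB>path
            added, deleted, _path = line.split("\t", 2)
            if added != "-":                     # "-" marks binary files
                changed += int(added) + int(deleted)
    if sha is not None and changed > threshold:
        yield sha, author, changed

if __name__ == "__main__":
    for sha, author, n in large_commits():
        print(f"{sha} {author}: {n} lines changed")
```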

Kishore:
Gotcha. So this is a more iterative approach: continuously check in small amounts of changes, keep building upon them, and keep integrating with the team.

How should a manager introduce data to their team?

Travis:
So there tends to be a first phase which is largely observational. The first phase is just to say, “Look, we’ve got this tool. It provides so much data about engineering; it’s just data. Let’s look at it.” The first step is to get a baseline for where things are at. So the first step towards using data to make life better is not acting on the data: taking an observational stance and not trying to go in there and make a lot of assumptions. Once you’ve got this feel for where the team’s at, what ‘normal’ looks like, you can say, “Look, it seems like this would be a good area for improvement. What do you guys think?” We’ve seen a bunch of teams roll this out, and the best way to use data initially is just to help engineering, as a team, get something it wants.

That could be, “Hey, we want better requirement docs.” We hear that one a lot. “We want to use this data to help people understand that a more fleshed-out requirement doc leads to a faster implementation.” Something like that is a great place to start.

Kishore:
Right. Does that vary between the engineers on the team by the role they play: junior engineers, senior engineers, team leads, managers?

How does that visibility matter to a junior engineer vs. a manager?

Travis:
We focus a lot on team leads, largely because they have people problems, which is a lot of what we focus on. Meaning it’s very hard to figure out how to provide value as a team lead to 20 people. Who needs your attention today? If you think about walking into a sprint or a daily standup, everybody in the daily standup tends to say, “Here’s what I’m working on.” But there’s no real visibility into whether that person’s stuck and has been grinding on a problem for three days or whether they’re making a lot of progress. Having a data layer on top of that to say, “Hey, look, do you need a hand here? Can I jump in and get you some resources? Can I rubber-duck this problem with you?” That’s pretty valuable. As an engineering lead it can be challenging to figure out how to be a good actor, because you have this double bind.

Interrupting an engineer while they’re working is super-costly. Like you’re holding this glass palace in your head of abstractions and trying to figure out how to write that up. And then someone shoulder-taps you and says, “How’s everything going?” Very disruptive. But not interrupting somebody who’s been stuck for a couple days and could really use someone to talk over the problem with is also fairly disruptive. So when you’re a team lead it’s just … It’s very hard to calibrate on how to be a good actor. Jira just tells you where people are oriented and it’s a valuable tool to get a read on what everyone’s doing but it doesn’t really give you any insight into the engineering pipeline and the flow of work. That’s really where we think that metrics are needed.

Kishore:
You mentioned JIRA here, and that’s a very important tool. A lot of teams are using not just JIRA but also very similar task-tracking, bug-tracking, and story-tracking tools in more and more Agile teams today. How does that translate into some of the baselines that the lead can get when it comes to standups and tracking the sprint delivery itself?

Travis:
I mean I think it’s an essential tool at this point. I don’t know if we’ve ever run into a team that doesn’t use some form of issue tracker.

Kishore:
But you mentioned that there is no proactive visibility into how the sprint itself is going and if there are challenges?

Travis:
I think a good metaphor might be that Jira is like the navigation in your car. It tells you where you’re going, gives you an estimate on when you’re going to get there, gives you some directions sometimes. It’s navigation for where you’re headed. It’s not the speedometer or the fuel gauge or the temperature gauge to see if the engine’s heating up.

All of those are also needed if you’re going to be in the car driving it. That’s how this stuff plays into the team. Jira is the nav, it shows where we’re headed, it shows when we expect to get there. What it doesn’t do is allow you to figure out if the car is healthy. If it’s low on fuel. Are people getting worked too hard here? Are we under-staffed? That kind of stuff is just … It’s not really a Jira problem.

Kishore:
Well, that’s pretty powerful, for a software-defined data metric to work with software engineers, managers, and team leads and give that insight. I’d love to dig deeper into how you get to that meaningful data, beyond the tools that are used day-to-day in our software development life.

Travis:
As an example, one of the things that we look at is code churn. Code churn is something that we mine out of Git. Let’s use a non-code example here first. Imagine you write an email and you look at the email and you’re like, “This is no good.” So you throw it out and you write another email, a second draft. You look at that one and you’re like, “This is also no good,” so you throw that out. You write a third email, it’s good, so you ship it. We would say that three emails were written, one email was shipped, and there’s this delta of two churned emails, which is interesting. And so we look at that at the level of code. When engineers are rolling along, there’s just a normal level of rewriting your own stuff. You try something, it doesn’t quite work out.

But then if you see an engineer sort of bobbing along and then there’s a massive churn spike, that’s generally a leading indicator of a problem. Because if they’re going along, working in the codebase, and all of a sudden they’re spending a week refactoring stuff that they just checked in, that’s usually a sign that something is wrong. If you look at that and you say, “This person is not really prototyping a new feature, so what are they working on? Are they having a hard time getting this thing in? Is it a really loose definition of what the problem is?” So we look at that. You could think of that as the engine heat gauge: how hot is this particular leg of development running? Another example would be we go back in the codebase and we look globally: if you take a team of, say, 50 people, what percentage of codebase changes are aimed at refactoring legacy code?

And again, there’s just this normal baseline; teams tend to have a normal amount of “you’ve got to go in there and tweak some old stuff while you ship a new feature.” But if that starts to climb above about 50% you run into a really dangerous situation, because if over half of your focus on the codebase is refactoring old stuff, that’s just going to snowball until all of it is refactoring old stuff. And so what happens is teams get into this weird cycle. They start throwing headcount at the problem because they want to ship stuff faster. But if you can’t keep the level of refactoring legacy code below about half, that becomes problematic, and really the right thing to do there is to set everything down and do a large architectural refactor, so that it doesn’t become this death-by-a-thousand-cuts cost that everybody has to pay forward as they try to ship new stuff.
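
The churn idea can be roughed out from the same `--numstat` output. A faithful implementation would attribute each deleted line to the commit that introduced it (for example via `git blame`); the crude proxy below just compares deletions to additions per author per week and flags weeks where rewriting dominates net-new work.

```python
# Sketch: a crude weekly churn signal per author from Git history.
# Real churn analysis traces deleted lines back to the commits that
# introduced them; this proxy only compares deletions to additions.
import subprocess
from collections import defaultdict
from datetime import datetime

def weekly_churn(repo_path=".", since="3 months ago"):
    log = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}", "--numstat",
         "--pretty=format:@%ae|%ad", "--date=short"],
        capture_output=True, text=True, check=True,
    ).stdout
    totals = defaultdict(lambda: [0, 0])   # (author, year, week) -> [added, deleted]
    key = None
    for line in log.splitlines():
        if line.startswith("@"):
            author, date = line[1:].split("|", 1)
            year, week, _ = datetime.strptime(date, "%Y-%m-%d").isocalendar()
            key = (author, year, week)
        elif line.strip() and key is not None:
            added, deleted, _path = line.split("\t", 2)
            if added != "-":                # skip binary files
                totals[key][0] += int(added)
                totals[key][1] += int(deleted)
    return totals

if __name__ == "__main__":
    for (author, year, week), (added, deleted) in sorted(weekly_churn().items()):
        ratio = deleted / added if added else float("inf")
        marker = "  <-- possible churn spike" if ratio > 0.8 else ""
        print(f"{year}-W{week:02d} {author}: +{added} -{deleted}{marker}")
```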

Kishore:
That metric is pretty powerful, if you can actually start looking at code churn, as you mentioned, through the commits that engineering makes against a particular story.

Is it just code that you look at, or are there other behaviors that could be tracked through the release frequency or other meaningful data?

Travis:
In that example, it’s just code. Churn is specifically when I, as a developer, am rewriting stuff that I just checked in. There’s this notion of authorship — whose code is it? But that’s largely mined from version control. We do also look at a couple of other sources of data currently, and some interesting R&D stuff we look at a lot more. We also look at the data from the Git host, so we’ll look at things like how many people are being pulled in for code review. If we look at a team of 50 engineers or something and we can see that there are two or three engineers who are always reviewing every PR, that’s a huge bottleneck. That happens quite frequently. You have these senior engineers who everyone feels like they have to have that person’s blessing. So spreading that out is pretty valuable.

Again, it has nothing to do with the nuts and bolts of software engineering. It’s more the operational and people side. But it’s a huge opportunity for improvement.
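
Spotting that kind of review bottleneck is mostly counting. Here is a minimal sketch against GitHub’s REST API; `acme`, `widget`, and `YOUR_TOKEN` are placeholders, and a real analysis would paginate and weigh review depth and latency, not just raw counts.

```python
# Sketch: count how often each person reviewed the most recent PRs.
# Placeholders: repo owner/name and token. Illustrative, unpaginated.
from collections import Counter
import requests

def review_counts(owner, repo, token, n_prs=50):
    headers = {"Authorization": f"token {token}"}
    prs = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/pulls",
        params={"state": "closed", "per_page": n_prs},
        headers=headers, timeout=30,
    ).json()
    counts = Counter()
    for pr in prs:
        reviews = requests.get(
            f"https://api.github.com/repos/{owner}/{repo}"
            f"/pulls/{pr['number']}/reviews",
            headers=headers, timeout=30,
        ).json()
        # Count each reviewer once per PR, however many review rounds.
        counts.update({r["user"]["login"] for r in reviews if r.get("user")})
    return counts

if __name__ == "__main__":
    for login, n in review_counts("acme", "widget", "YOUR_TOKEN").most_common():
        print(f"{login}: reviewed {n} of the last 50 PRs")
```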

Then we also look at ticket data. We pull in data from Jira and we typically use that to think about things … This is not quite shipped to production yet, but I’ll leak it. We have one that we’re working on currently called Ticket Jitter, and this is a way for us to quantify how much change happens to an issue, say a feature that you’re building, while that issue is being actively developed. We tend to think of this as bad: the perfect implementation path is you get this requirement doc that answers all the business logic, you go on coding along your merry way, and at the end the feature ships and everybody’s happy with it.

And any deviation from that is a little bit of brain damage. The most brain damage is if you’re working on a ticket while the stakeholder on that ticket is actively modifying the goal, the requirements and moving the goal posts. One of the things that we look at when we look at ticket data is the way to quantify that and say, “This ticket had a lot of jitter and that’s why it’s late. It’s not late because the engineer was screwing around or we didn’t prioritize it or whatever. It’s late because the goal posts here were a moving target and we just kept coding towards that moving target the entire time this ticket was in process and that’s super-wasteful. And the number of times it changed was 50 deviations in here, just a ton of commentary, it was a ton of clarification. So if we want to ship stuff faster what we need is for that stuff to be resolved before work starts.”
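
Ticket Jitter itself is unreleased, but the underlying idea can be approximated with the Jira REST API changelog. The sketch below is an outside approximation, not GitPrime’s metric: it counts edits to an issue’s summary or description while the issue sat in an “In Progress” status, and the site URL, issue key, status name, and credentials are all placeholders.

```python
# Sketch: approximate "ticket jitter" -- requirement edits made while an
# issue was actively in progress. Not GitPrime's actual metric.
import requests

REQUIREMENT_FIELDS = {"summary", "description"}

def ticket_jitter(base_url, issue_key, auth):
    issue = requests.get(
        f"{base_url}/rest/api/2/issue/{issue_key}",
        params={"expand": "changelog"}, auth=auth, timeout=30,
    ).json()
    histories = sorted(issue["changelog"]["histories"],
                       key=lambda h: h["created"])    # oldest first
    in_progress = False
    jitter = 0
    for history in histories:
        for item in history["items"]:
            if item["field"] == "status":
                in_progress = item.get("toString") == "In Progress"
            elif in_progress and item["field"] in REQUIREMENT_FIELDS:
                jitter += 1                           # goalposts moved mid-flight
    return jitter

if __name__ == "__main__":
    auth = ("bot@example.com", "API_TOKEN")           # placeholder credentials
    print(ticket_jitter("https://example.atlassian.net", "PROJ-123", auth))
```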

Kishore:
There’s a pretty powerful correlation between commit frequencies and the ticket data they connect to. In fact, that leads into my next couple of questions: how do you build context for such an insightful metric for the stakeholders that are involved in making higher-level or lower-level decisions for the team?

I’d love to drill down more on a bunch of examples where you mentioned Git, you mentioned Jira, you mentioned ticket data that would probably come from something like Zendesk.

How does all of this get aggregated in a way that provides insight around how we can improve?

Travis:
I mean, the shorter answer here is pictures. We do a lot of work in D3; data vis is one of our main focuses. I think one of the things that we as people who work in engineering kind of undervalue is the power of high-visibility stuff. Over and over we see value in people who do regular updates that are visual. If you put together a release deck, and let’s say there’s a ton of work that goes in there, but you screen-cap the noticeable changes in the app and you push that deck out and say, “Here’s what we shipped,” and you do that frequently, that is so valuable.

So we take a similar approach when thinking about productivity. We believe that the best way to communicate that stuff is with a picture. If you’ve got a bunch of little bars and then you make a change and you’ve got a bunch of bigger bars that tends to be good.

Or if you can say, “We hear these stories from engineering teams where someone walks into engineering and says, ‘I need a third of your capacity for this project. I’ve got some imperial feature …'” Imperial features are something that we use internally to talk about a feature that has a very powerful stakeholder that just comes in and says, “We’re building this now.” That happens all the time. If you have someone that walks into engineering and says, “I need a third of your capacity to build out this feature.” Then you say, “Okay, great.” You put some people on it and then two months later you’re sitting in a meeting and that guy’s going, “Well you didn’t really give us a third and that’s why we didn’t get there.” If you can pull a picture and say, “Actually we gave you 36.9% of our total capacity measured across these three different ways of measuring capacity, so we did deliver here. We’re putting an appropriate amount of attention on it.”

Again, data has this effect of de-politicizing everything. That’s just delivering data insights through pictures and showing this is the work moving through the pipe. That’s a lot of how we focus on making this accessible to less technical people.

Kishore:
That’s a very powerful statement there: data de-politicizes everything. There’s so much context in which that applies to pretty much everything that we do as human beings, not just in the software industry but in overall business process management.

This was powerful. I’d love to put up some links, if you can provide them, about the dashboards and graphics that aggregate this type of data, maybe a demo video. A lot of engineering teams, managers, and individuals who care about and are going through this kind of impact would love to have these.

Travis:
Absolutely.

Kishore:
So Travis, thanks a lot for sharing the aggregate value and the graphics that you can pull from such an insightful data stream, collected from different data sources on a day-to-day basis. These are all tools and metrics and processes that we run through, but this is really very valuable data that helps make decisions.

Travis:
That’s what’s been driving us. Our big goal here is always to thread the needle between all this data, which is super-noisy, and business value. How do you correlate that with business value?

Kishore:
Right. There are a lot of times when a refactor or rewrite is an afterthought, or it’s already happened or happening. By the time you discover why things are late or so complex, you have already ended up starting on that journey. It helps if we can actually cover some of these examples on a proactive basis.

What else do you think an engineering manager can communicate around to be more proactive with historic context?

Travis:
One of the things we’ve focused on here is how you take this giant wealth of data, which is not inherently all that valuable, get rid of all the noise, and find the signal in there. That stuff is pretty challenging. The thing that businesses always want to know is: thread the needle between all of this Git data you have and show me where the business value comes in. There’s always this disconnect between what engineering is doing and how that translates to business value. An example of that that we always refer to internally here is “The Green Button Problem.” Let’s say you throw in a Jira ticket and you’re like, “I want all the buttons on this site to be green now.” You look at it and you’re like, “Yeah, that’s probably …” I don’t know, whatever. A little CSS change; maybe it takes half a day. So that work gets greenlit.

And then an engineer picks up that ticket, and because they’re a good engineer they’re like, “Well, the right way to make all these buttons green is to go rewrite the entire templating engine while I’m in there. Because that’s way overdue and we need to do it.” The reason they have that impulse is because good engineers want to get in there and generally improve the codebase for everyone. But there’s often a massive delta between the ticket that an engineer picks up, how they approach it, and what was greenlit. So a lot of what we provide is the ability to catch that stuff while it’s happening, while it’s in the pipe. If you look in there and you’re like, “There’s this JIRA ticket that’s like, I turn the buttons green, and I can see that that guy’s in there doing a bunch of refactoring of legacy code and has been for two days. That doesn’t really match with my experience of turning buttons green.”

So being able to catch that stuff before you’re three weeks deep into this massive template-engine refactor and people are like, “How the hell did we get here?” That’s super-valuable. Again, all of this stuff is everybody following the things that make them great. What makes engineers great is the ability to tackle the real problem, not just the one that they were asked to tackle. And then what makes a team lead great is the ability to get in there and say, “Well, look, I get it. Let’s go see if we can get some resources to address this templating thing, because it’s a nightmare. But not right now. Today, make the buttons green.” That prevents a lot; it’s like problem interruption. You don’t get to this point where people are dissatisfied with engineering, and you generally have just a much healthier dialogue.

Because people are able to communicate about the actual value of a task and what that looks like in the engineering pipeline.

Kishore:
I see. The technical debt aspect also comes up most often in these conversations when people have historical context in the team. I’m especially interested in knowing how this helps new engineers when they come on board, as we move towards the digital and millennial context here: to understand what is more productive at this point in time, versus, sometimes, what is the right thing to do.

Travis:
I mean one of the things that’s kind of nice is the ability to look at something that happened a long time ago as a reference point. If you’ve been on a team for more than about a year or so you all remember that period where … I don’t know. Maybe you had to deal with someone else’s code and clean it up and it was a nightmare for a month. Being able to dive into data forensics and go back and say, “What did that look like? What does it look like when our team gets tangled up in a really deep refactor,” from the perspective of all these metrics. If you can go back and find these canonical examples for your team of a month of brain damage in March because we all got caught up in getting rid of Angular and moving to React or whatever it is then you can get a feel for what that pattern looks like when it starts to spin up again.

You can head those things off and say, “Look, it seems like we’re getting ensnared in another one of these issues. Let’s figure out whether or not we actually want to tackle that now or just get this deliverable off and then do it in a month,” or something.

Kishore:
Also, you mentioned something about gamifying for the sake of providing rewards at an individual level. That’s something I’ve seen before with CI tools, and how different badges get accumulated for unit test coverage and for how frequently you’re checking things in. The only thing I was looking at in that sense was whether this measurement continuously improves the team as a collective, versus an individual with some more positive reinforcement. You clarified that it’s more of a collective team effort and you want to make sure that you’re making these decisions together with factual data.

Travis:
Our general approach here has not been so much to focus on the individual engineer. We don’t want people obsessing about this stuff. In fact, we’ve strayed away a little bit from any kind of gamification for the individual. It’s sometimes the beginning of a disturbing trend; it’s a thing that kind of lends itself to abuse. You don’t want to infantilize your workforce. Games are awesome, I love games. But there’s this other component of work which is distinct from that, where everyone’s trying to rally around a work goal. It doesn’t need to be dressed up as a gamified metric. So we’ve tended to resist that and focus more on this question: if we presume that there’s a manager or a team lead here who really wants to empower their team, find out what’s blocking them, find out what’s holding them back, and get rid of all that stuff, what metrics would we build then?

That’s entirely where we focus. To your question, a new engineer coming into a team with really robust metrics would have a lead with a pretty darn good idea of, one, where they could jump in and provide value: “Here’s where our team is a little weak and we need you to shore that up.” And two, how to onboard them quickly. They would know what good onboarding looks like, because they would have this history of people who were onboarded well, what that looked like, what areas of the codebase they were in, how we did that. A lot of how the individual engineer experiences value in a data-driven management setting is by managers doing less damage. Maybe that’s the take-home point here: ideally what we’re doing is improving a manager’s ability to get out of the way and get other obstacles out of the way on behalf of the team.

So it’s not really a direct consumption of value by an individual engineer because that stuff can get a little weird. If someone comes in and says, “Hey, what I really need you to do is check in ten more lines of code on average every day.” Like, “Okay. I’ll make my methods longer.” That’s not really the pattern we want to encourage.

Kishore:
Totally. I think earlier in the session we were talking more about how small teams and large teams look at metrics differently, look at productivity in a different manner. There’s also a lot of data that gets generated with tools and processes as individuals create more and more value-driven sprints. How does the overall model work when the team itself is fully remote?

Are there certain specific examples that you can bring on for just remote teams?

Travis:
Yeah, definitely. The way that the individual narrative gets brought in there is asking questions like, “Well, when is Bob kicking butt? When is he at his best, and how do we get him to do more of that stuff?” Because generally that tends to be a really good outcome for everyone. We tend to be a big fan of the idea that remote can be just as successful as on-site. The reason for that is that just because a developer is sitting in a chair doesn’t mean you know anything at all about what they’re doing. A developer sitting in a chair staring at a screen surfing Reddit looks exactly like a developer sitting there cranking code. This impulse to get everyone in a room is not necessarily inherently valuable. The real value is orthogonal to that, which is defining things really well.

Remote teams benefit a lot from metrics-driven environments because you can answer for deliverables. If it’s very easy to go, “Look at what the guys over in Europe did before you got up,” and it’s just trivial, that’s great. Without a metrics tool that can be very hard to do. If you wake up and you’ve got a team of 20 and they’re all in Europe and you’re in San Francisco and you want to figure out, “What did they accomplish today?” That’s actually just a hard thing. It’s a very simple question; it should be trivial to answer, but it’s not, because the tool set that allows you to get a feel for that, the radar, doesn’t really exist. So that’s the value more than anything else that remote teams experience from data-driven stuff. It just gets incorporated into the workflow, and I think remote teams and on-site teams benefit about equally from having fewer interruptions and having data be the thing that drives everybody’s sense of whether or not progress is happening.

Oftentimes remote teams run into this challenge of, “I just can’t tell if they’re working.” I think that’s what everyone worries about with remote teams. Are they doing anything over there? If you’ve got a set of data that says, “Yes,” that’s great; that just totally solves the problem.
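
At its simplest, the “what did the team in Europe ship overnight?” question is one Git command. A tiny, purely illustrative wrapper:

```python
# Sketch: per-author summary of commits since yesterday (git shortlog).
import subprocess

def overnight_summary(repo_path="."):
    return subprocess.run(
        ["git", "-C", repo_path, "shortlog", "--since=yesterday", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout

if __name__ == "__main__":
    print(overnight_summary() or "No commits since yesterday.")
```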

Kishore:
That also leads us into a pretty interesting view on metrics overall for engineering teams. We talked about engineering impact in general. There are a lot of measurable business objectives, or OKRs, that teams own from larger company goals. I want to understand more about how metrics play a role for these different roles.

How does the VP of engineering see that metric, versus HR or a business stakeholder that basically owns a revenue number?

Travis:
Currently we serve engineering, and so our view of the appropriate consumer for data about engineering is somebody within engineering. What we’d ideally like to see is that if HR is brought into the equation, it’s because the VP of engineering brought them in, not because they’re self-serving on a bunch of data. What we like to see is engineering-led discussions about engineering data. That tends to be a pretty healthy pattern. I mean, it doesn’t even matter if it’s engineering-centric. You don’t really want your HR team going in there and staring at your sales data, otherwise they’ll start to, I don’t know, think nasty thoughts about the week or something. There are some great HR use cases, but in our view that stuff tends to be very big-picture strategic stuff and not tactical metrics.

So the ideal way to present that is more in the boardroom context where someone from engineering has gotten together a bunch of very useful data, put that stuff together in a consumable slide deck and then it’s shared with the rest of the org. As opposed to somebody from outside engineering self-serving data about engineering.

Kishore:
Right. That’s a pretty powerful way to take some of these engineering metrics into a more strategic viewpoint when generating, say, factual and reasonable arguments.

Travis:
Definitely. We have this Impact metric, which you’re referring to, which we’ve developed to take a lot of different stuff from the codebase into account. What we’re measuring there is a proxy for movement in the codebase. How much code are we pushing around? Not just lines of code, but are we editing a lot of different stuff here? Are these big, broad changes? Are we hitting fundamental parts of the codebase? So when we look at that and we look at the overall activity of engineering, we can use that as a bellwether metric to see how successful engineering is in moving the codebase forward, just in bulk. Using that as a bellwether metric is very effective. The way that that’s used typically is people will go in and advocate for a particular tool chain and say, “Look, we need to bring in Circle CI and whatever else.

“We need to bring in this entire tool chain and if we do that we’re going to get a massive bump in our ability to ship stuff.” If you go in there and you measure something like engineering impact and show, “This is before, this is after,” typically what we see is when people have the ability to measure engineering work in aggregate and take measured steps to improve it we typically see about a 20% boost in engineering productivity over the first six months. It doesn’t really matter how you measure that boost. You could measure it in less employee attrition. In general if you target something very specific the effects of that are spread across whatever KPIs you may be looking to improve.
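
As a loose stand-in for the idea behind a bellwether “movement” metric (not GitPrime’s actual Impact formula), one can combine volume with breadth: lines changed plus distinct files and top-level directories touched per week.

```python
# Sketch: a crude weekly "movement in the codebase" signal -- volume
# (lines changed) plus breadth (distinct files and top-level dirs).
# A rough stand-in for the idea, not GitPrime's Impact formula.
import subprocess
from collections import defaultdict
from datetime import datetime

def weekly_movement(repo_path=".", since="6 months ago"):
    log = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}", "--numstat",
         "--pretty=format:@%ad", "--date=short"],
        capture_output=True, text=True, check=True,
    ).stdout
    weeks = defaultdict(lambda: {"lines": 0, "files": set(), "dirs": set()})
    key = None
    for line in log.splitlines():
        if line.startswith("@"):
            year, week, _ = datetime.strptime(line[1:], "%Y-%m-%d").isocalendar()
            key = (year, week)
        elif line.strip() and key is not None:
            added, deleted, path = line.split("\t", 2)
            if added != "-":                # skip binary files
                weeks[key]["lines"] += int(added) + int(deleted)
            weeks[key]["files"].add(path)
            weeks[key]["dirs"].add(path.split("/", 1)[0])
    return weeks

if __name__ == "__main__":
    for (year, week), w in sorted(weekly_movement().items()):
        print(f"{year}-W{week:02d}: {w['lines']} lines, "
              f"{len(w['files'])} files, {len(w['dirs'])} top-level dirs")
```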

Kishore:
Have you seen engineering, especially with the impact metric, go in and cause a strategic change in direction, be it a technology evaluation, a change in product, or anything that affects how the company looks at revenue-generating value?

Travis:
Well, we typically stay away from things like code complexity metrics. Our focus is really on the people. What we want to do is provide a deeper view into the team as a group of people moving work through the engineering pipeline. So there’s a little bit of automation around using this as a radar tool: where should I be focusing my attention? We don’t tend to serve up a ton of automated alerts, because that stuff becomes noisy really fast. Instead, we give people a lot of data at their fingertips to answer questions as they come up: How’s my team doing? Where can I weigh in today? We view ourselves as a tool for amplifying engineering leaders. We don’t really want to automate away management; we want to empower it and help it do good things.

Kishore:
That sounds like an important company-level or strategic metric that can easily be influenced by these engineering metrics, so thanks for sharing that example. You mentioned the systems and data sources you collect information from. I’ve seen a lot of automation done as part of the software development pipeline itself. A lot of the process and release management effort at this point in time goes toward removing human decision-making from the more day-to-day, repetitive tasks.

Where exactly is automation coming into the picture here as you make sense out of all this data?

Travis:
With regard to automation, our goal is less to automate away a particular part or function of engineering, and more to give engineering leaders power tools to do their job better: to make it easier to gather the data they want and, generally, to do good work.

Kishore:
Right, thanks. I think we’ve covered a good range of metrics and data sources, and how small, medium, and large engineering teams can use impact metrics to inform both tactical and strategic decisions, for the engineering teams as well as their organizations. Do you have any best practices for engineering and business teams that have started measuring value or improving efficiency? I actually put more weight on the word effectiveness than on efficiency alone. Do you have any such best practices from your learnings?

Travis:
Yeah. Again, I think the biggest thing is to start small. There’s that old joke: how do you eat an elephant? One bite at a time. Over and over, what we see is that people who focus very narrowly on one thing to improve first do much better, especially when you introduce a bunch of data, than people who get distracted and focus on five things at once. It’s also important to keep that compassion piece and remember that if you’re saying, “The team as a whole needs to improve on this one metric,” that’s something people need to pay attention to on top of their job, which is shipping deliverables and writing software and all that stuff. A manager cannot expect an entire team to track all the stuff that they’re tracking. It’s just not fair. It’s the team lead’s job to track all that stuff.

A lot of times the individual contributors on the team have their own jobs. So you really want to keep all this change work as lightweight as possible for the team, and the way we see people doing that well is narrow focus. Do it iteratively: have a cycle where you focus on one thing, get your pattern going there, then focus on the next thing. Don’t create more overhead for your team.

Kishore:
I see. I think we got some good best practices here, Travis. I wanted to run back through the goals and see if there’s anything else we should add that I haven’t covered, or anything else from your experience you’d point people at. We discussed the myth that engineering productivity and value is purely an art. We looked at how some of the challenges teams run into with delivery pipelines and measuring feedback loops can be addressed with hard metrics that are systems-, process-, and tools-driven. You also covered engineering productivity and metrics for different team sizes, how KPIs can evolve as the team grows into different areas of an organization, and some best practices for leaders working to improve the effectiveness of the processes being put in place. Is there anything specific to software delivery that you want to add?

Travis:
I mean not really. We covered a lot of ground there. I think we did a pretty good job.

Kishore:
Awesome, thanks a lot for that. We covered a lot of current and evolving trends here, and I’ll put in some links for the audience so they can see demos of how the metrics get dashboarded, and the graphs that can show aggregate value and help with making insightful decisions.

Travis:
Definitely. In fact Ben probably has some really good resources for you. You should follow up with him.

Kishore:
Just for the audience, Ben is also a co-founder at GitPrime, where Travis works. What’s your go-to reference for all things engineering? Where do you go to learn more about what kind of data to dig into, and to keep up with evolving trends?

Travis:
Shameless plug here: I actually get a lot of that from our newsletter, which is pretty good, if you want to check it out. We don’t shamelessly plug ourselves in it or anything; we roll up industry trend data, plus articles from other people and whatever’s hot at the moment, and ship that in a newsletter. I read that thing religiously. Other than that, it’s a lot of the same things everyone else is doing. I’m just looking at Hacker News and that kind of stuff.

Kishore:
I’ve put the link to the newsletter here. I know that you blog on GitPrime. Is there any other way the audience can reach out to you?

Travis:
Yeah. If you want a little tour, we’re happy to do that; you can grab a demo on our site, and we’ll provide a bunch of links here. And I’m always on Twitter. If you want to hear from me directly, I’m @traviskimmel.

Kishore:
I’ll put your handle there. Thanks a lot, Travis, for sharing this insightful journey with us on engineering metrics, the art of measuring them, and improving on them. I’m sure everyone has learned a lot from this show. So thank you.

Travis:
Absolutely. It was great talking with you. Thanks for your time.

Kishore:
Thank you. For Software Engineering Radio this has been Kishore Bhatia. Thanks.

Thanks for listening to SE Radio, an educational program brought to you by IEEE Software Magazine. For more about the podcast, including other episodes, visit our website at SE-radio.net. To provide feedback, you can comment on each episode on the website, or reach us on LinkedIn, Facebook, Twitter, or through our Slack channel at SERadio.slack.com. You can also email us at team@se-radio.net. This and all other episodes of SE Radio are licensed under the Creative Commons 2.5 license. Thanks for listening.

Brook Perry

Brook is a Marketing Manager at GitPrime. Follow @brookperry_ on Twitter.
