Sometimes we’re asked how we build new reports or get the data to create them, so I thought I’d share my perspective on the process.
Creating a new report at GitPrime usually starts with a customer request that gets fleshed out into a typical user story. Or, alternatively, our CEO Travis says, “Wouldn’t it be cool if we could see from the data…”
My background is in data analytics, not software engineering. This can be an advantage because I have no preconceived ideas about the data we’re looking at, and no reason to bias the results we find.
Once we have a report concept, I pull several sample repos and work in RStudio to break, sort, and reconfigure the raw data into files and CSV worksheets that let me look at the work patterns of each engineer. The data stays anonymized, insulating this work from any specifics about the developers themselves. With these files I’ll compute averages and percentages to piece together what happened in the project over the last six months.
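As a rough sketch of this anonymize-then-aggregate step (the field names and numbers here are hypothetical, not GitPrime’s actual schema), the idea is to replace author names with stable IDs before computing per-engineer averages:

```python
import hashlib
from statistics import mean

# Hypothetical commit records, standing in for rows exported from a sample repo
commits = [
    {"author": "alice", "lines_added": 120, "lines_deleted": 30},
    {"author": "alice", "lines_added": 45,  "lines_deleted": 10},
    {"author": "bob",   "lines_added": 300, "lines_deleted": 5},
]

def anonymize(name):
    """Replace an author name with a short, stable, non-reversible ID."""
    return hashlib.sha256(name.encode()).hexdigest()[:8]

# Group commits by anonymized author
by_author = {}
for c in commits:
    by_author.setdefault(anonymize(c["author"]), []).append(c)

# Per-engineer summary: commit count and average lines added per commit
summary = {
    author: {
        "commits": len(rows),
        "avg_lines_added": mean(r["lines_added"] for r in rows),
    }
    for author, rows in by_author.items()
}
```

The same ID always maps back to the same (unknown) engineer, so work patterns stay comparable across files without exposing who wrote what.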
Working with the data (but without the story), I’ll try to recreate what happened and discover key metrics that help me learn about a particular team. Management questions guide the analysis: Are people rewriting a large percentage of their own code? Is this person writing enough code to keep up with the pace of the project and team?
I use a variety of tools, including Excel, RStudio, and a good old-fashioned whiteboard or pen and paper, to tease out discoveries, ideas, and problems.
At this stage, I may talk with an engineer on one of the teams in our research betas to learn a bit more about the codebase and see how the data we’ve found matches their perception of the project’s timeline. This is where we put any discoveries into context, making sure that our hypotheses about the data map to the boots-on-the-ground view of actual engineering work.
Questioning our assumptions
At each step, I’ll take a step back and speculate why a particular engineer is either rising above or missing expectations. Have they been given an unclear assignment? Are they nearing the end of a project? Am I oversimplifying the question? These questions then get refined into the algorithms that drive our data visualizations and highlight whether a developer is following a trend or falling into some discernible work pattern.
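One minimal way such a trend check might look (a simple rolling average over made-up weekly commit counts, offered only as an illustration; the actual algorithms are more involved):

```python
def rolling_mean(xs, window=3):
    """Average each consecutive window of values to smooth out weekly noise."""
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]

# Hypothetical weekly commit counts for one anonymized developer
weekly_commits = [10, 12, 11, 9, 4, 3, 2]
trend = rolling_mean(weekly_commits)
# A steadily falling average flags a drift off pace, prompting the
# questions above before any conclusion is drawn about the developer.
```

The smoothed series makes a sustained change visible where a single slow week would not be meaningful on its own.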
Often a metric that we’ve focused on measures one aspect of development, such as commit throughput, net productivity, or work focus, and only tells a partial story. There are nuances to be learned and considered. For example, someone who specializes in fixing bugs writes fewer lines per commit, but by looking at how many commits they make, we can see how strong and important that developer really is.
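To make the bug-fixer example concrete (with entirely made-up numbers), lines per commit and commit count pull in opposite directions:

```python
# Made-up six-month totals for two hypothetical developer profiles
devs = {
    "bug_fixer":   {"commits": 40, "total_lines": 800},   # many small fixes
    "feature_dev": {"commits": 10, "total_lines": 4000},  # fewer, larger commits
}

for name, d in devs.items():
    lines_per_commit = d["total_lines"] / d["commits"]
    # By lines per commit alone the bug fixer looks weak (20 vs. 400),
    # but commit throughput shows a steady, important contribution.
    print(f"{name}: {d['commits']} commits, {lines_per_commit:.0f} lines/commit")
```

Neither number is wrong; each just answers a different question, which is why a single metric only tells a partial story.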
The most challenging aspect of working with data about the development process is the large values that skew the entire data set. What makes this even more challenging is that what constitutes an outlier is inherently developer-, project-, and team-specific. Is this huge commit a library import, or did that person actually check in something massive that they wrote? Looking for the story within the data helps sort the signal from the noise as these anomalous values start to make sense, which in turn informs the next question we want to ask of the data.
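One common way to surface such anomalies for review (a generic median-absolute-deviation test, not necessarily the method GitPrime uses) is to flag values far from the median, with a cutoff that has to be tuned per team:

```python
from statistics import median

def flag_outliers(values, k=10):
    """Return values far above/below the median, measured in units of MAD.

    k is a tunable multiplier; as noted above, the right cutoff is
    inherently developer-, project-, and team-specific.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1  # avoid a zero scale
    return [v for v in values if abs(v - med) > k * mad]

# Hypothetical commit sizes in lines; is the huge one a library import?
commit_sizes = [12, 8, 30, 15, 25000, 20, 18]
print(flag_outliers(commit_sizes))  # prints [25000]
```

The flagged commits aren’t automatically discarded; they’re exactly the ones that need a human to ask whether they’re a library import or real, massive work.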
Seeing what sticks
After many notes, worksheets, and files I present my findings to the Product leads to see if the story that I was able to tell helps to solve some of the operational challenges we hear about from our customers.
If there’s a match, we sketch out presentation ideas on a whiteboard or with a Sharpie, aiming for a rough version of an interface that highlights the most critical data. We’ll typically build a low-fidelity version of this in Balsamiq to evaluate flows and transitions and see how it feels.
If the idea is worth pursuing further, we then create high-fidelity final comps in Sketch, often followed by a realistic clickable prototype in InVision. When that’s ready, we socialize it with some of our key customers for feedback. If it passes the user test, we’ve got a winner! The result will be fully specced, including all report functionality, complex math, visual design, and careful consideration of any edge cases that need to be handled.
Maybe 20% of the ideas we explore ever see the light of day. Some are false paths and get discarded. Others seem like partial concepts waiting to be incorporated into a larger breakthrough. A few are moderate enhancements and are added to the backlog. The exceptional ones rise to the top of the queue for an upcoming release. It can be painful to throw away weeks of work, but we know the effort isn’t truly wasted: it’s all part of the process to build the most useful, powerful reporting ever for software teams.
Data Analyst at GitPrime