GitPrime elevates engineering leadership with objective data. In this interview series, engineering leaders talk about how to build high-performing teams.
We talked to Matt Brandt about test engineering and quality assurance (or QA). Matt is a senior test engineer with Mozilla, working primarily on web technologies, with a background in both software development and psychology.
He shared some particularly interesting insights about how to keep the user in mind, how to build a test plan, and when to start phasing QA into an organization.
Adding value through QA
Travis: Let’s start with a key question. What’s the difference between QA and test engineering?
Matt: Truthfully, I think the differences are negligible if not non-existent. In the software world, critically vetting a product takes articulate individuals with a high level of technical ability. A test team adds value to products by using a variety of tools and techniques to assess the risks associated with delivering a product to users.
Our role is to figure out where we can add the most value on a product at a given moment in its development cycle. Sometimes it’s manual testing, other times it’s creating tools that help stress a system in unique ways. The point is we need to always look for how we can add value and find ways to insert ourselves in those areas. Our focus is on narrowing down and helping identify and communicate potential risks to the team.
Travis: How do you figure out the most fruitful ways to add value through QA?
Matt: I think it comes down to how software testers approach the complexity of the project they’re working on and decompose it into manageable areas within their test strategy. It’s important to break apart complex systems into smaller problem sets that allow us to analyze the components and how they contribute to a hierarchy of potential risks.
For example, when I first join a new project, I like to take the time to understand the business and technical domain at several levels. I’ll interview the different product stakeholders to build a picture of what they believe we’re creating and the areas of concern that they are focused on. Developers, business analysts, product managers, etc., all have different focal points and concerns when it comes to what a team is creating and what constitutes success. This is a holistic approach to understanding what the product is and what its success and failure modes are. It helps build a complete picture of what a test team needs to add to the feedback loop that the group consumes. Specifically, this picture identifies risks and informs what we’ll be testing and how we’ll be testing it.
When we talk about mitigating risks, it’s important to understand that the QA process does not ensure that a product never fails. Instead, it ensures that a product fails within tolerable thresholds.
In other words, QA is risk management. We help identify which areas of the software users will tolerate failure in and which areas absolutely cannot fail. We then create and maintain tests that provide an appropriate feedback loop to the team.
When things go wrong
Travis: How do you determine which failures are tolerable and which ones aren’t?
Matt: I like to break a project’s components and features into three broad tiers.
The first tier involves areas that can never go to production broken. These are areas of functionality where, if we ship a defect to customers (whether that’s a new bug, or we just solved a problem wrong to begin with), we have failed at our job of providing a quality product.
The second tier, I believe, covers areas that can go to a production environment broken. The deciding factor here is: what are the SLA considerations, and how do they impact the business as well as the user’s level of trust in a product? Features that fall into this category need to be carefully discussed with the team and balanced against how quickly a team can recover. That is, how quickly can we spin out a new release and get it out to our customers?
The third tier covers areas where it’s alright if the software goes out to production with problems. We’re not concerned about fixing them until a user points the problem out to us. We’ll fix it when we realize we have a problem, and oftentimes most users never realize anything was wrong.
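As a rough illustration of the three tiers described above, here is a minimal Python sketch of how a test plan might encode them. Everything in it is an assumption made for the example: the `Tier` enum, the `Feature` and `CheckResult` types, the `plan_order` and `release_blockers` helpers, and the feature names are hypothetical and are not taken from any Mozilla tooling.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical encoding of the three tiers; lower value = less tolerance for failure."""
    NEVER_BROKEN = 1       # must never reach production broken
    RECOVERABLE = 2        # may ship broken if the team can recover quickly enough
    FIX_WHEN_REPORTED = 3  # fixed only once someone notices the problem


@dataclass(frozen=True)
class Feature:
    name: str
    tier: Tier


@dataclass(frozen=True)
class CheckResult:
    feature: Feature
    passed: bool


def plan_order(features):
    """Order features so the least failure-tolerant areas are tested first."""
    return sorted(features, key=lambda f: (f.tier, f.name))


def release_blockers(results):
    """Return the names of failing features that should block a release (tier 1 only)."""
    return [
        r.feature.name
        for r in results
        if not r.passed and r.feature.tier == Tier.NEVER_BROKEN
    ]


if __name__ == "__main__":
    # Hypothetical features, one per tier.
    features = [
        Feature("tooltip text", Tier.FIX_WHEN_REPORTED),
        Feature("login", Tier.NEVER_BROKEN),
        Feature("search", Tier.RECOVERABLE),
    ]
    print([f.name for f in plan_order(features)])
```

In this sketch only a tier-one failure blocks a release outright; a failing tier-two feature would instead prompt the SLA and recovery-time discussion with the team, which is a judgment call the code does not try to automate.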
Travis: Cool. So how do you approach this three-tiered risk classification? What does it look like in action?
Matt: This three-tier outline is how I build a test plan. Let’s look at an example project, a product that collects Firefox crashes.
When Firefox goes down, it gives the user an option to send in a stack trace, which is really valuable on the QA end. It helps us to look across several hundred million users and find out where people are experiencing the most crashes.
Engineering uses this information to figure out what we need to reproduce to riddle out what’s crashing and why. But we also make business decisions with this information. From a business standpoint, it’s actually acceptable for some parts of the reporting interface to go down. It’s up to QA to push for clarity on the prioritization of the problems, triage them, and keep in mind a broad view of what “quality” means.
In this instance, our top priority is never to lose crashes. We want to collect those stack traces. So we designed the system so that, sure, parts of it can go down, but we can guarantee close to 100% uptime on collection of the raw crashes. We know that we are still collecting crashes and can reprocess them. That kind of response translates to users of our system as quality. It’s important to highlight that the web front-end of this system can go down, or a CSS or JS regression can occur, but we can never lose a raw crash due to downtime.
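The design priority described above (persist the raw crash first, treat everything downstream as recoverable) can be sketched in a few lines of Python. This is an illustrative sketch only, not how Mozilla’s crash collection system is implemented: the `RawCrashStore` class, the `collect` and `reprocess` functions, the on-disk JSON layout, and the payload fields are all assumptions made for the example.

```python
import json
import uuid
from pathlib import Path


class RawCrashStore:
    """Hypothetical store that persists raw crash payloads as JSON files."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, payload):
        """Write the raw payload to disk and return its generated id."""
        crash_id = uuid.uuid4().hex
        (self.root / f"{crash_id}.json").write_text(json.dumps(payload))
        return crash_id

    def all_ids(self):
        """Return the ids of every stored raw crash, sorted."""
        return sorted(p.stem for p in self.root.glob("*.json"))

    def load(self, crash_id):
        """Read one raw payload back from disk."""
        return json.loads((self.root / f"{crash_id}.json").read_text())


def collect(store, payload, processor):
    """Save the raw crash first, then try to process it.

    A failure in the processor must not lose the crash, so the raw
    payload is written before the processor is called.
    """
    crash_id = store.save(payload)
    try:
        processor(payload)
        processed = True
    except Exception:
        processed = False  # the raw crash is still on disk for later reprocessing
    return crash_id, processed


def reprocess(store, processor):
    """Re-run a processor over every stored raw crash."""
    return [processor(store.load(crash_id)) for crash_id in store.all_ids()]
```

The point of the ordering in `collect` is that a tier-one test for this sketch would assert that the raw crash is still retrievable even when the processor raises, while processor and front-end failures fall into the more tolerant tiers.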
Travis: It sounds like the way you think of “quality” is not only about how well a product works, but how positively the user views and interacts with it.
Matt: Yes. Quality is about the user’s perceived experience of a product as much as the hidden parts of the product itself. You can have the finest code on the planet, but if the product doesn’t resonate well with users, it’s not a quality app.
I have a background in software engineering and psychology. In the world of test engineering, you need to have both people skills and technical skills. You need people skills to have empathy for your team and for your customers. You need technical skills to critically vet the solutions that are being created.
After all, we’re all creating a product that we hope will impact people positively. A test engineer needs to add to that team culture, needs to understand complex technical solutions, while also bearing in mind the end user’s experience.
Staffing for QA
Travis: With the importance of QA in mind, I’m curious about the right way to phase it in. A small company with just developers is purely reactive in the beginning. Users are finding problems before you do. At what point do you start thinking about QA in a more proactive sense?
Matt: If you mean having a dedicated test engineer, I think no company is too small for QA. Even a startup. To be more specific, any group that is cobbling together a sufficiently complex piece of software benefits from a dedicated person focused on unearthing questions around quality. Testers are a special type of geek. We excel at helping a team ask the right questions and understand risks as they evolve. We formulate plans to mitigate those concerns. We help by testing, in a reproducible fashion, those assertions that we have a quality product.
At a certain point, a product that’s complex enough needs somebody to create automated tests to check and stress the system. I have yet to be on a project that didn’t benefit from some level of test automation. I find that automation comes as a result of manual exploratory testing or directly as tooling to help do exploratory manual testing.
On the topic of the future of QA, we will likely always automate the repetitive tasks, which opens up room for manual exploratory testing. At this point in time, the more high-empathy testing is done by humans.
Travis: You started off by talking about adding value to a product. So to wrap up, what do you think is the value of having solid QA as part of a development team?
Matt: QA will always help identify questions specific to the quality of the product and help translate those into actionable outcomes for the engineering team. We help advocate for the end user and make sure the team understands their needs. That’s reducing risks. That’s increasing the quality of a product.
QA adds value to a team by taking a holistic look at the entire system, and surfacing the evolving risks that change over time.