What do you think about testing in production? Does the mere sight of those words make your blood pressure rise? Maybe it does. For many developers, just conceiving such a thing is what it takes for their minds to start recalling horror tales from their early careers. Remember that time when, as an intern fresh out of school, you executed that update without a “where” clause? Oops.
If you count yourself among the ranks of those who still see testing in production as an unforgivable sin, we’ll try and convince you it isn’t so. When done correctly and under the right circumstances, testing in production can be the final piece of the puzzle of awesome software.
Testing in Production: Why the Bad Rep?
As mentioned, testing in production has quite a bad reputation. It can come across as unprofessional. People associate the practice with the absence (or even ignorance) of automated tests of any kind, not to mention all sorts of software engineering best practices.
Where does that come from? Why did testing in production acquire such a stigma?
Well, as I’ve said before, some of this bad reputation might be deserved. When done the wrong way, testing in production can be the source of terrible headaches. Data loss is what comes to mind first. But there’s also the loss of revenue due to the unavailability of an application. Imagine Amazon offline for, let’s say, 24 hours. Even worse, human lives could be lost due to a failure in some critical piece of software.
Testing in Production Done Wrong: What Are the Ingredients?
The bad rep that testing in production carries is sometimes justified. When you’re talking about the production environment, the stakes are as high as they can get.
But just saying that isn’t enough. We need to understand what makes the wrong kind of testing in production bad in order to appreciate the niceties of the good kind.
What ingredients can cause testing in production to go wrong?
First on our list is the total absence of any test whatsoever before production. This is one of the factors that many people associate with testing in production. In their minds, testing in production necessarily follows “non-testing in non-production”. That shouldn’t be the case, of course. Your team should employ a solid quality strategy that includes a comprehensive suite of automated tests (especially, but not limited to, unit tests) as well as manual tests where they make sense (e.g., usability testing in the form of manual exploratory tests).
Another ingredient of doomed production tests is the total lack of mechanisms to roll back changes when things go wrong. It’s 2018, folks. It’s about time we started taking backups seriously!
Okay, most people nowadays do back up their data regularly. But do they also practice restoring those backups regularly? If the answer is no, then these people are in for a nasty surprise the next time backups are needed but can’t be used.
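To make the idea of a restore drill concrete, here is a minimal sketch in Python. It uses SQLite purely as a stand-in for whatever database you actually run, and the function names (backup_database, restore_drill, demo) and the orders table are hypothetical, made up for illustration. The point is only that a backup counts as tested once you have opened the copy and checked its contents.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_database(source: sqlite3.Connection, backup_path: Path) -> None:
    """Copy the live database into a backup file (hypothetical helper)."""
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)  # sqlite3's built-in online backup
    finally:
        target.close()


def restore_drill(backup_path: Path, table: str, expected_rows: int) -> bool:
    """Open the backup as if restoring it, then sanity-check its contents."""
    restored = sqlite3.connect(backup_path)
    try:
        integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
        # The table name is assumed to come from trusted configuration.
        rows = restored.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    finally:
        restored.close()
    return integrity == "ok" and rows == expected_rows


def demo() -> bool:
    """Create a tiny database, back it up, and run the drill on the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        live = sqlite3.connect(":memory:")
        try:
            live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
            live.executemany(
                "INSERT INTO orders (total) VALUES (?)",
                [(9.5,), (20.0,), (3.25,)],
            )
            live.commit()
            backup_path = Path(tmp) / "orders.bak"
            backup_database(live, backup_path)
            return restore_drill(backup_path, "orders", expected_rows=3)
        finally:
            live.close()
```

A real drill would restore into a scratch environment on a schedule and compare more than a row count, but even a check this small catches backups that are empty or unreadable.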
Third in this list is the lack of a proper monitoring strategy—or even total lack of monitoring, period. From the most humble logging to more advanced approaches, if your team doesn’t do any monitoring, how are you supposed to understand what went wrong and why?
Last but not least, we should mention performing production tests at inappropriate times. You can’t always afford to deploy your changes and perform the tests at, say, 1 a.m. I doubt that Facebook, for instance, being available 24/7 around the globe, really has any low-traffic hours. And sometimes you can afford a quiet-hours deployment but don’t want one, since there’s value in deploying during rush hour.
Testing in Production: Why Do It?
Up until now, we’ve focused mostly on the bad side of testing in production. The time has come to do a 180 and see the light. Welcome to 2018: testing in production is a thing. You should probably do it. Why?
Often you just have no choice. There are situations where it’s virtually impossible to recreate or simulate a faithful staging environment that’s as close to production as needed. Or maybe it’s possible, but the costs are prohibitive.
Sometimes you just need to gather real usage data. For instance, you may need user feedback on a given change in the UI. Doing hallway usability testing has its value, but it might not be enough. What if what you need is a very large sample of user feedback? In this case, nothing beats the real thing.
Last, but not least: When software crosses a certain complexity threshold, local tests might not be enough. Popular wisdom says “better safe than sorry.” But maybe popular wisdom wasn’t aware of the law of diminishing returns. We can refine prevention techniques all we want, but some bugs will always slip by us. That’s inevitable given the nature of what we do.
Of course, don’t get me wrong. Unit testing is awesome. You should do it and probably more than what you’re currently doing. But don’t fool yourself into thinking it’s going to be the magic solution for all your software problems.
Testing in Production: The Good Ways
Time for a quick summary. Up until now, we’ve covered:
- Reasons why testing in production might have a bad reputation—and why sometimes even rightly so
- The components, or ingredients, that can cause tests in production to go sour
- Some justifications for testing in production and how to make sure it’s done right
Now, we’re going to discuss some of the ways in which testing in production happens in the wild. This won’t be a tutorial or a detailed how-to, because that would go way beyond the scope of this post. Instead, we’re going to offer a brief overview of two real-world approaches to testing in production.
These approaches are not necessarily interchangeable, though. Think of them more as tools to be aware of. When a need arises, you can reach into your toolbelt and get into action.
You’ve probably heard of A/B testing, a statistical experiment performed by splitting a user base into two groups. The first group, A, receives the current version of the website or app, which we call the control. The second group, B, gets a version of the app that is modified in some respect. This version is called the treatment or variation.
By comparing the behaviors from users in both groups, teams can draw conclusions about the consequences of the rolled-in changes and decide what the next steps should be.
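As a rough illustration of the mechanics, here is a minimal Python sketch of the two halves of an A/B test: splitting users into groups and comparing the results. Everything in it is an assumption made for the example, including the function names, the hash-based bucketing scheme, and the choice of a two-proportion z statistic for the comparison. Real experimentation platforms do considerably more.

```python
import hashlib
import math


def assign_group(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically place a user in group "A" (control) or "B" (treatment).

    Hashing the experiment name together with the user ID means the same
    user always lands in the same group for a given experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return "B" if bucket < treatment_share * 10_000 else "A"


def conversion_rate(conversions: int, visitors: int) -> float:
    """Fraction of visitors who converted; 0.0 when there were no visitors."""
    return conversions / visitors if visitors else 0.0


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z statistic for the difference between B's and A's conversion rates."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0
    return (conversion_rate(conv_b, n_b) - conversion_rate(conv_a, n_a)) / std_err
```

A large positive z value suggests the treatment converts better than the control by more than chance alone would explain. Where to draw that line, and how long to run the experiment, is a decision for the team and not something this sketch settles.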
Time for the second item in our list: canary releases. What are those?
A canary release is a method of delivering a new version of software to a subset of the customers and monitoring it closely to see how it behaves. If things go wrong, you can roll it back. Here’s a quote from Netflix’s engineering team that explains it further:
“A canary release is a technique to reduce the risk from deploying a new version of software into production. A new version of software, referred to as the canary, is deployed to a small subset of users alongside the stable running version. Traffic is split between these two versions such that a portion of incoming requests are diverted to the canary. This approach can quickly uncover any problems with the new version without impacting the majority of users.”
In the case of Netflix specifically, they automated virtually the whole process. By using their tool Kayenta (which they open-sourced in partnership with Google), they are able to monitor how the canary version is doing in production and even automatically roll it back when the results fall below certain thresholds they established. There are other cases when, instead of automatically rolling the canary back, Kayenta will notify a human judge, who decides what the next step should be.
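To show the shape of that decision loop, here is a minimal Python sketch. It is not Kayenta’s API or scoring method. The Metrics class, the route and judge functions, and the threshold values are all hypothetical, and the comparison uses a plain difference in error rates as a simple stand-in for whatever a real canary analysis would measure.

```python
import random
from dataclasses import dataclass


@dataclass
class Metrics:
    """Request and error counts observed for one version (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def route(canary_share: float, rng: random.Random) -> str:
    """Send roughly `canary_share` of incoming requests to the canary."""
    return "canary" if rng.random() < canary_share else "stable"


def judge(
    stable: Metrics,
    canary: Metrics,
    fail_above: float = 0.02,     # assumed threshold: automatic rollback
    review_above: float = 0.005,  # assumed threshold: ask a human
    min_requests: int = 500,      # assumed minimum sample before judging
) -> str:
    """Return "wait", "rollback", "review", or "promote" for the canary."""
    if canary.requests < min_requests:
        return "wait"  # too little traffic to draw a conclusion yet
    delta = canary.error_rate - stable.error_rate
    if delta > fail_above:
        return "rollback"
    if delta > review_above:
        return "review"  # borderline result: hand it to a human judge
    return "promote"
```

The "review" outcome mirrors the case described above, where a borderline result goes to a person instead of triggering an automatic rollback.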
Test in Production with a Clear Conscience
In our industry, things are constantly changing—and astonishingly fast. What seems inconceivable right now sometimes becomes acceptable just a few years later. The opposite is also true: Sometimes the status quo will descend into ostracism.
The former is exactly what happened with the practice of testing software in the production environment. What not long ago was considered a sign of amateurism and recklessness is increasingly being recognized as a valuable tool to deliver good software and react quickly to changes in the environment and demands from the customers.
Use the ideas presented in this post as a starting point. Don’t stop reading, practicing and trying out new tools and techniques. That’s your ticket to harnessing the power of testing in production in order to get your team and your projects to the next level.
This is a guest post by Erik Dietrich, founder of DaedTech LLC, programmer, architect, IT management consultant, author, and technologist.