Integrating Integration Tests

This is a guest post by Peter G Walen.

Why is it that so many organizations look at integration testing as a collection of single-threaded, end-to-end tests? There is plenty of evidence that systems are complex entities, and that the people working with them are more complex still. How can we do integration testing better than we do now?

I think many organizations are doing some fundamental things very wrong. Working with teams on testing, I see problems time and again that remind me of projects I was involved in 20 years ago. I see people with interesting solutions to complex testing problems who are excited to have found a potential way to address them. Then someone steps in and decrees that these “radical ideas” are not a “best practice.” Organizational issues aside, I see this manifest itself in testing policies and concepts that should have gone the way of the dodo long ago. Why do I think that? Simple. They did not work.

Let’s look at one example, the end-to-end integration test.


End-to-End Integration Tests

The point of an integration test is to exercise software and see how it interacts with other software. In complex projects, there can be many pieces of software from multiple workflows all interacting. They touch the same general areas, look at the same data and basic information, and act on it.

For many organizations, the “best practice” solution is a series of end-to-end tests for each workflow. This makes a great deal of sense, and I support it. Having said that, I see a huge hole in the approach.

The challenge

Don’t get me wrong. I like end-to-end tests. They can tell us a great deal about the application we’re interested in. The problem I see is that scripted, sterile test exercises do not show how the software interacts with other parts of the system in a realistic manner. They do not test the integrations, the touchpoints.

We can see how the system handles these various threads. We can measure the accuracy of the data and processing speed. We can measure cycles and throughput. We can check the speed and nature of database calls. In some instances, we can even measure the ease of human interaction with the screens people need to work with.

But, if organizations run end-to-end tests one at a time in isolation, it is possible that important conditions will not get exercised. In some environments, it is very likely these conditions will not get exercised.

Is there a solution?

Rather than relying on a collection of single-threaded end-to-end tests to look at the touchpoints, the integration points of the software, treat them as a system. If that seems too hard or too complex, let’s look at how we might break this down to be less complex, starting with the test data.

The Data Problem

I have seen many companies consider “production data” to be the gold standard for what the data looks like. Fine. That seems reasonable. Does your test environment have the capacity to host a full replication of the production database or file systems? Maybe. In the companies where I have worked or provided consulting or contract test services, I have seen a total of one organization with that capacity. One.

Unless your company can do that completely, you will need a reasonable subset of data to work from. The challenge here should be obvious. In 1983, it would have been relatively easy to run a file system extract and create test data environments based on production data. The work would be extensive, but not all that hard.

Today, you might be able to model what you need for each situation. Get in touch with the subject matter experts from the systems your application will interact with. They can help make sure the correct data relationships are built. Of course, the systems that provide the data THEY need will also need to be evaluated to build realistic data relationships. There is one great benefit I have seen with this approach. The data built for testing will have a far broader variety in values, conditions, and variables than a subset grabbed from production.
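To make that concrete, here is a minimal sketch in Python of what building those data relationships might look like. The entities, fields, and values are invented for illustration; the point is that every record comes from a small factory that enforces the relationships the subject matter experts describe, instead of hoping a production extract happens to contain them.

```python
import itertools
import random
from dataclasses import dataclass

# Hypothetical entities -- stand-ins for whatever your SMEs identify.
@dataclass
class Customer:
    customer_id: int
    customer_type: str          # e.g., "retail", "wholesale", "employee"

@dataclass
class Order:
    order_id: int
    customer_id: int            # must reference an existing Customer
    region: str

_ids = itertools.count(1)       # simple unique-key generator

def build_customer(customer_type: str) -> Customer:
    return Customer(next(_ids), customer_type)

def build_order(customer: Customer, region: str) -> Order:
    # The relationship is enforced at build time: every Order
    # points at a Customer that actually exists in the test set.
    return Order(next(_ids), customer.customer_id, region)

# Cover the combinations the SMEs say matter, not a production dump.
customers = [build_customer(t) for t in ("retail", "wholesale", "employee")]
orders = [build_order(random.choice(customers), region)
          for region in ("domestic", "export")]
```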

In a transaction-based sales app, there are often mountains of records to choose from, most of which represent essentially the same scenario or condition. It is the deeper variations, encountered only occasionally, that are likely to result in problems or late-night phone calls. Orthogonal arrays, which help track the combinations of data values needed for each situation, can be of benefit here.
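As a sketch of that idea, the following Python greedily builds a small suite in which every pair of values across any two parameters appears in at least one test case. The parameter names and values are invented for illustration, and this 2-way (pairwise) coverage is the practical approximation of an orthogonal array that most combinatorial testing tools produce.

```python
from itertools import combinations, product

def pairwise_suite(parameters):
    """Greedily pick test cases until every pair of values across any
    two parameters appears in at least one case (2-way coverage)."""
    names = list(parameters)
    param_pairs = list(combinations(range(len(names)), 2))
    # Every (param, value, param, value) pair that must be covered.
    uncovered = {(i, va, j, vb)
                 for i, j in param_pairs
                 for va in parameters[names[i]]
                 for vb in parameters[names[j]]}
    all_cases = list(product(*(parameters[n] for n in names)))
    suite = []
    while uncovered:
        # Choose the combination that covers the most uncovered pairs.
        best = max(all_cases, key=lambda c: sum(
            (i, c[i], j, c[j]) in uncovered for i, j in param_pairs))
        uncovered -= {(i, best[i], j, best[j]) for i, j in param_pairs}
        suite.append(dict(zip(names, best)))
    return suite

# Hypothetical sales-app parameters.
params = {
    "customer_type": ["retail", "wholesale", "employee"],
    "payment": ["card", "cash", "account"],
    "shipping": ["ground", "air"],
}
for case in pairwise_suite(params):
    print(case)
```

Running this typically yields nine or ten cases instead of all 18 full combinations, while still forcing every two-value interaction into the test data.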

It is time-consuming, but once finished, you now have a model for the possible combinations of data values you need for testing, and how to get them.

The People Problem

There are issues, of course, with taking this approach to the data problem. Even when someone is tasked, charged, directed, or voluntold to work on something, if they see little to no gain for themselves it is not likely to be near their top priority, nor to get their best effort.

In a high-functioning shop, people are willing to jump in and help. Sometimes. At least, if the benefit to them and their team outweighs the plain hard work of sorting through the various scenarios.

Then there are the adherents of “We’ll just use production data.” If the earlier discussion about how poorly production data models all the test data you need does not persuade them, there is another argument that might help.

Many organizations and teams still hold that the “last person to touch the code” is the one responsible for correcting any problems in production. I first encountered this in 1982. When I was developing code, even a few years ago, team members were advocating the same approach. This seems perfectly reasonable in some circumstances.

If you test significant changes without careful consideration of the data, and of the myriad permutations and combinations it can take, the odds of encountering situations that were never tested will very likely increase.

What if the problem is uncovered six or eight months down the road? If you were the last person to touch the code that failed, perhaps a small amount of work now, while the scenarios are fresh in your mind and there is a team to help support you, is preferable to trying to rush a fix late some night while the same team is sleeping.

The Flow Problem

This brings us to the central problem. Using end-to-end tests as integration testing is indeed a good idea. To me, however, it is only a first step. Once the end-to-end test (or collection of tests) runs successfully, then, if appropriate to your context, start the processes again. This time, launch them in such a way as to mimic the expected demand on the systems involved.
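As a rough sketch of what that second, concurrent run might look like, here is some Python. The workflow functions and timings are placeholders standing in for real end-to-end drivers; the point is that the flows run concurrently, with jittered arrivals, rather than as one scripted thread at a time.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def order_entry(case_id):
    """Placeholder for a real end-to-end workflow driver."""
    time.sleep(random.uniform(0.1, 0.5))    # simulate the workflow's work
    return f"order_entry[{case_id}] ok"

def billing(case_id):
    time.sleep(random.uniform(0.1, 0.5))
    return f"billing[{case_id}] ok"

WORKFLOWS = [order_entry, billing]

def run_mixed_load(total_cases=20, max_workers=8):
    """Launch overlapping workflow runs with staggered, jittered starts,
    so the touchpoints are hit concurrently rather than in isolation."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = []
        for case_id in range(total_cases):
            futures.append(pool.submit(random.choice(WORKFLOWS), case_id))
            time.sleep(random.uniform(0.0, 0.2))    # mimic arrival spacing
        for future in as_completed(futures):
            print(future.result())

run_mixed_load()
```

In a real environment you would swap the placeholders for your actual workflow drivers and tune the arrival spacing to match the demand you expect in production.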

The more “life-like” you can make integration testing, the better the information you will have on the behavior of the software. It is important to remember that it is not only the “changed” or “new” software that needs to be tested; the existing software could also be impacted by the changes.

I have seen many instances where unexpected problems appeared in systems or areas that were unchanged or untouched. I have also seen many instances where this approach discovered those problems before they occurred in production.

If you use modeled data, selected and designed to mimic likely and possible data combinations, in conjunction with activity as close to the “real world” as you can get in a test environment, you will have a far better understanding of the behavior of the system.

With this understanding, you can make a more informed report to the project stakeholders on what was exercised, what vulnerabilities were discovered, and how well the software system suits its purpose.


Peter G. Walen has over 25 years of experience in software development, testing, and agile practices. He works hard to help teams understand how their software works and interacts with other software and the people using it. He is a member of the Agile Alliance, the Scrum Alliance, and the American Society for Quality (ASQ), an active participant in software meetups, and a frequent conference speaker.
