I worked at a start-up for three years (from 2013 through to 2016) building software as a remote developer. Working with the team from the comfort of my home, we were able to frequently ship production software to a Fortune 200 customer.
Our organizational structure was about as flat as it can get. We had business-orientated people within the team; their roles were to secure customers, understand the behaviors required from software, and translate that to us, the development team. The developers’ job was to build, deliver, and support software to meet those behavioral requirements. On any given day, a developer might be responsible for a large variety of tasks: helping the team clarify the software requirements, writing code, designing and executing tests, automating the deployment of the software, managing the AWS server instances, fixing problems, helping the customer understand how to interact with our software, designing user interfaces, and so on… pretty much anything and everything technical. Today this is often referred to as DevOps.
We had no managers, and it worked well; we were a self-organizing team in the true sense of the term. The software we produced is still in production, handling millions of events per day, and it has been very stable.
How We Worked
After a few hiccoughs pinning down a technology stack, we settled on the Clojure programming language, even though most of us didn’t know it.
Every day, we pair-programmed with each other remotely, which was part and parcel of accepting the gig. I firmly believe we wouldn’t have survived as a distributed company without the pairing. It brought us together as a team, kept us from going stir-crazy, and minimized the distraction of people and shiny things in our homes.
More importantly, the pairing helped us learn the new technologies much faster. It helped me learn Emacs, a painful, finger-twisting-but-powerful programmer’s editing tool. It helped me learn Clojure at the same time as a dozen other new technologies. In fact, were it not for pairing, I would not have survived the trial-by-fire that was our large suite of Brand New Technologies (well, brand-new to me at least). Success!
TDD and Clojure
The pairing also helped us do the right programming things, such as making good design choices and adhering to good practices, most of the time. For better or worse, TDD (test-driven development) was our primary means of controlling the code. Most of us felt that TDD was the right thing: we wanted to produce a high-quality codebase, and we wanted to keep being able to change it without worrying about breaking the things that were already working.
Thankfully, it turns out that Clojure is easy to test-drive. Because Clojure is a functional language, it encourages you to avoid side effects. That makes test-driving easier: in most cases it’s a simple question of calling a function with data structured in a certain way, and seeing what data gets returned from the other end.
We also wanted to know just what some seemingly obscure Clojure incantation was doing. Clojure can be a “tight” language, allowing you to transform data in a couple of handfuls of words (and an even number of parentheses). The novices among us would look at the code and struggle to imagine what was going on. The tests helped our imagination by documenting both the shape of the data coming into the function and the changed shape of the data coming out of it. Even after I got reasonably good at reading and understanding Clojure code, having the tests to document existing behaviors and choices in the system remained extremely valuable.
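To make the idea concrete, here is a minimal sketch of a test that documents the shape of the data going into and coming out of a pure function. It is written in Python rather than Clojure purely for illustration, and the function name `count_events_by_type` and the event fields are hypothetical, not taken from our codebase.

```python
from collections import Counter


def count_events_by_type(events):
    """Pure function: builds and returns a new dict, leaving its input untouched."""
    return dict(Counter(event["type"] for event in events))


def test_counts_events_by_type():
    # Input shape: a list of event maps, each with a "type" key.
    events = [
        {"type": "click", "user": 1},
        {"type": "view", "user": 2},
        {"type": "click", "user": 3},
    ]
    # Output shape: a map from event type to the number of occurrences.
    assert count_events_by_type(events) == {"click": 2, "view": 1}


test_counts_events_by_type()
```

A reader who has never seen the function body can still tell from the test alone what goes in and what comes out, which is the documentation value described above.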
My memory might be faulty, but I don’t recall personally shipping any unit-level defects during that time. We didn’t make many dumb logic mistakes, the bane of most typical systems: things like off-by-one errors, confusing and incorrect conditionals, and code that’s just flat out wrong.
That’s not to say we didn’t create defects; we had more than enough. But these were defects of a different class. Some of them were the result of misunderstandings between the business and the development team. Sometimes we forgot to implement some necessary elements. Some were dumb integration defects. For example: One service returned its data using camel-case-coded JSON names, while the other was expecting snake case. Sigh.
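A small contract-style test at the boundary between two services is one way to catch a naming mismatch like that before it ships. The sketch below is in Python for illustration; the payload, the field names, and the helper functions are all hypothetical, and the converter deliberately handles only simple camel-case names (it would split an acronym such as `userID` letter by letter).

```python
import re


def camel_to_snake(name):
    # Insert an underscore before every capital letter except one at the
    # very start of the name, then lowercase the result.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def snake_case_keys(payload):
    """Return a copy of payload with its top-level keys converted to snake case."""
    return {camel_to_snake(key): value for key, value in payload.items()}


# Hypothetical example: what one service emits vs. what the other expects.
producer_payload = {"eventId": 42, "userName": "pat"}
consumer_expected_keys = {"event_id", "user_name"}


def test_producer_keys_match_consumer_contract():
    # The raw payload does not satisfy the consumer's expectations...
    assert set(producer_payload) != consumer_expected_keys
    # ...but it does once the keys are converted at the boundary.
    assert set(snake_case_keys(producer_payload)) == consumer_expected_keys


test_producer_keys_match_consumer_contract()
```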
Some defects were the result of not pinning down exactly how the system should behave from end to end. The event processing system contained a good number of steps for any given flow. We would ship a release, and find out a few days later that things that should have triggered had not. We had configured application alerts using New Relic, but it took a while to tweak those to the point where they alerted us at the right times.
Major Lesson Learned
If someone tells you that “we have enough volume to compensate for small errors,” listen for only so long. While there’s some validity in that stance, don’t let it become an excuse to ship code with major defects that you can’t verify until it’s in production. We spent days in several cases cleaning up the mess created by millions of errant events.
I’ll share some of the blame; I pushed for some end-to-end “acceptance” tests, but probably not hard enough. We were able to create several tests for a couple of subsystems. These tests helped the business and the technical staff agree on how the features should work, and helped point out a couple problem spots. But the tests were flaky, due to the asynchronous nature of the event processing.
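One common way to reduce that kind of flakiness is to poll for the expected outcome until a deadline, instead of sleeping for a fixed interval and hoping the asynchronous work has finished. The sketch below shows the general technique in Python; it is not what we actually built, and `wait_until`, `process_later`, and the in-memory `processed` list are hypothetical stand-ins for a real event pipeline.

```python
import threading
import time


def wait_until(condition, timeout=2.0, interval=0.01):
    """Poll condition() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One last check, in case the condition became true during the final sleep.
    return condition()


# Hypothetical stand-in for an asynchronous event processor.
processed = []


def process_later(event, delay=0.05):
    # Record the event on another thread after a short delay.
    timer = threading.Timer(delay, processed.append, args=[event])
    timer.start()
    return timer


def test_event_is_eventually_processed():
    process_later({"id": 1})
    # Passes as soon as the event shows up; fails only after the full timeout.
    assert wait_until(lambda: {"id": 1} in processed)


test_event_is_eventually_processed()
```

The trade-off is that a genuinely failing test now takes the full timeout to fail, so the timeout should be generous enough for a loaded build server but not so long that failures stall the suite.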
I believe that if we had pressed for more acceptance tests, we might have saved ourselves from some of the larger messes we had to clean up. TDD is a great tool, but I view it as just that: one tool in a toolbox. TDD can help you eliminate an entire class of defects: logic defects.
We must also consider other classes of problems, and embrace the kinds of testing that will help prevent them:
- Scaling issues (things that arise only with very large volume): generative tests / Monte Carlo testing.
- End-to-end issues (delivered functionality that works differently than expected): Behavior-Driven Development.
- Configuration and integration issues (things that break when we integrate them with other things): Behavior-Driven Development, integration tests.
- Bizarre stuff that no one ever thought could happen (things that you would never think to automate): exploratory (manual) tests. You might consider the primary goal of test automation to be freeing up more time for exploratory testing.
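As an illustration of the first item in the list above, a generative test throws many randomly generated inputs at a function and checks properties that must hold for all of them, rather than checking a handful of hand-picked examples. The sketch below uses only Python's standard library; the `dedupe_events` function and its properties are hypothetical. Dedicated libraries (test.check for Clojure, Hypothesis for Python) offer much more, such as shrinking a failing input to a minimal case.

```python
import random


def dedupe_events(events):
    """Keep only the first event seen for each id, preserving order."""
    seen = set()
    unique = []
    for event in events:
        if event["id"] not in seen:
            seen.add(event["id"])
            unique.append(event)
    return unique


def random_events(rng, max_size=50):
    # A small id range makes duplicate ids likely.
    size = rng.randint(0, max_size)
    return [{"id": rng.randint(0, 10), "payload": rng.random()} for _ in range(size)]


def check_dedupe_properties(runs=500, seed=1234):
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    for _ in range(runs):
        events = random_events(rng)
        result = dedupe_events(events)
        ids = [event["id"] for event in result]
        assert len(ids) == len(set(ids))                      # no duplicate ids remain
        assert set(ids) == {event["id"] for event in events}  # no ids were lost
        assert dedupe_events(result) == result                # deduping twice changes nothing
    return runs


check_dedupe_properties()
```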
While I found that TDD can help produce successful software, a successful testing strategy demands a balanced approach. Use the testing pyramid as a guide on where to spend your testing dollars. Make sure you’ve built a pyramid with enough stories in it!
Article written by Jeff Langr. Jeff has spent more than half a 35-year career successfully building and delivering software using agile methods and techniques. He’s also helped countless other development teams do the same by coaching and training through his company, Langr Software Solutions, Inc.
In addition to being a contributor to Uncle Bob’s book Clean Code, Jeff is the author of five books on software development:
- Modern C++ Programming With Test-Driven Development
- Pragmatic Unit Testing
- Agile in a Flash (with Tim Ottinger)
- Agile Java
- Essential Java Style
He is also on the technical advisory board for the Pragmatic Bookshelf.
Jeff resides in Colorado Springs, Colorado, US.