When we were just getting started on Ada, we didn't write any tests. I came from an agency background where we concentrated on the first release of the product. As far as I remember, our clients didn't want us to "waste time" writing tests. The codebase that became today's Ada was the third full rewrite of the product, so we were still in a mode of writing code quickly, testing it out with customers, and then iterating or throwing it away. Given how likely we were to throw the code away, there was no point in writing tests.
But in hindsight, there was a moment when we knew we were onto something and should have switched modes from "throwaway code" to "production code". By the time we onboarded our third engineer, we should have had established patterns for how to write tests and how to write code in our codebase. Instead, I had to figure this out on the fly while onboarding engineers and doing my own product development tasks.
Knowing nothing about testing code, I learned as much as I could, and we decided to follow test-driven development (TDD). It was really hard. We were used to working quickly and experimentally, but TDD required us to understand the solution before we could write the tests. It didn't fit our workflow, so we quickly abandoned it. The problem was that the resources on TDD we followed encouraged writing unit tests. Not knowing any better, we kept writing unit tests even after we abandoned TDD.
The result was thousands of little tests that tested the contracts between the parts of our system, but not the behaviour of the system as a user would experience it. This made changing the system hard (changes would often break the tests) and made the test suite take a very long time to run.
The other mistake we made was writing tests that expected a production-like database to be available. Each test would read, write, and delete data. So, we were testing some business logic, but we were also testing the database. Dumb. Developers who didn't know this could see really confusing side effects in their tests' query results, because we only reset the database at the end of each full suite run to save time.
Now, we're a much more mature team and I've learned a lot about testing. Our test suite still has some of those unit tests kicking around, but we've since added lots of integration tests that test the behaviour of the system as a user experiences it. These give us much more confidence in the system and rarely need to change even as the system underneath is significantly refactored.
Learnings
- TDD didn't work for us and still wouldn't work for us. I'm not sure when it's the right methodology.
- The most valuable tests we have are integration tests that test the behaviour of the system as a user experiences it. These tests rarely have to be changed even as the system underneath is refactored (there's a sketch of what this looks like after this list). This principle applies to good LLM evals too.
- You might not need as many tests as you think. I'd trade 1,000 unit tests for 100 integration tests any day.
- Use mocking and dependency injection to remove dependencies on external services like databases and APIs (a sketch of this follows the list too).
- Use unit tests sparingly to test functions that have thorny logic.
- Keep your test suite fast to run, especially in CI.
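To make the integration-test point concrete, here's a minimal sketch of the kind of test I mean: it drives the system through its public HTTP surface the way a user (or another service) would, rather than poking at internal functions. The framework, endpoints, and payloads here are hypothetical placeholders, not Ada's actual API.

```python
# Integration-style test: exercise the public API the way a user would.
# All names (create_app, /conversations, the reply format) are made up.
from fastapi import FastAPI
from fastapi.testclient import TestClient


def create_app() -> FastAPI:
    app = FastAPI()
    conversations: dict[int, list[str]] = {}

    @app.post("/conversations", status_code=201)
    def start_conversation():
        conversation_id = len(conversations) + 1
        conversations[conversation_id] = []
        return {"id": conversation_id}

    @app.post("/conversations/{conversation_id}/messages")
    def send_message(conversation_id: int, body: dict):
        conversations[conversation_id].append(body["text"])
        return {"reply": f"Echo: {body['text']}"}

    return app


def test_user_can_start_a_conversation_and_get_a_reply():
    client = TestClient(create_app())

    created = client.post("/conversations")
    assert created.status_code == 201

    conversation_id = created.json()["id"]
    reply = client.post(
        f"/conversations/{conversation_id}/messages",
        json={"text": "hello"},
    )
    assert reply.status_code == 200
    assert "hello" in reply.json()["reply"]
```

Because the test only knows about the request and response, you can rewrite everything behind the endpoint without touching it.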
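And a sketch of the dependency-injection point: the business rule depends on a small repository interface, so the test can pass an in-memory fake instead of a production-like database. The domain (order totals, VIP customers) and all the names are invented for illustration.

```python
# Dependency injection sketch: business logic depends on an interface,
# not a concrete database client, so tests can substitute a fake.
from typing import Protocol


class OrderRepository(Protocol):
    def get_order_totals(self, customer_id: str) -> list[float]: ...


def is_vip_customer(customer_id: str, orders: OrderRepository) -> bool:
    """Business rule under test: VIP means lifetime spend over 1,000."""
    return sum(orders.get_order_totals(customer_id)) > 1_000


class InMemoryOrderRepository:
    """Test double standing in for the real database-backed repository."""

    def __init__(self, totals_by_customer: dict[str, list[float]]):
        self._totals = totals_by_customer

    def get_order_totals(self, customer_id: str) -> list[float]:
        return self._totals.get(customer_id, [])


def test_vip_threshold():
    repo = InMemoryOrderRepository({"alice": [600.0, 450.0], "bob": [10.0]})
    assert is_vip_customer("alice", repo) is True
    assert is_vip_customer("bob", repo) is False
```

No database, no cleanup between tests, and the suite stays fast.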