yeah, that one stood out to me too. i would be tempted to make the stronger statement that it is always strictly better to have your private code unit tested, and that it is only the limitations of languages and test frameworks that make people reach for ways to test it through the public API.
Allow me to even more strongly disagree. I've not yet seen a situation where strongly-tested private code would have helped, although I routinely come across cases where too many tests get directly in the way of a refactor or bugfix that _would_ improve the program.
These days, I'm of the opinion that you should test at the highest level possible (if testing a web service, for example, at the level of the actual public API to the service), thoroughly define the behaviours expected for those APIs, and leave the insides of the service entirely untested, unless you have _very_ good reason to. (Critical code, payment code, hard-to-test code, etc.) This ensures you have the important behaviours tested, but leaves you free to modify implementation details as necessary without needing to alter or rewrite entire swathes of test code that suddenly starts failing because you decided to break up or consolidate an inner method.
contrariwise, i have never come across a case where having tests for private functions/methods has hindered refactoring or bugfixing. that feels more like undue coupling than undue testing to me. if i have to alter or delete swaths of code i just delete the associated tests as well, and write new tests for the new implementation.
perhaps relatedly, i like to build my code from the bottom up, and make sure each layer is solid so that i can use it to construct the next layer. there is often not a public api at all until i am well into the project, so for me that would involve writing a fair bit of code with no tests at all, simply because it was all private code.
i do use the "public methods on private helper objects" pattern a lot, but there is a fair mix of times where it's the best way to write the code and times where it's just to keep the test framework happy.
Too many unit tests can slow down refactoring. But testing internals can still be worthwhile.
Eg: I’ve done some deep algorithmic work on CRDTs lately and my code has a lot of internal parts which are all quite fiddly to implement correctly. For example, I’m using a custom btree with some unusual features as an internal data structure. Btrees are famously difficult to implement correctly, so my btree has both unit tests and fuzz tests to make sure it does what I expect. Having that test suite makes integration tests easier to triage, since I know any failures in my integration tests probably don’t come from the btree. And some btree code paths are probably rarely or never executed by my integration tests. But I still want that code to be correct. Testing it in isolation is the best way.
There are a lot of other small pieces that I test like this - like saving & loading code, my graph traversal utility code (for comparing versions semantically) and so on. Low-level unit testing can find bugs in utility modules while you write the module itself, not later (when you start using the module in some other code).
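The shape of those tests is roughly the following (sketched here in Java with a made-up BTree API; my real code differs, but the idea of random operations checked against a trusted reference is the same):

    import java.util.Random;
    import java.util.TreeMap;
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    class BTreeFuzzTest {
        @Test
        void randomOpsMatchReference() {
            Random rng = new Random(42);                 // fixed seed so failures reproduce
            BTree<Integer, String> tree = new BTree<>(); // hypothetical structure under test
            TreeMap<Integer, String> oracle = new TreeMap<>(); // trusted reference

            for (int i = 0; i < 100_000; i++) {
                // A small key space forces lots of collisions, node splits and merges.
                int key = rng.nextInt(1_000);
                if (rng.nextBoolean()) {
                    tree.insert(key, "v" + i);
                    oracle.put(key, "v" + i);
                } else {
                    tree.delete(key);
                    oracle.remove(key);
                }
                assertEquals(oracle.get(key), tree.get(key));
            }
        }
    }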
Agree. The core downside to this approach is pinpointing failure points: a large integration test failing can have many causes. Technically, detailed unit tests could pinpoint these immediately. In reality, though, having such a tight, well-defined, correct web of unit tests that any failure can be immediately traced to an individual method is unlikely, and an immense amount of work.
An isolated failing integration test implies a better unit test could exist somewhere. But as you point out, you still probably came out ahead by not writing _all_ the possible unit tests you could have.
I go even further - for browser-level tests, avoid using “testid” or css classes or anything that is “implementation”; rely in your test solely on things that the user can read / interact with.
So don’t say “press the button with id ‘generate’”; say “press the button that says ‘Generate’ inside the content element titled ‘Generation’”.
This way, refactoring work requires no test changes (as it should), while any change to the ui / workflow visible to the user does require adjusting the test.
This is a style that I learned from ruby’s integration testing framework “capybara”, and I have been replicating it wherever I can since.
A nice bonus is that if you switch rendering technologies (to react native, for example), you can reuse the tests.
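In Selenium terms (Java here; capybara expresses the same idea more tersely) the difference looks roughly like this, with the page structure and names made up:

    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.chrome.ChromeDriver;

    class GenerationFlow {
        public static void main(String[] args) {
            WebDriver driver = new ChromeDriver();
            driver.get("https://example.test/app"); // made-up app URL

            // Brittle: coupled to an implementation detail (the element id).
            // driver.findElement(By.id("generate")).click();

            // Robust: coupled only to what the user sees -- the button labelled
            // "Generate" inside the section titled "Generation".
            driver.findElement(By.xpath(
                "//section[.//h2[normalize-space()='Generation']]" +
                "//button[normalize-space()='Generate']")).click();

            driver.quit();
        }
    }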
But on the other hand: if it's worthwhile testing your private methods, shouldn't they maybe be refactored to a public/package-private scope, so they can be reused?
If you disagree, maybe an illustrative example would help. I couldn't think of a case where a private method is worth testing in detail but not worth exposing.
A common case for this is very simple APIs with very complex internals that e.g. dynamically switch behaviors or algorithms in certain contexts. The only way to test all the major code paths through the simple API is to hardcode implementation detail into your unit tests to induce all the various switching contexts indirectly. It makes unit tests quite brittle. At the same time, these internals are definitely not public APIs and unusable as such.
> The only way to test all the major code paths through the simple API is to hardcode implementation detail into your unit tests to induce all the various switching contexts indirectly.
Is the "only way" not property-based testing (and maybe fuzz testing)? If that doesn't get you there, it is likely that the API is poorly designed.
You are assuming the function being tested is more trivial than it actually is. Often these are too small (code-wise) to even have private APIs; the internal behavior of the function is simply and necessarily too complex to unit test in the traditional way.
Some function behavior is intentionally and intrinsically tied to temporal access patterns in the API usage. Unless you can simulate a broad cross-section of real-world runtime API access timing patterns with your testing framework, you won't test all of the code paths. You often see this test problem with scheduler-like functions, where the correct function behavior varies based on internal resource pressures that are an interaction between temporal access patterns of the API and the runtime environment. It is a single function, self-contained, and quite simple, maybe not more than 100 LoC, but test environments are so sterile that usually only a single code path is actually used no matter what you throw at the API.
Some functions have critical code paths that dynamically switch strategies to mitigate when certain types of contention or resource starvation conditions are detected internally. These can be extremely rare cases across the set of possible inputs, such that fuzzing is unlikely to trigger them, or the set of inputs that can trigger them is dependent on exogenous environmental details e.g. the machine where the test is run.
As a fun side-effect, sometimes these functions do not have a deterministic result. Getting the same result out of the function every time does not imply correctness; you also have to know why you got the result you did.
All of the above does not apply to writing business logic in Java or similar. But for high-performance and/or high-reliability systems software, these cases come up often enough that it is a well-understood testing problem. Even if you expose all of the internal implementation detail to the unit tests, it is not always possible to reliably trigger all the code paths from an arbitrary test absent purpose-built test tooling.
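A toy version of the pattern in Java, with invented names but the typical structure: the mitigation path only exists under contention, which a sterile single-threaded test will never produce.

    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.locks.ReentrantLock;

    class Sampler {
        private final ReentrantLock lock = new ReentrantLock();
        private long total;                       // precise state, guarded by the lock
        private final AtomicLong dropped = new AtomicLong();

        void record(long sample) {
            if (lock.tryLock()) {
                // Fast path: uncontended, update the precise state.
                try { total += sample; } finally { lock.unlock(); }
            } else {
                // Mitigation path: under contention, degrade gracefully rather
                // than block. This branch also has to be correct, but nothing a
                // single-threaded test throws at the API can make tryLock()
                // fail, so it goes unexecuted without purpose-built tooling.
                dropped.incrementAndGet();
            }
        }
    }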
This is always the theory: "well, pull that out into a separate class with a public API". But then those classes end up only being used in one place (the class they were extracted from), with a totally unnecessary increase in public API surface area. Both the extra verbosity and the pointless increase in API surface area are worse than just writing unit tests against private functions, IMO.
here's an example - i'm currently working through porting the blossom algorithm [http://jorisvr.nl/article/maximum-matching] to elixir. the algorithm is complex and has a lot of subparts, but the public interface to it is extremely simple. i feel like it's not just useful but almost essential to test a lot of the little parts in isolation so i can make sure they are working correctly.
i am currently marking all functions as public for simplicity, but i feel like it's a failing of elixir and/or its test framework that i cannot say "private, but an associated test module should be able to see them", and once the code is done i will be exploring some of the third-party solutions people have come up with to hack around the issue. (the other option, of course, is to have a new module that just contains a couple of public functions, and regard the entire implementation module as a private module with public functions)
Go comes with this kind of “public” and “private” separation: only tests in the same package can access private (unexported) functions. It also serves to differentiate the tests that document the API (which the API must conform to forevermore) from the ones that just helped during development and can be considered throwaway.
I'm not sure why all testing frameworks don't have this.
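Java gets partway there with package-private scope, for what it's worth: a test class in the same package can reach package-private members, while a test in another package sees only the public API. A sketch with made-up names:

    // src/main/java/ring/Ring.java
    package ring;

    public class Ring {
        public void push(int x) { /* ... */ }
        void rebalance() { /* package-private: invisible outside the package */ }
    }

    // src/test/java/ring/RingInternalsTest.java -- same package: development
    // tests that may poke at package-private helpers, and are fair game to
    // throw away when the implementation changes.
    package ring;

    import org.junit.jupiter.api.Test;

    class RingInternalsTest {
        @Test void rebalanceRuns() { new Ring().rebalance(); }
    }

    // src/test/java/api/RingApiTest.java -- different package: can only touch
    // the public API, so these tests document the lasting contract.
    package api;

    import org.junit.jupiter.api.Test;

    class RingApiTest {
        @Test void pushAccepts() { new ring.Ring().push(1); }
    }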
A lot of the time _private_ methods on one class end up making more sense as public methods on a helper class of some kind (or public methods on other classes).
One issue that leads to private methods in Java is the lack of any way to extend the set of operations on built-in classes: a problem that Kotlin and C# solve with extension methods.
Extension methods are basically just static methods in a helper class that take the to-be-extended class as their first argument, plus a bit of syntactic sugar on top. That is absolutely imitable in Java.
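Concretely, the underlying shape is available in plain Java, minus the call-site sugar. A sketch; StringExt and capitalize are invented for illustration:

    // Kotlin:  fun String.capitalize(): String = ...   called as "abc".capitalize()
    // Under the hood that compiles to a static method whose first argument is
    // the receiver, which plain Java can write directly:

    final class StringExt {
        private StringExt() {}

        static String capitalize(String self) { // 'self' plays the receiver role
            return self.isEmpty()
                ? self
                : Character.toUpperCase(self.charAt(0)) + self.substring(1);
        }
    }

    // Java call site: StringExt.capitalize("hello")  ->  "Hello"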