Be Suspicious of Success
Successful software is buggy software.
From Leslie Lamport’s Specifying Systems:
You should be suspicious if [the model checker] does not find a violation of a liveness property… you should also be suspicious if [it] finds no errors when checking safety properties.
This is specifically in the context of model-checking a formal specification, but it’s a widely applicable software principle. It’s not enough for a program to work, it has to work for the right reasons. Code working for the wrong reasons is code that’s going to break when you least expect it. And since “correct for right reasons” is a much narrower target than “correct for any possible reason”, we can’t assume our first success is actually our intended success.
Hence, BSOS: Be Suspicious of Success.
Some useful BSOS practices
The standard way of dealing with BSOS is verification. Tests, static checks, model checking, etc. We get more confident in our code if our verifications succeed. But then we also have to be suspicious of that success, too! How do I know whether my tests are passing because they’re properly testing correct code or because they’re failing to test incorrect code?
This is why test-driven development gurus tell people to write a failing test first. Then at least we know the tests are doing something (even if they still might not be testing what they want).
The other limit of verification is that it can’t tell us why something succeeds. Mainstream verification methods are good at explaining why things fail— expected vs actual test output, type mismatches, specification error traces. Success isn’t as “information-rich” as failure. How do you distinguish a faithful implementation of is_collatz_counterexample
from return false
?
A broader technique I follow is make it work, make it break. If code is working for the right reasons, I should be able to predict how to break it. This can be either a change in the runtime (this will livelock if we 10x the number of connections), or a change to the code itself (commenting out this line will cause property X to fail). 1 If the code still works even after the change, my model of the code is wrong and it was succeeding for the wrong reasons.
Happy and Sad Paths
A related topic (possibly subset?) is “happy and sad paths”. The happy path of your code is the behavior when everything’s going right: correct inputs, preconditions are satisfied, the data sources are present, etc. The sad path is all of the code that handles things going wrong. Retry mechanisms, insufficient user authority, database constraint violation, etc. In most software, the code supporting the sad paths dwarfs the code in the happy path.
BSOS says that I can’t just show code works in the happy path, I also need to check it works in the sad path.
BSOS also says that I have to be suspicious when the sad path works properly, too.
Say I add a retry mechanism to my code to handle the failure mode of timeouts. I test the code and it works. Did the retry code actually run? Did it run regardless of the original response? Is it really doing exponential backoff? Will stop after the maximum retry limit? Is the sad path code after the maximum retry limit working properly?
One paper found that 35% of catastrophic distributed system failures were caused by “trivial mistakes in error handlers” (pg 9). These were in mature, battle-hardened programs. Be suspicious of success. Be more suspicious of sad path success.
Blog Rec
This week’s blog rec is Red Blob Games While primarily about computer game programming, the meat of the content is beautiful, interactive guides to general CS algorithms. Some highlights:
- Introduction to the A* Algorithm was really illuminating when I was a baby programmer.
- I’m sure this overview of noise functions will be useful to me someday. Maybe for test data generation?
- If you’re also an explainer type he has a lot of great stuff on his process and his little tricks to make things more understandable.
(I don’t think his rss feed covers new interactive articles, only the blog specifically.)
If you’re reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers*, is now in early access! Get it here.*