> The reliability estimate does account for the bug. The bug did not cause a critical operational failure.
Yes, it did cause an operational failure! An airplane turning itself in the wrong direction is an outcome, and an extremely serious one.
There was a bug that put people at risk, and you are saying that just because a human caught it and it didn’t crash the plane, it doesn’t count against reliability?! You’ve just rationalized ignoring every bug that doesn’t cause a fatal crash when estimating software reliability. That makes your point weaker, not stronger: you’re arguing that software reliability should be measured only by fatalities. If you really want to go that way, one might conclude that “normal software” like AWS is infinitely more reliable than aviation software, because it has never killed anyone. By discounting every bug that doesn’t lead to a plane crash, you are undermining your own claim that aviation software is “250,000x” more reliable than other kinds of software.
This kind of analysis, the insistence that reliability must be high because deaths have been rare, has played a major role in several high-profile accidents. In the shuttle disaster, for one, the investigation specifically called out that management’s reliability estimates were wildly exaggerated. The Therac-25 is another case where engineers failed for a long time to understand what was happening, in part because of vastly exaggerated claims about the system’s reliability.
No, uptime still makes zero sense to compare; it is a nonsense metric in this context. Uptime measures continuous operation, and planes aren’t in continuous operation. Simple as that. It’s a metric that does not apply to aircraft, no matter how you spin it.
There are multiple cases of major software failures in military and aviation systems caused by being left in continuous operation for too long. There was a thread just the other day about an airline’s safety procedures requiring, in writing, a reboot every 30 days because of known bugs.
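To be clear about the mechanism, here’s a toy sketch of the kind of bug that bites long-running systems: a fixed-width uptime counter that wraps. The 32-bit millisecond counter and the elapsed-time arithmetic below are my own illustrative assumptions, not details of any real avionics system or of the specific bug from that thread.

```python
# Toy illustration only: a 32-bit millisecond uptime counter silently wraps
# after ~49.7 days of continuous operation, so naive "now - start" arithmetic
# done across the wrap goes badly wrong.

UINT32_MAX = 2**32 - 1

def tick_ms(counter: int) -> int:
    """Advance a 32-bit millisecond counter by one tick, wrapping on overflow."""
    return (counter + 1) & UINT32_MAX

def naive_elapsed(start: int, now: int) -> int:
    """Buggy elapsed-time calculation that ignores wraparound."""
    return now - start

days_to_wrap = UINT32_MAX / (1000 * 60 * 60 * 24)
print(f"counter wraps after ~{days_to_wrap:.1f} days of continuous uptime")

start = UINT32_MAX - 5                        # a few milliseconds before the wrap
a_little_later = tick_ms(tick_ms(tick_ms(start)))
after_wrap = 10                               # counter has wrapped back near zero
print(naive_elapsed(start, a_little_later))   # 3, as expected
print(naive_elapsed(start, after_wrap))       # huge negative number: the failure mode
```

A bug like that never shows up if the system is rebooted more often than the wrap period, which is exactly why “just reboot it every N days” ends up written into procedures.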
And you’re ignoring that the 737 MAX did not suffer an operational failure of the system. The system didn’t go down; it kept working. If the system had gone down, those people might have survived. The crash happened precisely because the buggy system kept working. If you want to count the system’s downtime, you ought to count all the flight hours the grounded planes would have flown since the crashes, rather than the bogus ratio of the minutes the planes spent falling to total flight hours. Again, that ratio is completely and utterly meaningless as a proxy for software reliability.
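Back-of-the-envelope, just to show how arbitrary that ratio is. Every number below is invented for illustration; none are sourced figures.

```python
# Rough arithmetic showing that the "reliability ratio" depends almost entirely
# on what you decide to count as downtime. All figures are made up.

fleet_flight_hours = 10_000_000          # assumed total fleet flight hours

# Definition A: "downtime" = only the minutes the two aircraft spent falling.
fall_time_hours = 2 * (10 / 60)          # assume ~10 minutes per accident

# Definition B: "downtime" = flight hours the grounded fleet would have flown.
grounded_aircraft = 380                  # assumed fleet size at grounding
grounding_days = 600                     # assumed length of the grounding
hours_per_day = 8                        # assumed daily utilization per aircraft
lost_hours = grounded_aircraft * grounding_days * hours_per_day

print(f"Definition A ratio: {fall_time_hours / fleet_flight_hours:.1e}")
print(f"Definition B ratio: {lost_hours / fleet_flight_hours:.1e}")
# The two "reliability ratios" differ by several orders of magnitude, which is
# the point: the number is an artifact of the definition, not a measurement.
```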
“Downtime” in normal software is not always caused by catastrophic failure: sometimes it’s due to maintenance and upgrades, sometimes it’s due to poor performance, and sometimes it’s caused by people actively trying to fix bugs on a running system. None of those things happen during an airplane’s uptime.
> One of the only ways to compare processes and not be tricked by fancy words, especially as a non-expert, is to look and compare actual outcomes.
I’m not arguing against comparing outcomes. I’d agree that looking at outcomes is a good thing, if, and only if, you are actually fair about seeing all outcomes. I’m suggesting that pointing at the far more easily verifiable volume of testing effort and safety culture around aviation software, compared with how much testing and verification happens on “normal software”, would do a better job of persuading someone who didn’t already know it that aviation software bugs are taken far more seriously than web-app bugs are.