Hacker News

> For this reason, formal proofs of correctness are valuable for distributed algorithms.

I have a massive amount of respect for Martin and his work, but I think that this emphasis is the wrong one (if our goal is to increase the correctness of deployed distributed algorithms).

Instead, I like to think (building off the work of folks like Ankush Desai, an AWS colleague and the creator of P) that the process of specification, and its products, are more valuable than the creation of a proof. Model checking, either exhaustive or stochastic, is valuable for checking the properties of specifications, but pushing the last step through to a full proof is often not worth the additional effort. Making proof the focus and goal tends to turn most programmers off the whole topic.
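To make the "exhaustive model checking" idea concrete, here's a toy sketch (my own made-up example, not the output of P, TLC, or any real tool): a breadth-first search over the state space of a classic lost-update race, checking an invariant at every reachable state and returning a counterexample if one exists.

```python
from collections import deque

# Toy model: two nodes each increment a shared counter via read-then-write.
# State: (counter, node0_read, node1_read, node0_done, node1_done)
# Desired invariant: once both nodes are done, counter == 2.
# Exhaustive exploration finds the lost-update interleaving that breaks it.

def initial_state():
    return (0, None, None, False, False)

def successors(state):
    counter, r0, r1, d0, d1 = state
    out = []
    if not d0:  # node 0: first read the counter, then write read+1
        if r0 is None:
            out.append((counter, counter, r1, d0, d1))
        else:
            out.append((r0 + 1, r0, r1, True, d1))
    if not d1:  # node 1: same two steps
        if r1 is None:
            out.append((counter, r0, counter, d0, d1))
        else:
            out.append((r1 + 1, r0, r1, d0, True))
    return out

def invariant(state):
    counter, _, _, d0, d1 = state
    return counter == 2 if (d0 and d1) else True

def check():
    """BFS over all reachable states; return a violating state, or None."""
    seen = {initial_state()}
    frontier = deque(seen)
    while frontier:
        s = frontier.popleft()
        if not invariant(s):
            return s  # counterexample found
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return None

cex = check()  # a state where both nodes finished but counter == 1
```

The interesting part is that the checker hands you a concrete violating state (both nodes read 0 before either wrote), which is exactly the kind of feedback that makes the specification journey valuable even without a proof.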

I think we should be saying to folks "If you pick up these tools, you'll end up with a crisp specification that'll help you move faster at development time, extremely clear documentation of your protocol, and a deeper understanding of what your protocol does" instead of "if you pick up these tools, you'll end up with the mathematical artifact of a proof". This has been a big shift in my own thinking. When I first picked up Promela, Alloy, and TLA+/TLC/TLAPS, my own focus was on proof (and remained so until 2014 or so, when I spent a lot of time talking to Chris Newcombe and Leslie Lamport as we were writing the "Formal Methods at AWS" CACM paper). Over time I've found the journey much more valuable than the artifact, and tools like PlusCal and P, which make specification more approachable, more valuable than tools with less familiar syntax and semantics.

We shouldn't be surprised that few people can make progress when we ask them to understand both Paxos and Isabelle at the same time! Maybe it's OK if that's your focus as a PhD student.

> The real challenge in verifying distributed algorithms is to come up with the right invariant that is both true and also implies the properties you want your algorithm to have. Unfortunately, designing this invariant has to be done manually.

This is true. Invariants are hard. But maybe I'm more optimistic: I think there are a lot of very real systems with straightforward invariants. Take Lamport's three invariants for consensus (section 2.1 of Paxos Made Simple), for example: https://lamport.azurewebsites.net/pubs/paxos-simple.pdf The same goes for invariants like linearizability, and even for formal definitions of isolation levels.
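For readers who haven't seen them, Lamport's three safety requirements from that section (only a proposed value may be chosen; only a single value is chosen; a process learns a value only if it was actually chosen) can be sketched as simple predicates. This is a hypothetical illustration, with `proposed`/`chosen`/`learned` as made-up names for sets of values collected from one run of a protocol:

```python
# Sketch of Lamport's three consensus safety properties
# (Paxos Made Simple, section 2.1) as predicates over a recorded run.
# The sets and names are illustrative, not from any real tool.

def consensus_invariants(proposed, chosen, learned):
    only_proposed_chosen = chosen <= proposed  # chosen values were proposed
    single_value_chosen = len(chosen) <= 1     # at most one value is chosen
    only_chosen_learned = learned <= chosen    # only chosen values are learned
    return only_proposed_chosen and single_value_chosen and only_chosen_learned

# A run that satisfies the invariants:
ok = consensus_invariants({"a", "b"}, {"a"}, {"a"})
# A split-brain run that violates them (two different values chosen):
bad = consensus_invariants({"a", "b"}, {"a", "b"}, {"a"})
```

The point is how little machinery these invariants need: they're set-inclusion checks, which is what makes me optimistic that many real systems have straightforward invariants.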

The research I would love to see is work to synthesize invariants from formalisms like Adya's or Crooks' work on database isolation levels. I think that's a very tractable project, and it would be super useful to practitioners.



Hi there! Martin here.

I agree that for most engineers, getting into formal proof is intimidatingly hard and perhaps not a good use of time, because lighter-weight specification languages and model-checking can help develop a good understanding of a system's behaviour with a much lower time investment.

But the intended audience for this post was not engineers working on production systems; it was intended for researchers studying distributed algorithms (e.g. consensus algorithms), and researchers who already know proof assistants and want to know how to use their skills to verify distributed algorithms. Note that this post appeared on a blog called "Machine Logic", not "Practical Distributed Systems Engineering"!

I have personally got a lot of value out of formal verification because the distributed algorithms I work on (CRDTs) are sometimes so subtle that, without a proof of correctness, I simply don't believe they are correct. Several times I have developed algorithms that I believed to be correct, only to find out later that I was wrong. Here's a case study of one such example: https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-969.pdf

For people like myself, who are designing such algorithms, I believe that formal proof is worth the effort. But most people are not in this situation.


Are your focus and Dr. Kleppmann's mutually exclusive? I ask because it seems like the Isabelle formulation emerges from the protocol itself. The invariant is the only added work, yes?

I'm effectively asking if the proof generation could replace the model checking as a way of verifying that you have appropriately specified your protocol. Big information exchange protocols can be confusing!

(Maybe I'm misunderstanding what an invariant is... I'm thinking it's the property proved true by induction, and that it would go something along the lines of "if and only if the good has been sold, then the merchant has received payment for the good".)



