> “seems” / “seem” Are you intending to suggest there are no formal proofs since...

ingloriousB · 2025-08-17T15:15:03 1755443703

:) I'm only using "seem" to indicate the limits of my knowledge.

I was just hoping someone would chime in with a link to a stronger/formal proof for VSR. Are you aware of any?

So, yes, to my limited knowledge, I've not found any existing formal proofs for VSR.

The 2012 VSR revisited clearly labels their arguments informal; in section 8: "In this section we provide an informal discussion of the correctness of the protocol."

I'd be delighted to learn of any formal / machine-checked proofs of VSR, equivalent to the Verdi project for Raft.

These are elaborate and subtle protocols, easy to get wrong.

In particular, when it comes to things like reconfiguration, even Raft had the famous bug (announced in 2015) in the simpler of its two reconfiguration protocols.

Note that Verdi did not attempt to verify the reconfiguration protocol; apparently it was too difficult.

There was an attempt earlier this year to give a proven reconfiguration protocol for Raft called ReCraft[1], based on an earlier paper called Adore[2]. They discuss why reconfiguration is so difficult to prove. It has to do with circularity.

"ReCraft: Self-Contained Split, Merge, and Membership Change of Raft Protocol" by Kezhi Xiong, Soonwon Moon, Joshua Kang, Bryant Curto, Jieung Kim, Ji-Yong Shin. last revised 28 Apr 2025, v2.

[1] https://arxiv.org/abs/2504.14802

[2] https://dl.acm.org/doi/pdf/10.1145/3519939.3523444

I'm not completely convinced yet that ReCraft works; at one point I thought they assumed away certain scenarios -- but I need to revisit it with a close reading.

At a minimum, reading the Adore paper's discussion of how much subtlety is involved is pretty compelling.

My conclusion is that formal proof is an absolute necessity to have a fighting chance at a correct implementation--especially when it comes to reconfiguration.

jorangreef · 2025-08-17T15:24:06 1755444246

There are at least two formal proofs.

Have you tried Googling for them, instead of creating a throwaway account to comment anonymously here? :)

ingloriousB · 2025-08-17T15:56:58 1755446218

Per the site guidelines[1], please avoid gratuitous negativity.

[1] https://www.ycombinator.com/blog/new-hacker-news-guideline

Certainly I've googled. I have found no proofs.

Surely by now you would actually exhibit the formal proof if you had one, right? (He asks, for the 3rd time.)

Note that a TLA+ spec is not a formal proof. Also note that a model checking run is not a formal proof.
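To illustrate the gap between a bounded check and a proof, here is a toy sketch in Python. It is not a consensus protocol and has nothing to do with TLA+ tooling; the function names are made up for the example. It uses Euler's polynomial n^2 + n + 41, which yields primes for every n from 0 to 39 and then fails at n = 40.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small numbers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def claim_holds(n: int) -> bool:
    """The 'invariant' under test: n^2 + n + 41 is prime."""
    return is_prime(n * n + n + 41)


def bounded_check(bound: int) -> bool:
    """Exhaustively check the claim for 0 <= n < bound, like a bounded run."""
    return all(claim_holds(n) for n in range(bound))


def first_counterexample(limit: int):
    """Return the smallest n < limit that violates the claim, or None."""
    for n in range(limit):
        if not claim_holds(n):
            return n
    return None
```

A run with a bound of 40 passes cleanly, yet the claim is false in general (40^2 + 40 + 41 = 1681 = 41^2). A passing bounded run says nothing beyond its bound; a proof covers every n.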

I'm still hoping to learn something new about how well proven, say, the reconfiguration or recovery protocols in VSR are.

So far I can only conclude that my research has turned up nothing as far as a formal proof for VSR goes. Please show me I'm wrong with a link to one. :)

jorangreef · 2025-08-17T18:12:37 1755454357

I don't mean to trap you, my anonymous friend :) but does the formal proof for Raft in Coq via Verdi not apply—at least in spirit—to the essential view change and SMR protocol for Viewstamped Replication? And, similarly, would you say that Viewstamped Replication's core view change and SMR protocol is really that different from Multi-Paxos as a superset—such that that proof also wouldn’t carry over?

I agree that proofs are sensitive to modeling choices. But the reason I ask is that the literature generally treats the core of these protocols (view change + SMR—reconfiguration aside for now) as essentially equivalent.

For example, I'm sure you're aware of Heidi Howard's work here, which unifies consensus under one framework, the main differences being election style (i.e. Raft does random, VSR does round-robin) and terminology, not fundamental mechanics. The upside being that optimizations and sub-protocols, such as reconfiguration, can then be shared across protocols.
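As a rough sketch of that election-style difference (not taken from either paper's pseudocode): in VSR the primary can be derived from the view number, while Raft candidates pick a randomized election timeout. The zero-based replica indexing, the function names, and the 150-300 ms default range are assumptions for illustration only.

```python
import random


def vsr_primary(view_number: int, replica_count: int) -> int:
    """Round-robin: the primary is a pure function of the view number.

    Assumes replicas are indexed 0..replica_count-1 in a fixed order.
    """
    return view_number % replica_count


def raft_election_timeout_ms(rng: random.Random,
                             low_ms: float = 150.0,
                             high_ms: float = 300.0) -> float:
    """Randomized: each server draws its own timeout from a fixed interval.

    Whichever server times out first becomes a candidate and requests votes.
    """
    return rng.uniform(low_ms, high_ms)


if __name__ == "__main__":
    # Successive views rotate the primary through the replicas.
    print([vsr_primary(v, 3) for v in range(6)])
    # Each Raft server gets a different timeout, which helps avoid split votes.
    rng = random.Random(0)
    print([round(raft_election_timeout_ms(rng)) for _ in range(3)])
```

In the round-robin case every replica can compute the next primary locally with no extra messages; in the randomized case leadership goes to whichever candidate wins the vote.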

To your point about reconfiguration, reconfiguration sub-protocols are a field in themselves, and here it’s common to mix and match. To be clear, I'm not aware of a proof for the reconfiguration sub-protocol in '12 VRR (and I've found a bug in its client crash recovery sub-protocol—with Dan Ports finding another in its recovery sub-protocol), but again, as Howard notes, since the SMR cores are equivalent, you can adopt a reconfiguration sub-protocol or session sub-protocol that has been proven—at least this is common practice in production systems.

I hope the spirit of the argument is clear. And trust that none of this changes the OP point: that VSR pioneered the field and that Raft (in the authors' own words) is "most notably" similar to Viewstamped Replication.

(Let's not get into the subject of actual implementation correctness, which is orders of magnitude harder than formal design proofs, or the fact that the formal proofs in question still lack a storage fault model; for example, many Raft implementations violate the findings of “Protocol-Aware Recovery for Consensus-Based Storage”.)