Hacker News | GermanJablo's comments

Interesting. I'm the author of DocNode, a library that does exactly what you're describing; it might be useful. https://docukit.dev

Cheers!


> But with some optimisation work, CRDTs perform better than OT based systems.

I read your paper and I think this is a mistake. You assume that OT has quadratic complexity because you're considering classic operation-based OT. But OT can be id-based, in which case operations are transformed directly on the document, not on other operations. This is essentially a CRDT without the problems of supporting P2P, and therefore the best CRDT will never perform better than the best OT.

> CRDTs let you scale your backend. OT (usually) requires server affinity. CRDT based systems are way more flexible. Personally I'd rather complex code and simpler networking than the other way around.

All productivity apps that use these tools in any way shard by workspace or user, so OT can scale very well.

If you don't scale CRDTs that way, by the way, you'd be relying too much on "eventual consistency" instead of "consistency as quickly as possible."

> (For anyone who's read the papers, I'm conflating OT == the old Jupiter-based OT algorithm that's popular in Google Docs and others.)

Similar to what I said before. I think limiting OT to an implementation that’s over three decades old doesn’t do OT justice.


> I think limiting OT to an implementation that’s over three decades old doesn’t do OT justice.

I haven't kept up with the OT literature after a string of papers turned out to "prove correctness" in systems which later turned out to have bugs. And so many of these algorithms have abysmally bad performance. I think I implemented an O(n^4) algorithm once to see if it was correct, but it was so slow that I couldn't even fuzz test it properly.

> You assume that OT has quadratic complexity because you're considering classic operation-based OT. But OT can be id-based, in which case operations are transformed directly on the document, not on other operations.

If you go down that road, we can make systems which are both OT and CRDT based at the same time. Arguably my eg-walker algorithm is exactly this. In eg-walker, we transform operations just like you say - using an in memory document model. And we get most of the benefits of OT - including being able to separately store unadorned document snapshots and historical operation logs.

Eg-walker is only a CRDT in the sense that it uses a grow-only CRDT of operations, shared between peers, to get the full set of operations. The real work is an OT system that gets run on each peer to materialise the actual document.

> This is essentially CRDT without the problems of supporting P2P, and therefore the best CRDT will never perform better than the best OT.

Citation needed. I've published plenty of benchmarks over the years from real experiments. If you think I'm wrong, do the work and show data.

My contention is that the parts of a CRDT which make them correct in P2P settings don't cost performance. What actually matters for performance is using the right data structures and algorithms.


> Citation needed

It seems to me the burden of proof is on you. You were the one who claimed that “CRDTs perform better than OT-based systems.” I’m simply denying it. My reasoning is that CRDTs require idempotence and commutativity, while OTs do not. What requirement does OT have that CRDT does not? Because if there isn’t one, then by definition your claim can’t be correct. And if there is one, that would be new to me, although I suspect you might be using a very particular definition of OT.


> It seems to me the burden of proof is on you. You were the one who claimed that “CRDTs perform better than OT-based systems.”

Ah, I assumed we were talking about Jupiter based OT systems - which are outperformed by their newer cousins (like eg-walker). Like you say, these use a different data structure to transform changes and that's why they're faster.

> My reasoning is that CRDTs require idempotence and commutativity, while OTs do not.

The only property not required by a centralized OT system is the OT TP2 property. Ie, T(op3, op1 + T(op2, op1)) == T(op3, op2 + T(op1, op2)). Central servers also give you a single global ordering.

If you discard TP2 and add global ordering, does that open the door to new optimisations? I don't know, and I certainly can't prove the absence of any such optimisations. So I think the burden of proof is on you.
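For readers who haven't seen the T() notation concretely: a minimal TP1-level transform for two concurrent text inserts might look like the sketch below. It ignores the equal-position tie-break, which real systems resolve deterministically (e.g. by site ID); everything here is illustrative, not any particular library's API.

```typescript
// Sketch of a TP1-level transform for concurrent text inserts.
// transform(a, b) rewrites a's position as if b had already been applied.
interface Insert { pos: number; text: string }

const transform = (a: Insert, b: Insert): Insert =>
  b.pos <= a.pos ? { ...a, pos: a.pos + b.text.length } : a;

const apply = (doc: string, op: Insert): string =>
  doc.slice(0, op.pos) + op.text + doc.slice(op.pos);

// Two users edit "abc" concurrently:
const opA: Insert = { pos: 1, text: "X" }; // user A wants "aXbc"
const opB: Insert = { pos: 2, text: "Y" }; // user B wants "abYc"

// Each side applies its own op, then the transformed remote op,
// and both converge on the same document:
const docA = apply(apply("abc", opA), transform(opB, opA)); // "aXbYc"
const docB = apply(apply("abc", opB), transform(opA, opB)); // "aXbYc"
```

TP1 only guarantees convergence for this pairwise case; TP2 is the extra property needed when a third concurrent op can be transformed along two different equivalent paths, which is what a central server's single ordering lets you discard.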


The root of our misunderstanding or debate is clear: although CRDT is fairly well defined, I don’t think the same is true for OT.

What I have in mind is what I mentioned earlier:

> OT can be id-based, in which case operations are transformed directly on the document, not on other operations.

This is exactly what I do in my library DocNode[1], which I describe as “id-based OT”.

With this model, it’s not even necessary to satisfy TP1. In fact, the concept T(o1, o2) doesn’t exist, because operations aren’t “transformed” against other operations, but against the document. Maybe the word “transform” is a bit misleading, and “apply” would be more appropriate. The problem is that there is still a slight transformation. For example, if a client inserts a node between nodes A and B, but by the time it reaches the server B has been deleted, the effective operation might become “insert between A and C”.
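A sketch of that "slight transformation" (hypothetical types and names, not DocNode's actual API): because the op names its anchor by id and is applied against the current document, the concurrent deletion of B needs no transform function at all.

```typescript
// Hypothetical sketch of id-based application: the operation names its
// anchor node by id, so it is applied against the current document
// instead of being transformed against other operations.
// Types and names are illustrative, not DocNode's actual API.
interface DocNode { id: string; value: string }
interface InsertAfter { afterId: string; node: DocNode }

function applyInsert(doc: DocNode[], op: InsertAfter): DocNode[] {
  const i = doc.findIndex(n => n.id === op.afterId);
  if (i === -1) return doc; // the anchor itself was deleted: a policy choice
  return [...doc.slice(0, i + 1), op.node, ...doc.slice(i + 1)];
}

// The client inserted "1" between A and B, but B was deleted before the
// op reached the server. Anchored on A, the effective operation becomes
// "insert between A and C":
const doc: DocNode[] = [{ id: "A", value: "a" }, { id: "C", value: "c" }];
const merged = applyInsert(doc, { afterId: "A", node: { id: "N", value: "1" } });
// merged ids: A, N, C
```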

The server is append-only. The client has several options to synchronize with the server: rebasing, undo-do-redo, or overwriting the document.

Maybe I’m the one who shouldn’t describe DocNode as “[id-based] OT” and should instead coin a new term. Operational Application (OA)? Operations Without Transformation (OWT)? Operations Directly Transformed (ODT)? Operational Rebasing (OR)? Not sure. What would you recommend?

[1] https://www.docukit.dev/docnode


The biggest evidence for collaborative editing is the immense popularity of Google Docs, Notion and Figma.

Just because programming code isn't a good use case for automated conflict resolution doesn't mean everything else isn't.

Just imagine non-technical people using git to collaborate on a report, essay, or blog post. It's never going to happen.


> (1) we found (contrary to popular belief) that OT actually does not require a centralized server

In theory, yes, but in practice, any OT that operates without a central server (or master peer) essentially ends up being a CRDT. A CRDT is a subset of OT, specifically one that adds the requirement of P2P support.

> (2) we found it to be harder to implement OT exactly right vs CRDTs

I would say that each has its own complexity in different areas. CRDT's complexity lies in its data structure and algorithm, while OT's lies in its sync engine (since it must handle race conditions and guarantee deterministic ordering). In my opinion, OT is simpler overall. Hopefully DocNode and DocSync will make OT even easier.

> (3) we found many (though not all) of the problems that CRDTs have, are also problems in practice for OT

Oh, definitely not! OT has many benefits[1]. I think the misconception stems from the common belief that OT should be positional, rather than id-based. In the first case, operations are transformed on other operations. In the second case, operations can also be transformed on the current document (O(1)), eliminating the problems commonly associated with OT. This is the approach I use in DocNode.
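A sketch of why id-based application can be O(1) per operation (an assumed structure, not DocNode's actual internals): index nodes by id, so applying an insert-after is one map lookup rather than a scan over concurrent operations.

```typescript
// Sketch of O(1) id-based application: the document indexes nodes by id,
// so an insert-after operation needs one map lookup, not a transform
// against other operations. Illustrative structure, not DocNode's internals.
class Doc {
  private nodes = new Map<string, { value: string; nextId: string | null }>();
  private headId: string | null = null;

  insertAfter(afterId: string | null, id: string, value: string): void {
    if (afterId === null) {
      // Insert at the head of the document.
      this.nodes.set(id, { value, nextId: this.headId });
      this.headId = id;
      return;
    }
    const anchor = this.nodes.get(afterId); // O(1) lookup by id
    if (!anchor) return; // anchor concurrently deleted: a policy choice
    this.nodes.set(id, { value, nextId: anchor.nextId });
    anchor.nextId = id;
  }

  toString(): string {
    let out = "";
    for (let cur = this.headId; cur !== null; cur = this.nodes.get(cur)!.nextId) {
      out += this.nodes.get(cur)!.value;
    }
    return out;
  }
}

const d = new Doc();
d.insertAfter(null, "A", "a");
d.insertAfter("A", "B", "b");
// d.toString() === "ab"
```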

> the problems CRDTs have in general are vastly worse to the end-user experience.

This is 100% correct.

____

https://www.docukit.dev/docnode#how-does-it-compare-to-yjs


> You don't think Figma is a serious application?

I don't know where this popular belief came from. The Figma blog literally says "Figma isn't using true CRDTs"[1].

> The only real benefit of OT is that its simpler to reason about.

That's incorrect. When you free yourself from the P2P restriction that CRDTs are subject to, there's a huge amount of metadata you can get rid of, just to mention one benefit.

[1] https://www.figma.com/blog/how-figmas-multiplayer-technology...


> I don't know where this popular belief came from.

It may be worth reading the whole paragraph of the blog you referenced...

> Figma isn't using true CRDTs though. CRDTs are designed for decentralized systems where there is no single central authority to decide what the final state should be. There is some unavoidable performance and memory overhead with doing this. Since Figma is centralized (our server is the central authority), we can simplify our system by removing this extra overhead and benefit from a faster and leaner implementation.

> It’s also worth noting that Figma's data structure isn't a single CRDT. Instead it's inspired by multiple separate CRDTs and uses them in combination to create the final data structure that represents a Figma document (described below).

So, it's multiple CRDTs, not just one. And they've made some optimizations on top, but that doesn't make it not a CRDT?


Nowhere does it say it's multiple CRDTs. It says "isn't a single CRDT" and that "it's inspired by multiple separate CRDTs." A bit confusing, I agree.

By the way, I work at Figma.


> When you free yourself from the P2P restriction that CRDTs are subject to...

CRDT stands for "Conflict-free replicated data type". They're just any data type with an idempotent, commutative merge function defined over the entire range of inputs. Technically, CRDTs have nothing to do with networks at all.

The simplest "real" CRDT is integers and the MAX function. Eg, you think of an integer. I think of an integer. We merge by taking the max of our integers. This is a CRDT. Would it have eventual consistency? Yes! Would it work in a server-to-client setup? Of course! Does it need any metadata at all? Nope! It doesn't even need versions.
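That max-register example fits in a few lines. The sketch below just restates the paragraph above as code: merge is commutative and idempotent, so replicas converge no matter the order or duplication of messages.

```typescript
// The simplest "real" CRDT: integers merged with max().
// merge is commutative, associative, and idempotent, so replicas
// converge regardless of message order or duplicated deliveries.
const merge = (a: number, b: number): number => Math.max(a, b);

// Two replicas start from different local values...
let replicaA = 7;
let replicaB = 3;

// ...and exchange states in any order, possibly with duplicates:
replicaA = merge(replicaA, replicaB); // 7
replicaB = merge(replicaB, 7);        // 7
replicaB = merge(replicaB, 7);        // idempotent: still 7
// Converged, with no versions or metadata at all.
```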

There's no such thing as a "P2P restriction". Anything that works p2p also works server-to-client. You can always treat the server and client as peers.

> there's a huge amount of metadata you can get rid of, just to mention one benefit.

Can you give some examples of this metadata? In my experience (10-15 years with this stuff now), I've found you can get the same or better performance out of CRDTs and OT based systems if you're willing to make the same set of tradeoffs.

For example, OT has a tradeoff where you can discard old operations. The cost of doing so is that you can no longer merge old changes. But, you can do exactly the same thing in CRDTs, with the same cost and same benefits. Yjs calls this "garbage collection".

In my own eg-walker algorithm, there are 3 different ways you can do this. You can throw away old operations like in OT based systems - with the same cost and benefit. You can keep old metadata but throw away old data. This lets you still merge but you can't see old versions. Or you can keep old edits on the server and only lazy-load them on the client. Clients are small and fast, and you have full change history.

CRDTs generally give you more options. But more options = more complexity.

I'm no p2p idealist. Central servers definitely make some things easier, like access control. But CRDTs still work great in a centralised context.


Ok, replace "P2P restriction" with "idempotent, commutative restriction".

> For example, OT has a tradeoff where you can discard old operations. The cost of doing so is that you can no longer merge old changes.

Why wouldn’t you be able to? My server receives operations, applies them to the document, and discards them. It can receive operations as old as it wants.

___

> Can you give some examples of this metadata?

Yes, it depends on the CRDT, but if we're talking about lists or tree structures with insert and delete operations, these can come in the form of tombstones, or operation logs, or originRight or originLeft, or a DAG. Even with a garbage collector, the CRDT needs to retain some of this metadata.

Yes, you can optimize by not bringing it into memory when it’s not needed. But they’re still there, even though they could be avoided entirely if you assume a central server that guarantees a deterministic ordering of operations.


> Why wouldn’t you be able to? My server receives operations, applies them to the document, and discards them. It can receive operations as old as it wants.

OT needs to do operation transformation. If you get passed old operations, you need to transform them before they can be applied. This requires keeping extra data around. I mean, this is entirely the point of the metadata you describe in the second part of your comment:

> these can come in the form of tombstones, or operation logs, or originRight or originLeft, or a DAG. Even with a garbage collector, the CRDT needs to retain some of this metadata.

> But they’re still there, even though they could be avoided entirely if you assume a central server that guarantees a deterministic ordering of operations.

A central server doesn't remove the need for this data though! Let's assume a completely deterministic ordering of operations on the server, like Jupiter. If the server is at version 1000, and I send the server an operation at version 900, the server needs to use information about operations 900-1000 to be able to apply the change. This is true in Jupiter OT - which uses the actual operations. Or in Yjs or Diamond Types or any other collaborative text editing system. We need some of that information to figure out how to transform the incoming change and order it correctly.
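The bookkeeping described above can be sketched with a Jupiter-style central server (illustrative names, inserts only, no deletes): to accept an op made against an old version, the server transforms it against every later committed op, which is exactly why some history must be retained.

```typescript
// Sketch of a Jupiter-style central server. To apply an op made against
// an old version, it transforms the op against every op committed since,
// so that window of history must be retained. Illustrative, inserts only;
// at equal positions, already-committed ops win (one deterministic choice).
interface Op { version: number; pos: number; text: string }

class Server {
  version = 0;
  doc = "";
  private history: Op[] = []; // ops newer than the oldest supported client

  receive(clientVersion: number, pos: number, text: string): void {
    // Transform against every op the client hadn't seen yet.
    for (const past of this.history) {
      if (past.version > clientVersion && past.pos <= pos) {
        pos += past.text.length;
      }
    }
    this.doc = this.doc.slice(0, pos) + text + this.doc.slice(pos);
    this.history.push({ version: ++this.version, pos, text });
  }
}

const s = new Server();
s.receive(0, 0, "ab"); // v1: doc is "ab"
s.receive(1, 2, "cd"); // v2: doc is "abcd"
// A client still at version 1 (it has only seen "ab") appends at pos 2.
// The server shifts the op past "cd" using its history: doc becomes "abcd!".
s.receive(1, 2, "!");
```

Discarding `history` older than the laggiest client is exactly the "can't merge old changes" tradeoff mentioned earlier in the thread.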

At least, I've never seen or heard of any scheme which doesn't need some data like this.


> A central server doesn't remove the need for this data though!

Yes, it's possible. The problem is that we're using different definitions of what OT means. This conversation has converged on the same point we started in another thread; I suggest we continue it there:

https://news.ycombinator.com/item?id=47436249


Author of DocNode here. Yes, it’s still early days. But it’s a very robust library that I don’t expect will go through many breaking changes. It has been developed privately for over 2 years and has 100% test coverage. Additionally, each test uses a wrapper to validate things like operation reversibility, consistency across different replicas, etc.

DocSync, which is the sync engine mainly designed with DocNode in mind, I would say is a bit less mature.

I’d love it if you could take a look and see if there’s anything that doesn’t convince you. I’ll be happy to answer any questions.


I've looked through the site, and right now it's probably the thing I'd try out first, but my main concerns are the missing documentation, particularly the more cookbook-y kinds of documentation: how you might achieve such-and-such effect, etc. For example, the sync example is very terse, although I can understand why you'd like to encourage people to use the more robust, paid-for solution! Also just general advice on how to use DocNode effectively from your experience would be useful, things like schema design or notes about how each operation works and when to prefer one kind of operation or structure over another.

All that said, I feel like the documentation has improved since the last time I looked, and I suspect a lot of the finer details come with community and experience.


Thanks! I've recently made some improvements to the documentation. I agree the synchronization section could be improved more. I'll keep your feedback in mind. If you'd like to try the library, feel free to ask me anything on Discord and I'll help you.

I remember reading Part 1 back in the day, and this is also an excellent article.

I’ve spent 3+ years fighting the same problems while building DocNode and DocSync, two libraries that do exactly what you describe.

DocSync is a client-server library that synchronizes documents of any type (Yjs, Loro, Automerge, DocNode) while guaranteeing that all clients apply operations in the same order. It’s a lot more than 40 lines because it handles many things beyond what’s described here. For example:

It’s local-first, which means you have to handle race conditions.

Multi-tab synchronization works via BroadcastChannel even offline, which is another source of race conditions that needs to be controlled.

DocNode is an alternative to Yjs, but with all the simplicity that comes from assuming a central server. No tombstones, no metadata, no vector clock diffing, supports move operations, etc.

I think you might find them interesting. Take a look at https://docukit.dev and let me know what you think.


Hello again Germán! Since the product we make is, basically, a local-first markdown file editor, I would humbly suggest that the less-well-known algorithm we recommend is thus also local-first. But, I fully believe that you do a ton of stuff that we don't, and if we had known about it at the time, we very definitely would have taken a close look! We did not set out to do this ourselves, it just kind of ended up that way.

Cool! We also build client-server sync for our local-first CMS: https://github.com/valbuild/val Just as your docsync, it has to both guarantee order and sync to multiple types of servers (your own computer for local dev, cloud service in prod). Base format is rfc 6902 json patches. Read the spec sheet and it is very similar :)

Looks really cool, I would love to use it in my DollarDeploy project. The documentation could still be a bit better; it's not clear whether the content is pure markdown or TypeScript files. Which GitHub repo does it synchronize to? I prefer the monorepo approach.

Awesome feedback! Will update the docs! The content is TS files. You can choose which GitHub repo you want to synchronize to; monorepo also works!

Should add: you can read more docs here: https://val.build/docs/create

Tiny fail at undo: insert 1 before E, Ctrl+Z, move left/right: the left editor moves around E, the right editor moves around the nonexistent 1

And for real "action" there should be a delay/pause button to simulate conflicts like the ones described in the blog


Yes, the undo issue is a known bug in the website demo because it's messing with Lexical's undo functionality. It's not actually a DocNode bug. I'll fix it soon.

The feedback about the delay/pause button is also good, thanks!


Part 1 discussion, December 2024: https://news.ycombinator.com/item?id=42343953

At least Yjs, Loro and Automerge must handle some degree of operation reordering, whether causally-consistent delivery or virtually any order.

The tombstone problem is exactly why I built DocNode. You're right that you can't compact them without consensus, so DocNode just doesn't create them. It assumes a server decides the order, which is already the case in 99% of collaborative apps. The result: a doc stays the same size whether it was edited once or a thousand times. Yjs for the same doc grows every time someone types, pastes, or undoes. Check it out: https://docukit.dev


Interesting! I'm the author of https://docnode.dev and I see some similar concepts


Thanks for the advice. I'll try it when the post is a week old. I hope it doesn't get rejected.


I don't think it will. I know a few folks who have done the same. It's part of the T&Cs that it's ok to re-submit, just not super fast.


Thank you so much for your help! I followed your advice:

https://news.ycombinator.com/item?id=46211551


Sorry that it didn't get any traction. It's a bit of a crap shoot with these things. Looks good to me tho. :)


