Posted by c17r 7 hours ago
conflict-free merging sounds cool, but doesn't that just mean that a human review step is replaced by "changes become intervals rather than collections of lines" and "last set of intervals always wins"? seems like it makes sense when the conflicts are resolved instantaneously during live editing, but does it still make sense with one-shot code merges over long intervals of time? today's systems are "get the patch right" and then "get the merge right"... can automatic intervalization be trusted?
edit: actually really interesting if you think about it. crdts have been proven with character-at-a-time edits and use of the mouse select tool... these are inherently intervalized (select) or easy (character at a time). how does it work for larger patches that can have loads of small edits?
I recently built Artifact: https://www.paganartifact.com/benny/artifact
Mirror: https://github.com/bennyschmidt/artifact
In case anyone was curious what a full rewrite of git would look like in Node!
The main difference is that on the server I only store deltas, not files, and the repo is “built”.
But yeah full alternative to git with familiar commands, and a hub to go with it.
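To illustrate the "store deltas, build the repo" idea in general terms, here is a minimal Python sketch. It is not Artifact's actual storage format: the delta shape and the `make_delta`, `apply_delta`, and `build` helpers are hypothetical, and it leans on the standard library's `difflib` to find changed line ranges.

```python
import difflib


def make_delta(old, new):
    """Record only the line ranges that changed between two versions."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])  # replace old[i1:i2] with these lines
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_delta(old, delta):
    """Rebuild the next version from the previous one plus a delta."""
    out, pos = [], 0
    for i1, i2, replacement in delta:
        out.extend(old[pos:i1])   # unchanged lines before the edit
        out.extend(replacement)   # lines introduced by the edit
        pos = i2                  # skip the lines the edit removed
    out.extend(old[pos:])
    return out


def build(deltas):
    """'Build' the current file by replaying every stored delta in order."""
    lines = []
    for delta in deltas:
        lines = apply_delta(lines, delta)
    return lines


v1 = ["a", "b", "c"]
v2 = ["a", "B", "c", "d"]
stored = [make_delta([], v1), make_delta(v1, v2)]  # only deltas are kept
rebuilt = build(stored)
```

The trade-off in a scheme like this is that reads cost a replay of the history unless snapshots are cached somewhere.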
Well, isn't that what the CRDT does in its own data structure?
Also keep in mind that syntactic correctness doesn't mean functional correctness.
There are many ways to instantiate a CRDT, and a trivial one would be "last write wins" over the whole source tree state. LWW is obviously not what you'd want for source version control. It is "correct" per its own definition, but it is not useful.
Anyone saying "CRDTs solve this" without elaborating on the specifics of their CRDT is not saying very much at all.
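To make the "last write wins" point concrete, here is a minimal Python sketch of an LWW register over a whole-tree snapshot. The `LWWRegister` class and its fields are illustrative assumptions, not any particular tool's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    """Whole-tree state tagged with a logical timestamp (hypothetical model)."""
    tree: str        # stand-in for an entire source tree snapshot
    timestamp: int   # logical clock value of the write
    replica: str     # tie-breaker so merges stay deterministic

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep whichever write has the larger (timestamp, replica) pair.
        if (other.timestamp, other.replica) > (self.timestamp, self.replica):
            return other
        return self


alice = LWWRegister(tree="fix bug in parser", timestamp=1, replica="alice")
bob = LWWRegister(tree="rename a variable", timestamp=2, replica="bob")

merged = alice.merge(bob)
```

Both merge orders converge on bob's snapshot, so the merge is conflict-free in the CRDT sense, and alice's fix is silently dropped.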
So as long as all updates have been sent to the server from all clients, it will know what “time” each character changed and be able to merge automatically.
Is that it basically?
Take a docx, write the file, parse it into entities e.g. paragraph, table, etc. and track changes on those entities instead of the binary blob. You can apply the same logic to files used in game development.
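A rough Python sketch of entity-level change tracking, assuming the file has already been parsed into entities with stable ids. The `Entity` type and `diff_entities` helper are hypothetical and are not lix's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str       # stable identifier assigned by the (assumed) parser
    kind: str     # e.g. "paragraph" or "table"
    content: str  # serialized content of the entity


def diff_entities(before, after):
    """Return sorted (change, entity_id) pairs instead of a binary blob diff."""
    old = {e.id: e for e in before}
    new = {e.id: e for e in after}
    changes = []
    for eid in old.keys() - new.keys():
        changes.append(("deleted", eid))
    for eid in new.keys() - old.keys():
        changes.append(("added", eid))
    for eid in old.keys() & new.keys():
        if old[eid] != new[eid]:
            changes.append(("modified", eid))
    return sorted(changes)


before = [Entity("p1", "paragraph", "Hello"), Entity("t1", "table", "a|b")]
after = [
    Entity("p1", "paragraph", "Hello world"),
    Entity("t1", "table", "a|b"),
    Entity("p2", "paragraph", "New section"),
]
changes = diff_entities(before, after)
```

Here `changes` reports that `p2` was added and `p1` was modified, while the untouched table produces no entry.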
The hard part is making this fast enough. But I am working on this with lix [0].
Started with the machine learning use case for datasets and model weights but seeing a lot of traction in gaming as well.
Always open for feedback and ideas to improve if you want to take it for a spin!
Partial checkouts are awkward at best, LFS locks are somehow still buggy, and the CLI doesn't support batched updates. Checking the status of a remote branch vs. your local one (to prevent conflicts) is at best naive polling.
Better rebase would be a nice-to-have, but there's still so much left to improve for trunk-based dev.
When I was screwing around with the Git file format, tricks I would use to save space like hard-linking or memory-mapping couldn't work, because data is always stored compressed after a header.
A general copy-on-write approach to save checkout space is presumably impossible, but I wonder what other people who have traveled down similar paths have concluded.
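For context on why the stored bytes never match the checked-out file, here is a small Python sketch of Git's loose-object encoding, assuming a SHA-1 repository. A loose object is the zlib-compressed concatenation of a `<type> <size>\0` header and the content, and the object id is the SHA-1 of the uncompressed header plus content. The function names are my own.

```python
import hashlib
import zlib


def encode_loose_object(content: bytes, kind: str = "blob"):
    """Return (object id, on-disk bytes) for a loose object."""
    header = f"{kind} {len(content)}\0".encode()
    store = header + content
    oid = hashlib.sha1(store).hexdigest()  # id covers header + content
    return oid, zlib.compress(store)       # header is compressed too


def decode_loose_object(data: bytes):
    """Inverse of encode_loose_object: return (kind, content)."""
    store = zlib.decompress(data)
    header, _, content = store.partition(b"\0")
    kind, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return kind, content


oid, on_disk = encode_loose_object(b"hello\n")
kind, content = decode_loose_object(on_disk)
```

Because the working-tree file is only recoverable after decompression and header stripping, it cannot share blocks with the object store through a hard link or a plain memory map.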
Is it actually okay to try to merge changes to binaries? If two people modify, say, different regions of an image file (even in PNG or another lossless compression format), the sum of the visual changes isn't necessarily equal to the sum of the byte-level changes.
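A small Python sketch of that mismatch, using a zlib stream as a stand-in for a compressed image format (real PNG adds filtering and per-chunk CRCs on top). The `merge_bytes` helper and the fake "pixel" buffer are assumptions for illustration only.

```python
import zlib


def merge_bytes(base: bytes, ours: bytes, theirs: bytes) -> bytes:
    """Naive three-way merge of equal-length buffers, byte by byte."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("buffers must have equal length")
    out = bytearray(base)
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o != b and t != b and o != t:
            raise ValueError(f"conflict at byte {i}")
        out[i] = o if o != b else t
    return bytes(out)


# Fake decoded image: 4096 "pixels" with values in 0..250.
pixels = bytes((i * 7) % 251 for i in range(4096))

ours = bytearray(pixels)
ours[10] = 255            # one person edits near the start
theirs = bytearray(pixels)
theirs[4000] = 255        # another edits near the end

# On decoded pixels the edits are disjoint and merge cleanly.
merged_pixels = merge_bytes(pixels, bytes(ours), bytes(theirs))

# On the compressed streams, both edits change the trailing Adler-32
# checksum, so the same bytes are modified on both sides.
c_base = zlib.compress(pixels)
c_ours = zlib.compress(bytes(ours))
c_theirs = zlib.compress(bytes(theirs))
trailers = {c_base[-4:], c_ours[-4:], c_theirs[-4:]}
```

The three trailers are all different, so a byte-level merge of the encoded files sees overlapping edits even though the visual changes never touch. Merging the decoded content and re-encoding avoids that.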
I.e., if I change something in my data model, that change and its context could be surfaced with agentic tooling.
It's been amazing watching it grow over the last few years.
You can choose to have a workflow where you're never directly editing any commit to "gain back autonomy" of the working copy; and if you really want to, with some scripting, you can even emulate a staging area with a specially-formatted commit below the working copy commit.