Last week, we shipped an initial version of Memo JS, a light-client implementation of Memo that can be used as a library in web-based applications. To start with, we're assuming that the file system is completely virtual and that all changes are routed directly through the library. This meant that we ended up temporarily shelving a lot of the work we did to synchronize our tree CRDT with an external file system, but we still plan to take advantage of that research in order to build the full client that's capable of observing an external repository. Shipping the light client first will hopefully let us get some feedback and iterate on other aspects of the protocol's design before introducing the complexity of interoperating with an external file system.
Currently, a Memo WorkTree always starts at a base commit and builds forward indefinitely with operations. We assume that application code will be responsible for tearing the work tree down and rebuilding it following a commit. The next step is to pull this concern into Memo itself and to allow the base commit of a replicated work tree to change over time due to operations on the underlying repository such as committing, resetting, and checking out different branches.
We're still in the middle of figuring this out. It's murky, and our thinking is still in flux. We're currently focused on the light client, which simplifies our API and reduces complexity, but we still want a design that will work when we eventually synchronize with the file system. It's unclear whether we should start integrating with the file system now or ignore its concerns entirely and hope we can make adjustments later. For now, though, here's what's emerging.
We divide the evolution of the work tree into epochs. Each epoch begins with a specific commit from the underlying repository that gives all replicas a common frame of reference, then applies additional operations on top to represent uncommitted changes in that epoch. There is one and only one active epoch at any time on a given replica. All operations are tagged with an epoch id, and the local counters used to identify operations are reset to zero at the start of each epoch. Someone joining the collaboration should only need to fetch operations associated with the most recent epoch.
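To make the epoch idea concrete, here's a minimal sketch of epoch-scoped operation identifiers. All of the names here (`EpochId`, `OperationId`, `Replica`, and so on) are hypothetical illustrations of the scheme described above, not Memo's actual types:

```rust
// Hypothetical sketch of epoch-tagged operations with per-epoch counters.

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct EpochId {
    replica_id: u32,        // replica that created the epoch
    lamport_timestamp: u64, // orders concurrent epoch creations
}

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct OperationId {
    epoch: EpochId, // every operation is tagged with its epoch
    replica_id: u32,
    seq: u64, // local counter, reset to zero at the start of each epoch
}

// Each epoch begins at a specific commit and layers operations on top.
struct Epoch {
    id: EpochId,
    base_commit: String,
    operations: Vec<OperationId>,
}

struct Replica {
    id: u32,
    epoch: EpochId, // the one active epoch on this replica
    next_seq: u64,
}

impl Replica {
    fn next_op(&mut self) -> OperationId {
        let op = OperationId {
            epoch: self.epoch,
            replica_id: self.id,
            seq: self.next_seq,
        };
        self.next_seq += 1;
        op
    }

    fn begin_epoch(&mut self, epoch: EpochId) {
        self.epoch = epoch;
        self.next_seq = 0; // counters reset when a new epoch begins
    }
}
```

Because operations carry their epoch id, a replica joining the collaboration can ask only for operations whose `epoch` matches the most recent one and ignore everything older.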
When a user performs a local Git operation such as a commit or a reset, they broadcast the creation of a new epoch. Because users can create new epochs concurrently, we always honor the epoch creation with the most recent Lamport timestamp at every replica, which will provide an arbitrary but consistent behavior for concurrent epoch creations while also respecting causality in the sequential case.
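The "honor the most recent Lamport timestamp" rule might look something like the sketch below. The replica-id tiebreak is our assumption for illustration (the post only says the choice is arbitrary but consistent); comparing the pair lexicographically makes every replica converge on the same winner regardless of delivery order:

```rust
// Hypothetical resolution rule for concurrent epoch creations.

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct EpochCreation {
    lamport_timestamp: u64,
    replica_id: u32, // tiebreak so the winner is deterministic
}

fn resolve(active: EpochCreation, incoming: EpochCreation) -> EpochCreation {
    // Highest Lamport timestamp wins; ties broken by replica id.
    // Lamport clocks respect causality, so a sequential successor
    // always has a higher timestamp and always wins.
    if (incoming.lamport_timestamp, incoming.replica_id)
        > (active.lamport_timestamp, active.replica_id)
    {
        incoming
    } else {
        active
    }
}
```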
Collaborators can reset the HEAD of the working copy to an arbitrary ref. In that case, we need to create a new epoch. Depending on the nature of the reset and the state of the file system, there may be uncommitted changes on disk. We'd also like to incorporate the concept of unsaved changes when we integrate with the file system. Both uncommitted changes and unsaved changes will need to be translated into synthetic operations that build upon the new epoch's base commit.
When the epoch creation arrives at remote replicas, it seems like they will have no choice but to perform I/O in order to scan the epoch's base entries into the tree. The base state of open buffers may also need to be re-read, and some of these open buffers may be for files that no longer exist in the new epoch's base commit.
This is where things start to feel pretty messy and confusing. What happens to these "untethered" buffers? Do we empty out the tree and build it back up as we perform I/O on the base entries, or do we preserve the old state until the new state is ready? How do races with the file system complicate all of this?
Commits create a new epoch whose state is derived from a previous epoch, although due to the potential for concurrent commits and resets, a commit doesn't always derive from the active epoch on a given replica. Ignoring the potential for partial staging for the moment, when a user creates a commit, we can characterize what they committed via a version vector that includes all observed operations in the current epoch.
If a replica receives a commit based on its active epoch (which should be the most common case), we should be able to determine the new base entries without performing I/O. The committed state should already be available as a subset of operations the replica has already seen, as characterized by the version vector, which would let us update the tree to its new state synchronously.
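A rough sketch of that version-vector filter, under the assumption that a version vector maps replica ids to the highest operation counter observed (names like `VersionVector` and `committed_subset` are ours, not Memo's API):

```rust
use std::collections::HashMap;

type ReplicaId = u32;

// Maps each replica id to the highest operation counter observed from it.
#[derive(Clone, Default)]
struct VersionVector(HashMap<ReplicaId, u64>);

impl VersionVector {
    fn includes(&self, replica: ReplicaId, seq: u64) -> bool {
        self.0.get(&replica).map_or(false, |&max| seq <= max)
    }
}

#[derive(Clone, Copy, PartialEq, Debug)]
struct Operation {
    replica_id: ReplicaId,
    seq: u64, // counter within the current epoch
}

// The operations covered by the commit's version vector characterize the
// new epoch's base state — no file-system I/O required, because the
// receiving replica has already seen every one of them.
fn committed_subset(ops: &[Operation], committed: &VersionVector) -> Vec<Operation> {
    ops.iter()
        .copied()
        .filter(|op| committed.includes(op.replica_id, op.seq))
        .collect()
}
```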
On the other hand, there's no guarantee that a commit will be based on the active epoch, thanks to diabolical concurrency scenarios, and this seems to mean that we may end up needing to do I/O anyway in some cases. That makes us wonder whether we should focus first on the ability to reset the base commit in arbitrary ways and treat commits as a special case of that.
This is a hard problem. We've made it through one wave of complexity to encounter another, and presumably that will continue. Every decision seems to be entangled with everything else, and even this summary just scratches the surface of the thought process behind this problem. But despite the daunting complexity, I'm still excited by the idea of a fully-replicated Git working copy. Git operations are the next summit to climb, and I imagine there will be more wilderness before we can settle in the fertile valley of conflict free replicated paradise.