Convergent State
Our distributed system handles conflicts, but handling conflicts is work. What if conflicts were impossible? What if concurrent changes always merged cleanly, regardless of order?
This is the promise of CRDTs---Conflict-free Replicated Data Types. Data structures mathematically designed so that concurrent modifications automatically converge to the same state.
The Commutative Insight
Look back at our change operations:
Change =
| Set { path, value }
| Increment { path, delta }
| Append { path, element } Set conflicts: if Alice sets score = 100 and Bob sets score = 200, which wins? There's no natural merge.
But Increment is different: if Alice increments by 5 and Bob increments by 3, the result is increment by 8---regardless of order. These operations commute.
// These give the same result
increment(+5) then increment(+3) → +8
increment(+3) then increment(+5) → +8This is the essence of CRDTs: design operations that commute, so order doesn't matter.
A CRDT Counter
The simplest CRDT: a counter where each client tracks its own increments.
GCounter: // "G" for "grow-only"
increments: {
clientA: 5,
clientB: 3,
clientC: 2
}
value():
return sum(values(increments)) // 10
increment(client, amount):
increments[client] += amount
merge(other):
for client in keys(other.increments):
// Take the max - handles out-of-order delivery
increments[client] = max(
increments[client] or 0,
other.increments[client]
)Why does this work? Each client only increments its own entry. The maximum operation is:
- Commutative:
- Associative:
- Idempotent:
These properties mean we can merge in any order, any number of times, and get the same result.
Positive-Negative Counter
What about decrement? We use two grow-only counters:
PNCounter: // Positive-Negative Counter
positive: GCounter
negative: GCounter
value():
return positive.value() - negative.value()
increment(client, amount):
positive.increment(client, amount)
decrement(client, amount):
negative.increment(client, amount)
merge(other):
positive.merge(other.positive)
negative.merge(other.negative)CRDT Sets
Sets are trickier. Adding is easy; removing is hard.
Grow-Only Set (G-Set)
GSet:
elements: Set
add(element):
elements.insert(element)
contains(element):
return elements.has(element)
merge(other):
elements = union(elements, other.elements)Union is commutative, associative, and idempotent. Perfect for CRDTs.
Two-Phase Set (2P-Set)
To support removal, track adds and removes separately:
TwoPhaseSet:
added: GSet
removed: GSet // "tombstone" set
add(element):
added.add(element)
remove(element):
if added.contains(element):
removed.add(element)
contains(element):
return added.contains(element)
and not removed.contains(element)
merge(other):
added.merge(other.added)
removed.merge(other.removed)Observed-Remove Set (OR-Set)
For re-adding after removal, we use unique tags:
ORSet:
elements: Map<Element, Set<Tag>>
add(element):
tag = generateUniqueTag()
elements[element].add(tag)
remove(element):
// Remove all current tags for this element
elements[element] = emptySet()
contains(element):
return elements[element].size > 0
merge(other):
for element in allElements(this, other):
elements[element] = union(
elements[element],
other.elements[element]
)Last-Writer-Wins Register
For arbitrary values (not just counters), we can use timestamps:
LWWRegister:
value: any
timestamp: number
set(newValue):
value = newValue
timestamp = now()
get():
return value
merge(other):
if other.timestamp > timestamp:
value = other.value
timestamp = other.timestampThis converges (highest timestamp wins) but may lose data. Use carefully.
CRDTs for Our Chess Game
How might CRDTs help with chess?
// Move history as a grow-only log
MoveHistory:
moves: GSet<Move> // each move has unique ID
addMove(move):
moves.add({ id: uniqueId(), ...move })
getMoves():
return sortByTimestamp(moves.elements)For the board state, we might use an LWW-Map:
ChessBoard:
squares: LWWMap<Square, Piece?>
movePiece(from, to):
piece = squares.get(from)
squares.set(from, null, now())
squares.set(to, piece, now())The Trade-offs
CRDTs aren't free:
Space overhead: Tracking per-client state, tombstones, and tags adds memory.
Complexity: CRDT implementations are subtle. Getting them right is hard.
Semantic limitations: Not every operation can be made conflict-free. Some business logic inherently conflicts.
Garbage collection: Tombstones and old metadata accumulate. Cleaning them up safely is complex.
| Traditional Sync | CRDTs | |
|---|---|---|
| Conflicts | Must be detected and resolved | Impossible by design |
| Offline | Changes queue; conflicts on reconnect | Changes merge seamlessly |
| Server | Required for coordination | Optional (peer-to-peer possible) |
| Complexity | In conflict handling | In data structure design |
When to Use CRDTs
CRDTs shine when:
- Offline operation is important
- Peer-to-peer sync is needed
- Conflict resolution should be automatic
- Operations naturally commute (counters, sets, text editing)
Traditional sync is simpler when:
- Always-online is acceptable
- A central server is available
- Conflicts need human resolution
- Operations don't naturally commute
The Frontier
CRDTs are an active research area. Modern developments include:
Operation-based CRDTs: Instead of merging state, merge operations. More efficient when operations are small.
Delta CRDTs: Send only what changed since last sync, not the full state.
Pure CRDTs: Automatically derive CRDT behavior from data type definitions.
CRDT composition: Building complex CRDTs from simpler ones, like building programs from functions.
// A shopping cart as composed CRDTs
ShoppingCart:
items: ORSet<ItemId>
quantities: Map<ItemId, PNCounter>
addItem(itemId, quantity):
items.add(itemId)
quantities[itemId].increment(quantity)
removeItem(itemId):
items.remove(itemId)We've seen how CRDTs make conflicts mathematically impossible. But metaprogramming---code examining and modifying code---raises deeper questions. What are the limits of self-reference? When does the mirror become dangerous? The next chapter explores the philosophical edges of reflection.