Convergent State

Our distributed system handles conflicts, but handling conflicts is work. What if conflicts were impossible? What if concurrent changes always merged cleanly, regardless of order?

This is the promise of CRDTs---Conflict-free Replicated Data Types. Data structures mathematically designed so that concurrent modifications automatically converge to the same state.

The Commutative Insight

Look back at our change operations:

    Change =
        | Set { path, value }
        | Increment { path, delta }
        | Append { path, element }

Set conflicts: if Alice sets score = 100 and Bob sets score = 200, which wins? There's no natural merge.

But Increment is different: if Alice increments by 5 and Bob increments by 3, the result is increment by 8---regardless of order. These operations commute.

python
    // These give the same result
    increment(+5) then increment(+3) → +8
    increment(+3) then increment(+5) → +8

This is the essence of CRDTs: design operations that commute, so order doesn't matter.

A CRDT Counter

The simplest CRDT: a counter where each client tracks its own increments.

python
    GCounter:      // "G" for "grow-only"
        increments: { 
            clientA: 5, 
            clientB: 3, 
            clientC: 2 
        }
        
        value(): 
            return sum(values(increments))  // 10
        
        increment(client, amount):
            increments[client] += amount
        
        merge(other):
            for client in keys(other.increments):
                // Take the max - handles out-of-order delivery
                increments[client] = max(
                    increments[client] or 0,
                    other.increments[client]
                )

Why does this work? Each client only increments its own entry. The maximum operation is:

  • Commutative: max(a,b)=max(b,a)max(a, b) = max(b, a)
  • Associative: max(max(a,b),c)=max(a,max(b,c))max(max(a, b), c) = max(a, max(b, c))
  • Idempotent: max(a,a)=amax(a, a) = a

These properties mean we can merge in any order, any number of times, and get the same result.

Positive-Negative Counter

What about decrement? We use two grow-only counters:

wollok
    PNCounter:     // Positive-Negative Counter
        positive: GCounter
        negative: GCounter
        
        value():
            return positive.value() - negative.value()
        
        increment(client, amount):
            positive.increment(client, amount)
        
        decrement(client, amount):
            negative.increment(client, amount)
        
        merge(other):
            positive.merge(other.positive)
            negative.merge(other.negative)

CRDT Sets

Sets are trickier. Adding is easy; removing is hard.

Grow-Only Set (G-Set)

wollok
    GSet:
        elements: Set
        
        add(element):
            elements.insert(element)
        
        contains(element):
            return elements.has(element)
        
        merge(other):
            elements = union(elements, other.elements)

Union is commutative, associative, and idempotent. Perfect for CRDTs.

Two-Phase Set (2P-Set)

To support removal, track adds and removes separately:

wollok
    TwoPhaseSet:
        added: GSet
        removed: GSet    // "tombstone" set
        
        add(element):
            added.add(element)
        
        remove(element):
            if added.contains(element):
                removed.add(element)
        
        contains(element):
            return added.contains(element) 
               and not removed.contains(element)
        
        merge(other):
            added.merge(other.added)
            removed.merge(other.removed)

Observed-Remove Set (OR-Set)

For re-adding after removal, we use unique tags:

python
    ORSet:
        elements: Map<Element, Set<Tag>>
        
        add(element):
            tag = generateUniqueTag()
            elements[element].add(tag)
        
        remove(element):
            // Remove all current tags for this element
            elements[element] = emptySet()
        
        contains(element):
            return elements[element].size > 0
        
        merge(other):
            for element in allElements(this, other):
                elements[element] = union(
                    elements[element],
                    other.elements[element]
                )

Last-Writer-Wins Register

For arbitrary values (not just counters), we can use timestamps:

yaml
    LWWRegister:
        value: any
        timestamp: number
        
        set(newValue):
            value = newValue
            timestamp = now()
        
        get():
            return value
        
        merge(other):
            if other.timestamp > timestamp:
                value = other.value
                timestamp = other.timestamp

This converges (highest timestamp wins) but may lose data. Use carefully.

CRDTs for Our Chess Game

How might CRDTs help with chess?

python
    // Move history as a grow-only log
    MoveHistory:
        moves: GSet<Move>   // each move has unique ID
        
        addMove(move):
            moves.add({ id: uniqueId(), ...move })
        
        getMoves():
            return sortByTimestamp(moves.elements)

For the board state, we might use an LWW-Map:

sql
    ChessBoard:
        squares: LWWMap<Square, Piece?>
        
        movePiece(from, to):
            piece = squares.get(from)
            squares.set(from, null, now())
            squares.set(to, piece, now())

The Trade-offs

CRDTs aren't free:

Space overhead: Tracking per-client state, tombstones, and tags adds memory.

Complexity: CRDT implementations are subtle. Getting them right is hard.

Semantic limitations: Not every operation can be made conflict-free. Some business logic inherently conflicts.

Garbage collection: Tombstones and old metadata accumulate. Cleaning them up safely is complex.

\textwidth
Traditional SyncCRDTs
ConflictsMust be detected and resolvedImpossible by design
OfflineChanges queue; conflicts on reconnectChanges merge seamlessly
ServerRequired for coordinationOptional (peer-to-peer possible)
ComplexityIn conflict handlingIn data structure design

When to Use CRDTs

CRDTs shine when:

  • Offline operation is important
  • Peer-to-peer sync is needed
  • Conflict resolution should be automatic
  • Operations naturally commute (counters, sets, text editing)

Traditional sync is simpler when:

  • Always-online is acceptable
  • A central server is available
  • Conflicts need human resolution
  • Operations don't naturally commute

The Frontier

CRDTs are an active research area. Modern developments include:

Operation-based CRDTs: Instead of merging state, merge operations. More efficient when operations are small.

Delta CRDTs: Send only what changed since last sync, not the full state.

Pure CRDTs: Automatically derive CRDT behavior from data type definitions.

CRDT composition: Building complex CRDTs from simpler ones, like building programs from functions.

python
    // A shopping cart as composed CRDTs
    ShoppingCart:
        items: ORSet<ItemId>
        quantities: Map<ItemId, PNCounter>
        
        addItem(itemId, quantity):
            items.add(itemId)
            quantities[itemId].increment(quantity)
        
        removeItem(itemId):
            items.remove(itemId)

We've seen how CRDTs make conflicts mathematically impossible. But metaprogramming---code examining and modifying code---raises deeper questions. What are the limits of self-reference? When does the mirror become dangerous? The next chapter explores the philosophical edges of reflection.