Distributed Coordination: The Hidden Component
Table of Contents
Every distributed system coordinates. When Ticketmaster prevents double-booking, when a payment processor charges exactly once across retries, when Google Docs merges concurrent edits from two users into the same document — coordination is happening. But unlike message brokers (Kafka), search indices (OpenSearch), or databases (PostgreSQL), coordination has no canonical wire protocol. Every system implements it independently, from scratch, under different names.
This series makes that hidden layer explicit. It names the mechanisms, traces them through infrastructure (etcd, Raft, Cassandra) and application logic (Ticketmaster, Payment, Uber), and shows the failure modes that arise when the mechanisms are applied incorrectly or at the wrong level.
The Framework #
Before reaching for a specific mechanism, any coordination problem can be analyzed with six questions:
- Who participates? → Membership
- What’s the structure? → Shape (centralized, ring, mesh)
- What needs resolving? → Coordination mechanism
- How is it resolved? → Algorithm and protocol
- What if it fails? → Reliability and recovery
- How fast must it work? → QoS constraints
Question 3 maps to six core mechanisms:
| Mechanism | Question answered |
|---|---|
| Selection | Multiple actors compete for a resource — who wins? |
| Agreement | Multiple participants must commit to the same decision — how do they agree? |
| Aggregation | Values are spread across nodes — how are they combined? |
| Assignment | Work must be distributed — who handles what? |
| Suppression | Operations may be duplicated — how are duplicates eliminated? |
| Convergence | State diverges across replicas — how does it reconcile? |
Chapters #
Part 1: Infrastructure Coordination #
- Coordination as a Hidden Component — the framework, the taxonomy, why no uniform protocol exists
- Membership — Knowing Who Is Alive — gossip protocols, Phi accrual failure detection, join and leave propagation
- Selection — Leader Election and Distributed Locks — Raft leader election, fencing tokens, etcd leases, why Redis locks are unsafe for correctness
- Agreement — Consensus, Quorum, and Transactions — Raft log replication, two-phase commit, the blocking problem, Saga as an alternative
- Assignment — Consistent Hashing and Work Distribution — hash rings, virtual nodes, Cassandra’s coordination model, rebalancing under membership change
- Ordering — Sequence Numbers and Total Order — Lamport clocks, vector clocks, total order broadcast, how etcd’s revision counter enables ordering
- Aggregation — Scatter-Gather and Distributed Query Execution — OpenSearch scatter-gather, Cassandra quorum read merge, CockroachDB DistSQL, Kafka Streams windowed aggregation
- Suppression — Exactly-Once at the Infrastructure Layer — Kafka idempotent producer, transactions, etcd watch revision guarantees, Raft log deduplication
Part 2: Application Coordination #
- Selection at Application Level — Reservations and Assignments — seat reservation (Ticketmaster), driver assignment (Uber), optimistic locking, two-phase reservation
- Agreement at Application Level — Sagas, Compensation, and Cross-Service Atomicity — Saga design for e-commerce and booking, compensation patterns, isolation gap, orchestration vs choreography
- Assignment at Application Level — Work Distribution and Routing — task queues, job scheduler exactly-one execution, web crawler domain assignment, database sharding
- Suppression — Idempotency and Deduplication — idempotency key design, database-level idempotency, Bloom filters, the outbox pattern
- Aggregation — Distributed Counting and Rate Limiting — Count-Min Sketch, HyperLogLog, token bucket, sliding window, distributed rate limiting
- Convergence — CRDTs and Collaborative Editing — state-based vs operation-based CRDTs, G-Counter, OR-Set, Operational Transforms, Google Docs
- Bridging Intent and Commitment — the throughput-correctness split, TTL, CAS, fencing, compensation, sequence numbers as bridges
- When Infrastructure, When Application, When Layered — correctness vs throughput decision framework, the recoverable vs unrecoverable error boundary