# Coordination as a Hidden Component
Draw a system design diagram for Ticketmaster. You will draw boxes for a load balancer, an API gateway, a database, a cache, a message queue, a search index. You will label them. But when the interviewer asks how two users are prevented from booking the same seat simultaneously, the answer involves a coordination mechanism that has no box in the diagram. It is implicit. It is assumed. It is handled somewhere.
This is the defining characteristic of distributed coordination: it is the component that every distributed system needs and no one explicitly names.
## Why There Is No Canonical Protocol
Every other component in a system design diagram has a canonical wire protocol that multiple implementations adopt:
- Message broker → Kafka wire protocol (Redpanda implements it independently)
- Search index → Elasticsearch REST API and Query DSL (OpenSearch, as a fork, speaks the same protocol)
- Database → PostgreSQL wire protocol (CockroachDB, Neon, RisingWave implement it)
Coordination has no equivalent. Each system implements coordination from scratch, under its own protocol:
| System | Coordination mechanism | Protocol |
|---|---|---|
| Kafka | KRaft — leader election, ISR | Proprietary |
| Cassandra | Gossip + Paxos variants | Proprietary |
| Kubernetes | etcd leases and watches | etcd gRPC |
| CockroachDB | Raft per range | Proprietary |
| Redis Cluster | Cluster bus gossip | Proprietary |
| ZooKeeper | ZAB (ZooKeeper Atomic Broadcast) | Proprietary |
The reason is structural. Unlike message brokers — which are external services consumed by applications — coordination is typically embedded infrastructure. Kafka does not call etcd for leader election; it implements KRaft internally. CockroachDB does not call ZooKeeper; it runs Raft per range internally. Every system builds coordination rather than consuming a standard protocol, because:
- Performance requirements vary too much — linearizable reads for etcd, high-throughput ordering for Kafka, per-range consensus for CockroachDB
- Coordination is a prerequisite, not a feature — it cannot be behind a network hop in the hot path
- There is no “implement the coordination protocol and get clients for free” incentive the way implementing the PostgreSQL wire protocol gives you all PostgreSQL tooling
## The Coordination Problem Appears Across System Design
Coordination is present — implicitly — in approximately 40% of canonical system design problems:
| System | Hidden coordination need |
|---|---|
| Ticketmaster | Seat hold — prevent double-booking during checkout |
| Online Auction | Bid acceptance — exactly one winner among concurrent bids |
| Job Scheduler | Leader election — only one node triggers each job |
| Google Docs | Collaborative editing — concurrent conflict resolution |
| Payment System | Idempotency — exactly-once execution across retries |
| Rate Limiter | Distributed counting — the problem is coordination itself |
| Web Crawler | URL frontier — distributed deduplication and assignment |
| Uber | Driver assignment — exactly one driver per ride |
| Robinhood | Order processing — exactly-once trade execution |
In every case, the standard answer is “use Redis for locking” or “use ZooKeeper” — a component name, never a protocol, never an analysis of why that mechanism is correct or what goes wrong if it is not.
## The Analytical Framework
Before selecting a mechanism, any coordination problem can be analyzed with six questions:
### 1. Who participates? → Membership
How many actors? Are they homogeneous (identical nodes) or heterogeneous (coordinator and workers)? Is membership static or dynamic (nodes join and leave)? What happens when members fail?
You cannot run consensus, assign quorums, or distribute work without knowing who is in the system. Membership is the substrate that all other coordination depends on.
### 2. What is the structure? → Shape
What is the topology of the coordination?
- Centralized: One coordinator, N participants (2PC, leader-follower)
- Ring: Ordered participant set (token ring, some consensus variants)
- Mesh: Every node talks to every other (gossip, Raft)
- Hierarchical: Coordinator delegates to sub-coordinators (Kafka consumer groups with a group coordinator)
Shape determines communication cost and single points of failure.
### 3. What needs resolving? → Coordination mechanism
The core question. One of six mechanisms:
| Mechanism | The problem |
|---|---|
| Selection | Multiple actors compete for a finite resource — who wins? |
| Agreement | Multiple participants must all commit or all abort — how do they agree? |
| Aggregation | Values are spread across nodes/users — how are they combined? |
| Assignment | Work must be distributed across participants — who handles what? |
| Suppression | Operations may be duplicated — how are duplicates eliminated? |
| Convergence | State diverges across replicas — how does it reconcile without central coordination? |
### 4. How is it resolved? → Algorithm and protocol
Given the mechanism, which algorithm? Raft for agreement, consistent hashing for assignment, HyperLogLog for approximate aggregation, CRDTs for convergence. The algorithm must satisfy the required properties of the mechanism.
### 5. What if it fails? → Reliability
What are the failure modes? Network partition, node crash, clock drift, GC pause. Every mechanism must state:
- Safety property: What must never happen (two leaders, double charge, split-brain)
- Liveness property: What must eventually happen (a leader is elected, payment commits)
FLP impossibility establishes that no deterministic consensus algorithm can guarantee both safety and liveness (termination) in an asynchronous system where even one process may crash. Every mechanism therefore makes an explicit tradeoff.
### 6. How fast must it work? → QoS
Latency, throughput, availability requirements. These constraints determine whether a single mechanism can serve both throughput and correctness needs, or whether layering is required (a fast approximate layer over a slow authoritative layer).
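The layering pattern can be sketched with a counter that is fast and approximate per node, and flushed in batches to a slow authoritative total. This is a minimal illustration (all names are mine); in practice the authoritative layer is a database or coordination service behind a network call.

```python
class LayeredCounter:
    """Sketch: fast approximate layer over a slow authoritative layer."""

    def __init__(self, flush_every: int = 100):
        self.local = 0           # fast path: per-node, no coordination
        self.authoritative = 0   # slow path: shared, correct (a DB in practice)
        self.flush_every = flush_every

    def increment(self) -> None:
        self.local += 1
        if self.local >= self.flush_every:
            self.flush()

    def flush(self) -> None:
        # In a real system this is a network call to the authoritative store.
        self.authoritative += self.local
        self.local = 0

    def approximate_total(self) -> int:
        # May lag the true total by up to flush_every - 1 unflushed increments.
        return self.authoritative + self.local

c = LayeredCounter(flush_every=100)
for _ in range(250):
    c.increment()
print(c.authoritative, c.approximate_total())  # 200 250
```

The flush interval is the QoS dial: a smaller interval tightens accuracy at the cost of more coordination traffic.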
## The Mechanism Taxonomy in Detail
### Selection
Selection resolves a race: multiple actors compete for a finite resource and exactly one must win.
Infrastructure manifestations:
- Raft leader election: one node among N becomes the leader for a term
- etcd distributed lock: one process among N holds the lock at any time
Application manifestations:
- Seat reservation: one user among N concurrent buyers holds a seat
- Driver assignment: one driver among N available is assigned to a ride
Key properties required:
- Exactly one winner — no split-brain
- Fencing — a stale winner (whose lease has expired) must be prevented from acting as though it still holds the resource
First-write-wins (FWW) and last-write-wins (LWW) are the two resolution policies. FWW is appropriate when the first actor to claim a resource should win (seat reservation). LWW is appropriate when the most recent write should win (document versioning).
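The fencing requirement can be sketched with a lock service that issues monotonically increasing tokens and a resource that rejects writes carrying a stale token. This is a toy (class names are illustrative; real systems use etcd or ZooKeeper for the lock side), but the invariant is the real one: the resource, not the lock, enforces fencing.

```python
class FencedLock:
    """Toy lock that issues monotonically increasing fencing tokens."""

    def __init__(self):
        self.token = 0                  # highest token ever issued
        self.holder: str | None = None

    def acquire(self, client: str) -> int:
        # In a real system this goes through a lock service with leases.
        self.token += 1
        self.holder = client
        return self.token

class Resource:
    """The protected resource rejects writes carrying a stale fencing token."""

    def __init__(self):
        self.highest_seen = 0

    def write(self, token: int) -> bool:
        if token < self.highest_seen:
            return False                # stale winner: a newer holder exists
        self.highest_seen = token
        return True

lock, resource = FencedLock(), Resource()
t1 = lock.acquire("client-1")   # client-1 wins, then stalls (e.g. a GC pause)
t2 = lock.acquire("client-2")   # lease expires; client-2 wins with a higher token
assert resource.write(t2) is True
assert resource.write(t1) is False   # client-1's stale token is fenced off
```

Without the token check, client-1 would wake from its pause and write as though it still held the lock — exactly the split-brain that fencing exists to prevent.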
### Agreement
Agreement resolves a commit decision: multiple participants must all commit or all abort an operation.
Infrastructure manifestations:
- Raft log replication: all replicas must agree on the same log entry at the same index
- Two-phase commit: all participants in a distributed transaction must agree to commit
Application manifestations:
- Payment + fulfillment: payment service and order service must both succeed or both roll back
- Booking + inventory: reservation and stock decrement must be atomic
Key properties required:
- Atomicity: all commit or all abort — no partial commits
- Durability: committed decisions survive crashes
Agreement is the mechanism most constrained by the CAP theorem. In the presence of a network partition, you must choose between consistency (wait for agreement, possibly block) and availability (proceed without agreement, risk divergence).
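The all-or-nothing structure of two-phase commit can be sketched in a few lines. This shows only the vote logic (participant classes and names are mine); real 2PC additionally requires durable logging of the prepared state, and blocks if the coordinator fails between phases — which is why it trades availability for atomicity.

```python
class Participant:
    """Toy 2PC participant: votes in prepare, then commits or aborts."""

    def __init__(self, vote_yes: bool):
        self.vote_yes, self.state = vote_yes, "pending"

    def prepare(self) -> bool:
        return self.vote_yes

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"

def two_phase_commit(participants) -> bool:
    """Commit only if every participant votes yes; otherwise abort everywhere."""
    # Phase 1: prepare — collect votes
    if all(p.prepare() for p in participants):
        # Phase 2: commit everywhere
        for p in participants:
            p.commit()
        return True
    # Any "no" vote aborts everywhere — no partial commits
    for p in participants:
        p.abort()
    return False

payment, order = Participant(True), Participant(False)
assert two_phase_commit([payment, order]) is False
assert payment.state == order.state == "aborted"   # atomicity: no half-done transaction
```

The key property is visible in the last assertion: even though the payment participant was willing to commit, the order participant's "no" vote forced both to abort.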
### Aggregation
Aggregation resolves a computation: values distributed across nodes must be combined into a single result.
Infrastructure manifestations:
- OpenSearch scatter-gather: partial results from each shard merged by the coordinating node
- Cassandra quorum read: responses from R replicas merged by the coordinator
Application manifestations:
- Rate limiter: request counts from all nodes combined to determine if limit is exceeded
- Ad click aggregator: click counts from all edge nodes summed for billing
- YouTube Top-K: view counts from all shards merged for leaderboard
Key properties:
- Exact or approximate: exact aggregation requires all values; approximate (HyperLogLog, Count-Min Sketch) is sufficient for many use cases
- Mergeable: partial results must be combinable — merging two HLL sketches is a union operation
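The mergeability property can be shown with exact distinct counting, where the partial state is a set and merge is union — the same shape as an HLL merge, except that a real HyperLogLog replaces the set with a fixed-size register array and accepts approximation. The shard data below is illustrative.

```python
def merge(a: set, b: set) -> set:
    """Merging partial distinct-count state is a union (exact analogue of HLL merge)."""
    return a | b

def distinct(counter: set) -> int:
    return len(counter)

# Per-shard partial results (illustrative data)
shard1 = {"u1", "u2", "u3"}
shard2 = {"u2", "u4"}
shard3 = {"u4", "u5"}

# Mergeable: combine in any order, any grouping — same answer
total = merge(merge(shard1, shard2), shard3)
assert total == merge(shard1, merge(shard2, shard3))
assert merge(shard1, shard2) == merge(shard2, shard1)
print(distinct(total))  # 5 distinct users across all shards
```

Mergeability is what makes scatter-gather possible: each shard computes independently and the coordinator only combines, never recomputes.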
### Assignment
Assignment resolves a distribution problem: work, data, or responsibility must be allocated across participants.
Infrastructure manifestations:
- Cassandra consistent hashing: keys assigned to nodes by position on hash ring
- Kafka partition assignment: partitions assigned to consumer group members
- etcd range assignment: key ranges assigned to nodes in distributed KV stores
Application manifestations:
- Job scheduler: tasks assigned to available workers
- Web crawler: URL domains assigned to crawler instances
Key properties:
- Stability: the same key should map to the same node across time (consistent hashing achieves this)
- Minimal disruption: when membership changes, only the minimum necessary work should move
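Both properties can be demonstrated with a minimal consistent-hash ring. This sketch omits virtual nodes (which real implementations use to even out load); node and key names are illustrative. The invariant to watch: when a node is added, the only keys that move are the ones the new node now owns.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes, illustrative only)."""

    def __init__(self, nodes):
        self.ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first node at or after the key's position
        h = self._hash(key)
        i = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[i][1]

keys = [f"key-{i}" for i in range(1000)]
before = {k: HashRing(["n1", "n2", "n3"]).node_for(k) for k in keys}
after = {k: HashRing(["n1", "n2", "n3", "n4"]).node_for(k) for k in keys}

moved = sum(before[k] != after[k] for k in keys)
# Minimal disruption: every key that moved, moved TO the new node
assert all(after[k] == "n4" for k in keys if before[k] != after[k])
print(f"{moved} of 1000 keys moved")
```

With naive modulo hashing (`hash(key) % node_count`), adding a node would remap roughly three quarters of the keys; here only the new node's arc moves.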
### Suppression
Suppression resolves a duplicate problem: an operation that may be executed multiple times must produce the same effect as executing once.
Infrastructure manifestations:
- Kafka idempotent producer: duplicate messages filtered using producer ID + sequence number
- etcd watch: each event has a unique revision; clients can replay from a revision without duplicate processing
Application manifestations:
- Payment idempotency: the same charge attempted twice should produce one charge
- Web crawler deduplication: a URL encountered multiple times should be crawled once
Key properties:
- Idempotency: the operation produces the same result regardless of how many times it executes
- Deduplication window: how long must the system remember that an operation was already processed?
### Convergence
Convergence resolves a divergence problem: replicas that accepted writes independently must eventually reach identical state without central coordination.
Infrastructure manifestations:
- Cassandra eventual consistency: replicas diverge during partition, converge via anti-entropy repair
- Redis Cluster gossip: node metadata propagates and converges across the cluster bus
Application manifestations:
- Google Docs: concurrent edits from multiple users merge into a consistent document
- Dropbox: file edits on multiple devices synchronize to the same final state
Key properties:
- Commutativity: merge(A, B) = merge(B, A) — order of merging does not matter
- Associativity: merge(merge(A, B), C) = merge(A, merge(B, C)) — grouping does not matter
- Idempotency: merge(A, A) = A — merging with the same state twice is safe
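All three properties hold for the simplest CRDT, a grow-only counter: each replica tracks a per-node count, and merge takes the element-wise maximum. The replica data below is illustrative.

```python
def merge(a: dict, b: dict) -> dict:
    """G-counter CRDT merge: per-node max. Commutative, associative, idempotent."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def value(counter: dict) -> int:
    return sum(counter.values())

# Replicas accept increments independently (keys are node ids)
replica_a = {"a": 3, "b": 1}   # a saw 3 local increments, 1 of b's
replica_b = {"a": 2, "b": 4}   # b is behind on a's count, ahead on its own
replica_c = {"a": 1, "c": 2}

assert merge(replica_a, replica_b) == merge(replica_b, replica_a)   # commutative
assert merge(merge(replica_a, replica_b), replica_c) == \
       merge(replica_a, merge(replica_b, replica_c))                # associative
assert merge(replica_a, replica_a) == replica_a                     # idempotent
print(value(merge(replica_a, replica_b)))  # 7 — all replicas converge to this total
```

Because of these three properties, replicas can exchange state in any order, over any gossip topology, with arbitrary duplication, and still converge — no coordinator required.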
## The Two Levels
The mechanisms operate at two levels that are structurally distinct:
Infrastructure coordination answers: how do nodes in a distributed system coordinate with each other?
- Small number of participants — tens or hundreds of nodes, not millions of concurrent operations
- Errors are unrecoverable — split-brain in leader election corrupts data permanently
- Correctness wins over throughput — etcd accepts write latency to guarantee linearizability
Application coordination answers: how do concurrent user requests resolve?
- Large number of participants (millions of concurrent user operations)
- Errors are often recoverable — a failed seat hold releases via TTL, a failed payment can be retried with an idempotency key
- Throughput wins for the fast path, correctness is enforced at the commitment layer
The same mechanism (Selection) appears at both levels — Raft leader election at the infrastructure level, seat reservation at the application level — but with different implementations, different guarantees, and different failure modes.
## How DDD and Event Storming Surface Coordination Concerns
Domain-Driven Design and Event Storming are discovery tools. They do not specify coordination mechanisms, but they surface the coordination problems that need them.
The mapping:
| Event Storming artifact | Coordination mechanism |
|---|---|
| Aggregate boundary | Agreement — atomic consistency within |
| Cross-aggregate command | Agreement (Saga) or Suppression (idempotency) |
| Competing commands on same aggregate | Selection |
| Hotspot with multiple concurrent actors | Selection or Agreement |
| Eventual consistency between bounded contexts | Convergence |
| Policy (when X, do Y) | Notification (watch) |
| Read model / projection | Aggregation |
| External system boundary | Suppression (idempotency at entry point) |
The coordination framework translates the “what” surfaced by Event Storming into the “how” required by implementation: the mechanism type, its required properties, and the infrastructure or application pattern that satisfies them.