Pattern Zoo: Distributed Systems Design Patterns #
A complete taxonomy of patterns organized by the functional requirement they fulfill. Every distributed system is a composition of patterns from these eight categories. Discovered and validated through dry-runs on Web Crawler, YouTube Top K, Dropbox, and Uber.
5+1 Primitives #
All patterns below are compositions of six primitives. If you understand these, you can derive the rest.
| Primitive | What It Does |
|---|---|
| Append-only Log | Ordered, immutable sequence of records. The foundation of durability and replication. |
| State Machine | Explicit states + transitions. Converts ambiguous process into auditable, restartable computation. |
| Hash Partition | Distribute load across N nodes by key hash. Enables horizontal scale at the cost of cross-partition queries. |
| Replication | Copy state to N nodes. Buys fault tolerance and read throughput; sells consistency. |
| Compare-and-Swap | Conditional write: update iff current value matches expected. The universal concurrency primitive (Herlihy consensus number ∞). |
| Clock | Assign causal order to events. Logical (Lamport), vector, or hybrid (HLC). Everything distributed depends on time. |
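The Clock primitive can be made concrete with a minimal Lamport clock sketch (illustrative Python; class and method names are my own, not from the source):

```python
class LamportClock:
    """Logical clock: assigns a causal order to events without wall-clock time."""
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event (or send): advance the counter.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Merge rule: jump past the sender's timestamp, then tick.
        self.time = max(self.time, msg_time)
        return self.tick()

a, b = LamportClock(), LamportClock()
t_send = a.tick()           # a stamps an outgoing message: 1
t_recv = b.receive(t_send)  # b: max(0, 1) + 1 = 2 — receive is ordered after send
```

The invariant is that a receive always carries a larger timestamp than its send, which is exactly the causal ordering the table describes.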
Translation Layer: FR → Pattern #
Mapping a functional requirement to a pattern is not a lookup — it is a traversal of five discriminant questions. Earlier questions are more load-bearing: getting Q1 wrong produces a broken architecture; getting Q4 wrong merely produces performance problems.
FR
└── Q1 scope → narrows to a candidate cluster
└── Q2 failure → eliminates half the cluster
└── Q3 data → may collapse to a zero-cost solution
└── Q4 access → picks among read/write tradeoffs
└── Q5 coupling → finalizes event and sync boundary
Q1 — Coordination scope: where does contention happen? #
| Scope | Candidate patterns |
|---|---|
| Within a data structure | CAS, CRDT |
| Within a service (multiple instances) | Pessimistic Lock, Optimistic Lock, Lease |
| Across services | Saga (decomposable) / 2PC (not decomposable) |
| Across regions / async | Leaderless Replication, CRDT, Gossip |
Q2 — Failure model: what breaks? #
| Failure | Candidate patterns |
|---|---|
| Crash-stop | WAL + Checkpoint, Append-only Log, Idempotency Key |
| Network partition | Quorum, CRDT, Leaderless Replication, State Vector Sync |
| Slow degradation (not dead, just slow) | Circuit Breaker, Timeout, Bulkhead |
| Holder crashes while holding a resource | Lease (not Pessimistic Lock) |
Q3 — Data properties: what does the data allow? #
| Property | Implication |
|---|---|
| Commutative + associative operations | CRDT — coordination cost drops to zero |
| Immutable after write | Append-only Log, Event Sourcing — replay is safe |
| Content-addressable | Hash = natural idempotency key — dedup is free |
| Ordered by time or key | Range Partition, WAL, Windowing |
| Spatially structured | Spatial Partition |
Q3 can short-circuit the entire traversal. If the data is commutative, reach for CRDT before any lock-based pattern. If it is content-addressable, idempotency is solved at the storage layer without a separate idempotency table.
Q4 — Access pattern: how is it read and written? #
| Pattern | Implication |
|---|---|
| Read » Write | Cache-Aside, CQRS + Materialized View, Denormalization |
| Write » Read | Hash Partition, Leaderless Replication, Append-only Log |
| Read shape ≠ Write shape | CQRS (separate models) |
| Query is geospatial | Spatial Partition before any other read optimization |
| Query fans across shards | Scatter-Gather |
| Reads are time-bounded | Windowing, TTL, Temporal Decay |
Q5 — Coupling: how tightly must producer and consumer synchronize? #
| Coupling | Candidate patterns |
|---|---|
| Synchronous, atomic | 2PC, Pessimistic Lock, Timeout |
| Synchronous, best-effort | Retry + Backoff + Jitter, Circuit Breaker |
| Asynchronous, guaranteed delivery | Outbox + Relay, Message Queue |
| Asynchronous, source-of-truth is the DB row | CDC |
| Fan-out at write time | Fan-out on Write |
| Fan-out at read time | Fan-out on Read |
Worked example: shared document editing #
| Question | Answer | Elimination |
|---|---|---|
| Q1 scope | Multiple users, cross-region | Eliminates all lock-based patterns |
| Q2 failure | Network partition must not block editing | Eliminates OT (requires server round-trip); keeps CRDT |
| Q3 data | Sequence operations, not commutative by default | Requires sequence CRDT (YATA/RGA), not G-Counter |
| Q4 access | Each client holds full replica; reads are local | No read-path pattern needed |
| Q5 coupling | Async sync on reconnect | State Vector Sync for delta exchange |
Result: CRDT (YATA) + State Vector Sync. No locks, no 2PC, no Saga.
FR1 — Write Durable #
How do you ensure a write survives failure?
Append-only Log #
Every write is an append to an immutable, ordered log. Reads reconstruct state by replaying the log. Updates and deletes are new entries, not mutations.
- When: Event sourcing, audit trail, replication source, undo/redo
- Levers: Retention period, compaction policy, segment size
- Failure mode: Log grows unbounded without compaction; replay time increases with log depth
WAL + Checkpoint #
Write-Ahead Log: record intent before applying. Checkpoint: snapshot current state to bound replay cost. Recovery = last checkpoint + WAL replay.
- When: Any system needing crash recovery without full log replay (PostgreSQL, Flink, etcd)
- Levers: Checkpoint interval (shorter = faster recovery, more I/O); WAL sync mode (fsync vs group commit)
- Failure mode: Checkpoint too infrequent → long recovery; too frequent → write amplification
Event Sourcing #
Application state is never stored directly. Only events (facts) are stored. Current state = fold over event stream. Snapshots optionally truncate replay.
- When: Audit-critical domains (payments, orders, ledgers), CQRS read model rebuild, temporal queries
- Levers: Snapshot frequency; event schema versioning (schema evolution gap)
- Failure mode: Event schema changes break replay — requires upcasting or versioned handlers
FR2 — Coordinate Concurrency #
How do you prevent conflicting concurrent writes?
Pessimistic Lock #
Acquire exclusive lock before read-modify-write. Other writers block until lock released. Serializable by construction.
- When: Low-contention, short critical sections; correctness > throughput (inventory reservation, balance debit)
- Levers: Lock timeout; lock granularity (row vs table vs range)
- Failure mode: Deadlock (cycle in lock graph); lock timeout = cascading failure under load
Optimistic Lock (OCC) #
Read without locking. Write with version check: UPDATE ... WHERE version = $read_version. Retry on conflict. No locks held during think time.
- When: High-read, low-conflict workloads; long transactions; distributed systems where locks don’t compose
- Levers: Retry budget; backoff strategy; conflict rate (above roughly 10%, retry cost makes OCC slower than a pessimistic lock)
- Failure mode: Starvation under high contention — writers keep losing the version race
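The version-check loop can be sketched against an in-memory stand-in for the database row (names and the `write_if` helper are illustrative, assuming the `UPDATE ... WHERE version = $read_version` semantics above):

```python
class VersionedStore:
    """In-memory stand-in for a DB row with a version column."""
    def __init__(self, value):
        self.value, self.version = value, 0

    def read(self):
        return self.value, self.version

    def write_if(self, new_value, expected_version):
        # Emulates: UPDATE ... SET value = ?, version = version + 1 WHERE version = ?
        if self.version != expected_version:
            return False                      # lost the version race
        self.value, self.version = new_value, self.version + 1
        return True

def occ_update(store, fn, max_retries=5):
    """Read without locking, compute, write with a version check; retry on conflict."""
    for _ in range(max_retries):
        value, version = store.read()
        if store.write_if(fn(value), version):
            return True
    return False                              # retry budget exhausted (starvation)

store = VersionedStore(10)
occ_update(store, lambda v: v + 5)            # value becomes 15, version 1
```

Note that no lock is held between `read` and `write_if` — conflicts surface as a failed conditional write, not as blocking.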
CRDT (Conflict-free Replicated Data Type) #
Data structure with a merge function that is commutative, associative, and idempotent. Concurrent writes always merge without conflict. No coordination needed.
- When: Collaborative editing, distributed counters, shopping carts, presence systems
- Levers: CRDT type (G-Counter, LWW-Register, OR-Set, YATA sequence); tombstone GC policy
- Failure mode: Tombstone accumulation (deleted elements remain as metadata); eventual consistency means reads may lag
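As a concrete instance, a G-Counter's merge is just an element-wise max over per-node counts — a minimal sketch (illustrative class, not from the source):

```python
class GCounter:
    """Grow-only counter CRDT: each node increments its own slot; merge = max."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, n=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Commutative, associative, idempotent: safe in any order, any number of times.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
# both replicas converge to 5 regardless of merge order
```

Because merge is idempotent, re-delivering the same state (a common failure in gossip-based sync) cannot corrupt the count.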
Saga #
Long-running transaction decomposed into a sequence of local transactions with compensating actions. If any step fails, run compensations in reverse order.
- When: Distributed transactions spanning multiple services where 2PC is too expensive or unavailable
- Levers: Choreography (event-driven) vs orchestration (central coordinator); compensation idempotency
- Failure mode: Compensation failure (“double fault”); intermediate states visible to concurrent readers
Two-Phase Commit (2PC) #
Phase 1 (prepare): coordinator asks all participants to lock and vote yes/no. Phase 2 (commit/abort): coordinator broadcasts decision. All-or-nothing across participants.
- When: Cross-shard transactions requiring atomicity; distributed databases; XA transactions
- Levers: Coordinator failure recovery (persistent prepare log); participant timeout
- Failure mode: Coordinator crashes after prepare, before commit → participants blocked (“in-doubt” state) until coordinator recovers
Compare-and-Swap (CAS) #
Atomic conditional update: write new value only if current value equals expected. Foundation for all lock-free data structures and leader election.
- When: Driver status claim in dispatch, leader election, optimistic concurrency without version columns
- Levers: Retry on CAS failure; ABA problem mitigation (add version/stamp)
- Failure mode: ABA problem: value changes A→B→A; CAS succeeds but state has semantically changed
Lease #
A time-bound exclusive claim on a resource. The holder has exclusive access for the duration; the lease expires automatically if the holder crashes, releasing the resource without explicit unlock.
- When: Distributed locks (Redis SETNX + EXPIRE), leader election (etcd lease), driver presence detection, primary shard ownership, any resource that must be exclusively held but safely released on crash
- Levers: Lease duration (shorter = faster recovery on crash, more renewal overhead); renewal interval (typically lease_duration / 3); fencing token (monotonically increasing generation number to reject stale lease holders)
- Distinction from Pessimistic Lock: Pessimistic Lock has no expiry — a dead holder blocks forever. Lease has expiry — a dead holder is evicted automatically. Lease = Pessimistic Lock + TTL + fencing token.
- Failure mode: Clock skew between holder and lease store causes premature expiry — holder believes it still owns the lease but the store has already granted it to another. Fix: use fencing tokens on every resource access, not wall-clock time
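A minimal sketch of a lease store plus fencing-token enforcement (in-memory, single key; real systems use etcd or Redis — the `now` parameter exists only to make the example deterministic):

```python
import time

class LeaseStore:
    """Single-key lease with a monotonically increasing fencing token."""
    def __init__(self):
        self.holder = None
        self.expires_at = 0.0
        self.token = 0

    def acquire(self, client, duration, now=None):
        now = time.monotonic() if now is None else now
        if self.holder is not None and now < self.expires_at:
            return None                       # still held by a live holder
        self.token += 1                       # new generation fences the old holder
        self.holder, self.expires_at = client, now + duration
        return self.token

class Resource:
    """Rejects writes carrying a stale fencing token — no wall-clock reasoning."""
    def __init__(self):
        self.highest_token = 0
        self.data = None

    def write(self, token, value):
        if token < self.highest_token:
            return False                      # stale lease holder fenced off
        self.highest_token, self.data = token, value
        return True
```

The key point: the resource never checks time, only token monotonicity, which is why fencing survives clock skew.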
FR3 — Read Fast #
How do you serve reads with low latency at scale?
Cache-Aside (Lazy Loading) #
Application checks cache first. On miss, reads from DB, populates cache, returns result. Cache is a read-through acceleration layer.
- When: Read-heavy, write-tolerant workloads; cache can be stale briefly
- Levers: TTL; eviction policy (LRU, LFU); cache size
- Failure mode: Cache stampede on cold start or TTL expiry — many simultaneous misses flood the DB. Fix: probabilistic early expiry or request coalescing (single-flight)
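The read path and the probabilistic-expiry mitigation can be sketched together (illustrative names; jittering the TTL de-synchronizes mass expiry):

```python
import random
import time

class CacheAside:
    """Cache-aside with jittered TTL to spread out expiry times."""
    def __init__(self, loader, ttl=60.0, jitter=0.1):
        self.loader, self.ttl, self.jitter = loader, ttl, jitter
        self.store = {}   # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        hit = self.store.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]                         # cache hit
        value = self.loader(key)                  # miss: read from the DB
        ttl = self.ttl * (1 + random.uniform(-self.jitter, self.jitter))
        self.store[key] = (value, now + ttl)      # populate with jittered expiry
        return value

db_reads = []
def load_user(key):
    db_reads.append(key)          # stands in for the DB query
    return {"id": key}

cache = CacheAside(load_user)
cache.get("user:1", now=0.0)      # miss: hits the DB, populates cache
cache.get("user:1", now=1.0)      # hit: served from cache, no DB read
```

Full stampede protection would add request coalescing (single-flight) so concurrent misses share one DB read; that is omitted here for brevity.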
CQRS + Materialized View #
Command Query Responsibility Segregation: write path and read path use separate models. Read model is a pre-computed, denormalized view maintained by consuming the write event stream.
- When: Read shape differs from write shape; high read:write ratio; multiple read models from one write model
- Levers: View update latency (sync vs async); view rebuild strategy on schema change
- Failure mode: Read model lag under write burst; view rebuild cost proportional to full event log
Denormalization #
Duplicate data across entities to avoid joins at read time. Embed related data at write time rather than join at query time.
- When: Join-heavy queries on hot paths; NoSQL stores without join capability
- Levers: Which fields to embed (high-read, low-churn); update fan-out cost (every embed must update on source change)
- Failure mode: Stale embedded data if update fan-out fails or is skipped
Scatter-Gather #
Fan a query out to N shards in parallel. Each shard returns a partial result. Coordinator merges partial results and returns the top-K or aggregated answer.
- When: Distributed search (Elasticsearch), distributed sort, global aggregations across shards
- Levers: Timeout for slow shard responses; partial result tolerance; merge cost
- Failure mode: Slow shard blocks the response (“long tail latency”). Fix: hedged requests; timeout with best-effort partial result
Spatial Partition #
Partition entities by geographic proximity using a hierarchical cell encoding. Queries become prefix lookups or neighbor enumeration in the cell hierarchy rather than range scans over continuous coordinates.
- When: Driver matching, surge zone aggregation, geo-search, proximity queries, delivery ETAs
- Implementations: Geohash (base-32 prefix), H3 (hexagonal hierarchical grid, Uber), S2 (spherical cells, Google), QuadTree (adaptive, good for non-uniform density)
- Levers: Cell resolution (smaller = precise, more boundary effects; larger = coarser, simpler neighbors); neighbor enumeration depth
- Failure mode: Cell boundary artifacts — entities near a cell edge are in different cells but physically adjacent. Fix: always query target cell + all neighboring cells
FR4 — Scale Writes #
How do you distribute write load across nodes?
Hash Partition (Consistent Hashing) #
Assign each key to a node by hash(key) mod N or via a consistent hash ring. Writes for a key always go to the same node (or its replicas).
- When: Horizontally scaling any key-value or document store; Kafka topic partitioning; DynamoDB
- Levers: Number of vnodes (virtual nodes) on the ring; rebalancing strategy on node add/remove
- Failure mode: Hot partition — high-cardinality keys all hash to one node. Fix: composite key with random suffix; adaptive capacity (DynamoDB)
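A minimal consistent hash ring with virtual nodes (the hash function and vnode count are illustrative choices, not prescribed by the source):

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring: each node appears as many vnodes to smooth distribution."""
    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.ring = []            # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        # Adding a node only claims keys between its vnodes and their predecessors.
        for i in range(self.vnodes):
            self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    def lookup(self, key):
        # Walk clockwise to the first vnode at or after the key's hash.
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("user:42")    # deterministic: same key, same node
```

Unlike `hash(key) mod N`, adding a node remaps only the keys adjacent to the new vnodes, which is why rebalancing cost stays bounded.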
Range Partition #
Divide keyspace into contiguous ranges. Each range assigned to a node. Enables efficient range scans at the cost of potential hotspots at range boundaries.
- When: Time-series data (partition by time range), ordered data, range queries are primary access pattern
- Levers: Range split threshold; split strategy (automatic vs manual); range merge on low load
- Failure mode: Write hotspot on the “latest” range for time-series data — all writes go to the current time partition. Fix: write to multiple partitions with time bucketing; use hash partition for write path and range for read path
Leaderless Replication (Dynamo-style) #
No designated leader. Any replica accepts writes. Quorum reads/writes (W + R > N) ensure overlap. Anti-entropy via gossip or Merkle tree sync.
- When: High availability > consistency; geographically distributed writes; Cassandra, DynamoDB, Riak
- Levers: N (replication factor), W (write quorum), R (read quorum); read repair vs background anti-entropy
- Failure mode: Sloppy quorum during partition: W+R may not overlap with actual current data. Fix: read repair; hinted handoff with bounded staleness window
FR5 — Events Flow #
How do you decouple producers from consumers and propagate state changes?
Message Queue #
Producer enqueues messages; consumer dequeues and processes them. Delivery is typically at-least-once, so consumers pair the queue with idempotent processing. Queue provides buffering (the capacitor in the circuit analogy) and rate decoupling between producer and consumer.
- When: Work distribution, task offloading, rate smoothing between services
- Levers: Queue depth limit (backpressure); visibility timeout; dead-letter queue threshold
- Failure mode: Queue depth grows unbounded under sustained overload (capacitor overcharge). Fix: backpressure to producer; drop with DLQ; scale consumers
Pub/Sub #
Publisher emits events to a topic. Multiple subscribers each receive a copy. Fan-out is handled by the broker, not the publisher.
- When: Notification systems, event-driven microservices, real-time feeds
- Levers: Delivery guarantee (at-least-once vs exactly-once); subscriber filter expressions; retention period
- Failure mode: Slow subscriber blocks topic progress in some implementations. Fix: per-subscriber queue with independent offset tracking (Kafka model)
Outbox + Relay (Transactional Outbox) #
Write event to an outbox table in the same DB transaction as the business write. A relay process reads the outbox and publishes to the message broker. Guarantees at-least-once event publication without distributed transaction.
- When: Any service that must publish an event exactly when a DB write commits (payment confirmed → send email)
- Levers: Relay polling interval; outbox cleanup after ACK; relay idempotency key
- Failure mode: Relay falls behind under write burst → event delay. Fix: tail-based relay using WAL (CDC) instead of polling
CDC (Change Data Capture) #
Stream every row change from the DB write-ahead log to downstream consumers. No application-level outbox required.
- When: Real-time replication to read replicas, search index, cache invalidation, audit log
- Implementations: Debezium (Kafka Connector reading Postgres/MySQL WAL), DynamoDB Streams
- Levers: Lag tolerance; schema change handling in consumer; log retention on source DB
- Failure mode: Schema change on source table breaks CDC consumer — requires schema registry and versioned consumers
Fan-out on Write #
When an event occurs, immediately push it to all subscriber inboxes/feeds at write time. Read is O(1): just read your inbox.
- When: Social feeds with low follower counts; real-time notifications; Dropbox shared folder change notify
- Levers: Async vs sync fan-out; batch size; failure handling per recipient
- Failure mode: Celebrity problem — high-follower accounts make write fan-out O(followers) → write amplification. Fix: hybrid fan-out (fan-out on write for normal users, fan-out on read for celebrities)
Fan-out on Read (Pull on Read) #
Events stored once at the source. Each reader fetches and merges from all followed sources at read time. Write is O(1); read is O(sources).
- When: High-follower accounts; infrequently read feeds; storage-constrained systems
- Levers: Read cache TTL; merge strategy; number of sources per reader
- Failure mode: Read latency grows with number of followed sources. Fix: hybrid fan-out; pre-aggregated timeline cache with async refresh
FR6 — Tolerate Failure #
How do you make the system survive partial failures without cascading?
Idempotency Key #
Client generates a unique key per logical operation. Server stores key + result. On retry, server returns stored result instead of re-executing. Makes at-least-once delivery equivalent to exactly-once processing.
- When: Payment capture, order submission, any non-idempotent operation over unreliable network
- Levers: Key TTL; storage backend (Redis for speed, DB for durability); key scope (per-user vs global)
- Failure mode: Key collision if client reuses keys; key store becomes a hot path. Fix: UUIDv4 keys; async key cleanup
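The store-and-replay mechanic can be sketched with an in-memory key store (illustrative names; production would use Redis or a DB table with a TTL, as the levers note):

```python
import uuid

class IdempotentHandler:
    """Maps idempotency key -> result; replays return the stored result."""
    def __init__(self, operation):
        self.operation = operation
        self.results = {}   # in production: Redis or a DB table with TTL

    def handle(self, idempotency_key, *args):
        if idempotency_key in self.results:
            return self.results[idempotency_key]   # retry: no re-execution
        result = self.operation(*args)
        self.results[idempotency_key] = result     # record before acknowledging
        return result

charges = []
def charge(amount):
    charges.append(amount)        # side effect that must not repeat
    return f"charged {amount}"

handler = IdempotentHandler(charge)
key = str(uuid.uuid4())
handler.handle(key, 100)
handler.handle(key, 100)          # retried delivery: side effect runs once
```

This is what makes at-least-once delivery behave like exactly-once processing: the duplicate arrives, but the operation does not run again.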
Retry + Backoff + Jitter #
On transient failure, retry after a delay. Exponential backoff increases delay geometrically. Jitter adds randomness to prevent synchronized retry storms (underdamped oscillation → reconnect storm).
- When: Any network call, S3 upload, DB connection — the default failure-tolerance pattern
- Levers: Base delay, max delay, multiplier, jitter range, max retry count
- Failure mode: Without jitter: synchronized retries spike load exactly when the server is recovering. Without max delay: retries queue indefinitely and exhaust client resources.
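A sketch using the "full jitter" variant, where the delay is drawn uniformly from zero to the capped exponential bound (the `sleep` parameter is an illustrative hook so the demo does not actually wait):

```python
import random
import time

def retry_with_backoff(fn, max_retries=5, base=0.1, cap=10.0, sleep=time.sleep):
    """Exponential backoff with full jitter: delay = uniform(0, min(cap, base * 2^attempt))."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise                   # retry budget exhausted: surface the error
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
            sleep(delay)                # jitter de-synchronizes client retry storms

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient")
    return "ok"

result = retry_with_backoff(flaky, sleep=lambda _: None)   # succeeds on attempt 3
```

The `cap` lever is what prevents the "retries queue indefinitely" failure mode: delays grow geometrically but never past the ceiling.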
Circuit Breaker #
Track error rate over a sliding window. If error rate exceeds threshold, open the circuit: fail fast without calling the downstream. After a timeout, enter half-open state and probe with one request.
- When: Protecting against cascading failure when a dependency degrades; payment gateway isolation
- States: Closed (normal) → Open (fail fast) → Half-open (probe) → Closed
- Levers: Error rate threshold; window size; open duration; success count to close
- Failure mode: Miscalibrated threshold trips circuit on transient spike → legitimate traffic fails. Fix: use percentile-based (p99 latency) not raw error rate
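A compact sketch of the three-state machine, using a consecutive-failure count rather than the sliding-window error rate described above, to keep the example short (the `now` parameter makes it deterministic to test):

```python
import time

class CircuitBreaker:
    """Closed -> Open on repeated failures; Half-open probe after open_seconds."""
    def __init__(self, failure_threshold=3, open_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn, now=None):
        now = time.monotonic() if now is None else now
        if self.state == "open":
            if now - self.opened_at < self.open_seconds:
                raise RuntimeError("circuit open: failing fast")
            self.state = "half-open"          # timeout elapsed: allow one probe
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state, self.opened_at = "open", now   # trip (or re-trip) the circuit
            raise
        self.failures = 0
        self.state = "closed"                 # probe (or normal call) succeeded
        return result

cb = CircuitBreaker(failure_threshold=2, open_seconds=10.0)
```

While open, the breaker rejects calls without touching the downstream at all — that fast failure is what stops the cascade.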
Bulkhead #
Isolate resources (thread pools, connection pools, memory) by caller or service. Failure in one bulkhead doesn’t exhaust resources for others.
- When: Multi-tenant systems; high-priority vs low-priority traffic separation; protecting core path from batch jobs
- Pattern: Separate connection pools per downstream service; separate thread pools per tenant tier
- Failure mode: Under-provisioned bulkhead starves legitimate traffic. Fix: size bulkheads based on measured p99 concurrency, not peak
Timeout #
Every outbound call has a maximum wait time. Caller does not block indefinitely. Timed-out requests are abandoned and counted as errors.
- When: Every network call — the baseline fault isolation primitive
- Levers: Timeout value (must be less than the caller’s own timeout → timeout budget propagation)
- Failure mode: Timeout too long → slow calls hold resources, cascade. Timeout too short → false failures on slow-but-healthy responses. Fix: measure p99 latency, set timeout at ~2–3× p99
FR7 — Nodes Agree #
How do distributed nodes reach consensus or stay in sync?
Leader-Follower (Primary-Replica) #
One node (leader) accepts all writes. Followers replicate from the leader. Reads can go to followers (with staleness). Leader failure triggers election.
- When: Single-region databases; Kafka partition leadership; Redis Sentinel
- Levers: Replication mode (sync = no data loss, async = lower latency); election timeout; follower lag threshold
- Failure mode: Split-brain — both old and new leader accept writes. Fix: fencing token + quorum acknowledgment before stepping down
Quorum #
A write is durable once W of N nodes acknowledge. A read fetches from R of N nodes. Overlap (W + R > N) guarantees at least one node has the latest version.
- When: Leaderless replication (Cassandra, DynamoDB), Raft log commit, distributed consensus
- Levers: N, W, R values; tunable consistency (QUORUM vs ONE vs ALL)
- Failure mode: Network partition: if partition isolates W nodes on one side and R nodes on the other, quorum may succeed on both sides independently → split-brain
Gossip Protocol #
Nodes periodically exchange state with random peers. Information spreads exponentially (like an epidemic). Convergence time = O(log N).
- When: Membership management (Cassandra ring membership), failure detection, configuration propagation, distributed counters
- Levers: Fanout (peers per round); gossip interval; anti-entropy via Merkle tree comparison
- Failure mode: Slow convergence under high churn; “false negative” failure detection if gossip packets drop. Fix: suspicion mechanism before declaring node dead
State Vector Sync (Version Vectors) #
Each node maintains a vector of (node_id → sequence_number) representing the latest event seen from each node. Two nodes compare vectors to identify what each is missing and exchange only the delta.
- When: Distributed sync (Dropbox, CRDTs, Dynamo-style conflict detection), collaborative editing
- Levers: Vector clock vs hybrid logical clock; delta encoding for large vectors
- Failure mode: Vector size grows with number of nodes; stale vectors after node removal leave dangling entries. Fix: dotted version vectors; periodic compaction
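The vector comparison itself is a few lines — a sketch of computing the delta each side is missing (plain dicts stand in for the per-node vectors):

```python
def missing_from(local, remote):
    """Events the remote has seen that the local node lacks, per origin node.

    Returns {node: (local_seq, remote_seq)} for every node where remote is ahead;
    the local node then requests events local_seq+1 .. remote_seq from that origin.
    """
    return {node: (local.get(node, 0), seq)
            for node, seq in remote.items()
            if seq > local.get(node, 0)}

a = {"n1": 5, "n2": 2}
b = {"n1": 3, "n2": 4, "n3": 1}
missing_from(a, b)   # a lacks n2 events 3..4 and all of n3
missing_from(b, a)   # b lacks n1 events 4..5
```

Each side sends only the ranges the other reports missing — this is the "exchange only the delta" step.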
Merkle Tree #
Hash tree where each leaf is a block of data and each internal node is a hash of its children. Two nodes compare root hashes; if equal, data is identical. If different, binary search down the tree to find divergent ranges.
- When: Efficient anti-entropy between replicas (Cassandra, Dynamo), blockchain integrity, Git objects, S3 multi-part checksum
- Levers: Tree depth (log₂ N levels); leaf block size; rehash cost on update
- Failure mode: Hot write path invalidates Merkle tree root on every write → expensive rehash. Fix: batch updates before rehashing; async background Merkle tree rebuild
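A sketch of the build-and-compare step. For brevity it compares the leaf level directly once the roots differ, where production anti-entropy would descend the tree level by level to avoid shipping all leaf hashes:

```python
import hashlib

def h(data):
    return hashlib.sha256(data).hexdigest()

def merkle_levels(blocks):
    """Build the tree bottom-up; returns all levels, leaves first, root last."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        pairs = [level[i:i + 2] for i in range(0, len(level), 2)]
        level = [h("".join(p).encode()) for p in pairs]
        levels.append(level)
    return levels

def diverging_leaves(blocks_a, blocks_b):
    """Root check first; if roots differ, return indices of divergent blocks."""
    la, lb = merkle_levels(blocks_a), merkle_levels(blocks_b)
    if la[-1] == lb[-1]:
        return []                 # roots match: replicas are identical, zero transfer
    return [i for i, (x, y) in enumerate(zip(la[0], lb[0])) if x != y]
```

The payoff is in the equal case: two replicas confirm they are in sync by exchanging one hash, regardless of data size.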
FR8 — Time and Approximation #
How do you reason about time, handle late data, and approximate at scale?
Windowing #
Divide an infinite event stream into finite, bounded chunks for aggregation. Three types: tumbling (non-overlapping fixed intervals), sliding (overlapping intervals), session (gap-based).
- When: Metrics aggregation, rate limiting, analytics, surge demand calculation
- Levers: Window size; slide interval (sliding); session gap timeout; allowed lateness
- Failure mode: Late events arrive after window closes → missed from aggregation. Fix: allowed lateness with watermark; side output for late events
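The tumbling case reduces to integer division on the timestamp — a sketch over a finite batch of `(timestamp, value)` events:

```python
from collections import defaultdict

def tumbling_window(events, size):
    """Assign each (timestamp, value) event to a fixed, non-overlapping window."""
    windows = defaultdict(list)
    for ts, value in events:
        window_start = (ts // size) * size   # e.g. size=60: windows 0-59, 60-119, ...
        windows[window_start].append(value)
    return {start: sum(vals) for start, vals in sorted(windows.items())}

events = [(5, 1), (30, 2), (61, 4), (119, 8), (120, 16)]
tumbling_window(events, size=60)   # {0: 3, 60: 12, 120: 16}
```

Sliding windows assign each event to every window whose interval covers it, and session windows close on a gap rather than a fixed boundary; both complicate the assignment step but keep the same aggregate-per-window shape.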
TTL (Time-to-Live) #
Entries expire automatically after a fixed duration. Expiry is handled by the store (Redis, DynamoDB, DNS TTL) without application logic.
- When: Cache expiry, session invalidation, driver presence detection, version retention cleanup, DNS caching
- Levers: TTL duration; whether expiry is lazy (on access) or eager (background sweep)
- Failure mode: TTL too short → excessive cache misses. TTL too long → stale data served. Jitter on TTL prevents cache stampede (add ±10% randomness).
Approximate Counting #
Use probabilistic data structures instead of exact counts when memory or coordination cost of exact counting is prohibitive.
- Structures: Count-Min Sketch (frequency estimation, O(ε) error), HyperLogLog (cardinality estimation, ~1.6% error, ~1.5KB for 10⁹ elements), Bloom Filter (membership test, no false negatives)
- When: Top-K trending, unique visitor counts, spam filter membership, cache existence check before DB query
- Levers: Error tolerance ε; confidence level δ; hash function count
- Failure mode: Count-Min saturation — heavy hitters pollute frequency estimates for rare items. Fix: Count-Min with conservative update; separate heavy-hitter tracking
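The Bloom filter is the simplest of the three to sketch end-to-end (sizes and hash construction here are illustrative):

```python
import hashlib

class BloomFilter:
    """Membership test with possible false positives but no false negatives."""
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0            # a Python int doubles as an arbitrary-size bit array

    def _positions(self, item):
        # k independent positions derived by salting the hash input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False => definitely absent; True => probably present.
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
bf.add("bob")
```

The asymmetry is the whole point of the "cache existence check" use case: a negative answer lets you skip the DB query with certainty; a positive answer still requires the real lookup.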
Staged Rollout #
Deploy a change to a small percentage of users/servers, monitor, expand incrementally. Limits blast radius of bugs.
- When: Feature releases, infrastructure migrations, ML model updates, client app releases
- Levers: Rollout percentage schedule; metric thresholds for automatic pause/rollback; canary population selection
- Failure mode: Canary population is unrepresentative (e.g., internal users only) → issue missed until full rollout. Fix: random sampling from production traffic; monitor tail latency not just error rate
Scheduled Trigger #
A periodic job that runs at fixed intervals to perform maintenance, aggregation, or cleanup that would be too expensive inline.
- When: Leaderboard snapshot, billing cycle, report generation, cache warming, anti-entropy reconciliation
- Levers: Schedule frequency; idempotency of job (must be safe to re-run); job overlap prevention (distributed lock)
- Failure mode: Job overlap when previous run exceeds schedule interval. Fix: distributed lock with TTL; skip if lock not acquired; alert on overlap
Temporal Decay #
Score or weight that decreases automatically as a function of time elapsed since the event. Ensures recent events dominate rankings without requiring explicit deletion of old entries.
- When: Trending topics (Reddit hot sort), recommendation recency weighting, fraud signal decay, YouTube Top K video ranking
- Formula examples: Reddit: score / (age_hours + 2)^gravity; Exponential: score × e^(-λt)
- Levers: Decay rate λ; time granularity; floor value (minimum score to prevent numerical underflow)
- Failure mode: Decay too fast → viral content disappears before it surfaces. Decay too slow → old viral content permanently occupies top-K slots. Fix: tune λ against measured content half-life in your domain
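Both formulas above are one-liners; the default `gravity` and `λ` values here are illustrative, not tuned:

```python
import math

def reddit_hot(score, age_hours, gravity=1.8):
    """Reddit-style decay: newer items need far fewer votes to outrank old ones."""
    return score / (age_hours + 2) ** gravity

def exponential_decay(score, t, lam=0.1):
    """Exponential decay: the score halves every ln(2)/lam time units."""
    return score * math.exp(-lam * t)

# A fresh item with 100 points outranks a day-old item with 1000:
reddit_hot(100, 1) > reddit_hot(1000, 24)   # recency dominates raw score
```

Tuning reduces to picking a half-life: for exponential decay it is ln(2)/λ, which is the quantity the failure-mode note says to match against measured content half-life.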
Composition Rules #
Serial Composition #
Patterns chain output-to-input: Outbox → Message Queue → CDC → Materialized View. Each adds a guarantee or capability. Latency adds; throughput is bounded by the slowest stage.
Parallel Composition #
Patterns run side-by-side: Hash Partition applied simultaneously to write path and read path. Throughput multiplies; correctness requires each path to be independently consistent.
XOR Composition #
Choose one pattern or another based on a condition: Fan-out on Write for normal users, Fan-out on Read for celebrities. The hybrid is a selector, not a combination.
Known Gaps #
| Gap | Why Out of Scope |
|---|---|
| Schema Evolution / Expand-Contract | Needed for live schema migrations; domain-specific to relational stores |
| Spatial Partition — 3D / indoor | H3/S2 handle Earth surface; 3D (drone routing, indoor navigation) requires different primitives |
| Byzantine Fault Tolerance (BFT) | Required for blockchain/consensus under adversarial nodes; out of scope for consumer distributed systems |