Role-Based Failure And Mitigation Phrase Sheet #

Status: Archive candidate. Keep as historical reference; prefer system-design-core-index.md and the core notes for day-to-day use.

Phrase-sheet derivative; not part of the primary framework.

Use this when you want to generate failure modes systematically instead of memorizing case-by-case lists.

The method:

  1. classify the component role
  2. run the role’s failure phrases
  3. instantiate them in domain language
  4. pair each with one mitigation phrase
  5. instantiate the mitigation in domain language
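
The five steps can be sketched as a small lookup-and-fill routine. This is a minimal illustration, not part of the original note: the names `ROLE_PHRASES`, `Finding`, and `generate` are hypothetical, only the Coordination row is included, and pairing the i-th failure phrase with the i-th mitigation phrase is an assumption of the sketch rather than a rule of the sheet.

```python
from dataclasses import dataclass
from typing import List

# Phrases copied from the Coordination row of the core table.
# Positional pairing (failure i <-> mitigation i) is an assumption of this sketch.
ROLE_PHRASES = {
    "Coordination": [
        ("granted twice", "granted conditionally"),
        ("stale owner accepted", "fenced by epoch"),
        ("expiry lagged", "expired by lease"),
        ("release missed", "reclaimed by sweeper"),
        ("progress stalled", "re-driven by timeout"),
    ],
}

TODO = "TODO: instantiate in domain language"


@dataclass
class Finding:
    component: str
    failure_phrase: str
    mitigation_phrase: str
    failure: str = TODO      # step 3: filled in by the designer
    mitigation: str = TODO   # step 5: filled in by the designer


def generate(component: str, role: str) -> List[Finding]:
    """Steps 1, 2 and 4: given a classified role, emit paired phrases."""
    return [Finding(component, f, m) for f, m in ROLE_PHRASES[role]]


findings = generate("HoldState", "Coordination")
# Steps 3 and 5 for the first pair, using the HoldState example from this sheet.
findings[0].failure = "two users get active holds on one seat"
findings[0].mitigation = "conditional insert or unique active-hold constraint"
```

The remaining four findings stay marked as TODO, which makes any phrase that has not yet been translated into domain language easy to spot.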

Core Table #

| Role | Failure Phrases | Mitigation Phrases | Example Archetypes |
| --- | --- | --- | --- |
| Truth | written twice, partially committed, concurrently overwritten, served stale, acknowledged before durable | deduplicated before write, committed atomically, guarded by version, read from authority, ack after durability | A01, A04, A06, A07, parts of A18 |
| Derived | propagated late, not propagated, applied twice, tombstone missed, rebuilt stale | lag bounded, replayed from truth, applied idempotently, tombstones propagated, rebuilt from authority | A05, A13, A15, caches, dashboards, indexes |
| Coordination | granted twice, stale owner accepted, expiry lagged, release missed, progress stalled | granted conditionally, fenced by epoch, expired by lease, reclaimed by sweeper, re-driven by timeout | A08, A09, A10, A12, I01, I02, I15 |
| Immutable | published incomplete, misreferenced, collected early, lost after publish, mutated after publish | published after validation, addressed by identity, collected after reachability, replicated before publish, made write-once | A18, I14, manifests, blobs, immutable versions |
| External Effect | fired twice, fired unrecorded, recorded unfired, retried after success, dependency stalled | receiver made idempotent, effect recorded durably, delivered from outbox, reconciled after uncertainty, timed out and retried safely | A07, A04, I12, webhooks, emails, payments |

Cross-Cutting Modifiers #

These are not component roles. They are recurring correctness shapes that cut across roles and supply the failure vocabulary that the role table alone does not capture cleanly.

Use them after the role phrases when the failure is mainly about time boundaries, accounting invariants, uncertainty, or classification rather than one component’s local behavior.

| Modifier | Failure Phrases | Mitigation Phrases |
| --- | --- | --- |
| Ordering / Boundary | accepted too late [expiry/close race], observed in the wrong order [reordering], resumed from the wrong point [watch gap/checkpoint gap], finalized before all prior work landed [premature finalize] | fenced by version or epoch [version/epoch fence], closed by explicit boundary rule [cutoff/finalize rule], resumed from durable cursor [durable cursor], sequenced before publish [ordered commit] |
| Conservation / Accounting | counted twice [double spend/double allocate], failed to release or subtract [leak/drift], admitted more than the budget allows [over-admit/oversell], aggregate no longer matches underlying truth [count drift] | conserved by atomic update [atomic conservation], reconciled against source truth [reconciliation], bounded by guarded decrement [guarded reserve/debit], rebuilt from authoritative ledger or inventory [authoritative recompute] |
| Ambiguity / Uncertainty | may have succeeded but not been observed [ack ambiguity], may have failed but left side effects [uncertain outcome], retried without knowing prior result [retry ambiguity] | moved into explicit uncertain state [pending/uncertain state], reconciled against external truth [provider/status reconciliation], made retry return the prior result [idempotent retry lookup] |
| Classification / Detection | declared alive when it is not [false liveness], declared dead when it is not [false death], conflict or duplicate was not detected [missed conflict], stale actor was treated as current [stale identity accepted] | judged by stronger signal [heartbeat quorum/fencing], delayed irreversible action until confidence rises [suspicion/confirmation window], detected by version/hash/natural key [conflict detector], rejected by current epoch or owner token [owner/epoch validation] |
| Representation Drift | control truth and applied truth disagree [truth-vs-applied drift], source and projection disagree [projection drift], manifest or pointer disagrees with content [pointer mismatch] | compared against source periodically [drift audit], versioned across both sides [version handshake], repaired from authoritative source [repair/rebuild from truth] |
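
As one concrete reading of the Conservation / Accounting row, "bounded by guarded decrement" can be sketched with a conditional `UPDATE` whose `WHERE` clause carries the budget check. The `inventory` table, the `ticket` SKU, and the `reserve` helper are assumptions made for illustration, and an in-memory SQLite database stands in for whatever store actually holds the truth.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE inventory (sku TEXT PRIMARY KEY, remaining INTEGER NOT NULL)"
)
db.execute("INSERT INTO inventory VALUES ('ticket', 3)")
db.commit()


def reserve(sku, qty):
    """Guarded decrement: subtract only if enough budget remains."""
    with db:  # commits on success, rolls back on exception
        cur = db.execute(
            "UPDATE inventory SET remaining = remaining - ? "
            "WHERE sku = ? AND remaining >= ?",
            (qty, sku, qty),
        )
    # Zero rows updated means the guard rejected the request.
    return cur.rowcount == 1


first = reserve("ticket", 2)    # budget 3 -> 1
second = reserve("ticket", 2)   # only 1 left, so the guard rejects it
remaining = db.execute(
    "SELECT remaining FROM inventory WHERE sku = 'ticket'"
).fetchone()[0]
```

Because the check and the subtraction happen in one statement, there is no window between reading the balance and writing it in which a second request could over-admit.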

How the Role Table and Modifiers Work Together #

Use both dimensions together:

  1. role tells you what kind of component is failing
  2. modifier tells you what correctness shape is being violated

Examples:

  • Coordination + accepted too late [expiry/close race]
    • confirm arrives after the hold expired
  • Truth + admitted more than the budget allows [over-admit/oversell]
    • inventory or token accounting exceeds the real limit
  • External Effect + may have succeeded but not been observed [ack ambiguity]
    • provider captured payment but callback or ack was lost
  • Coordination + declared dead when it is not [false death]
    • membership system ejects a healthy node during a network blip
  • Immutable + manifest or pointer disagrees with content [pointer mismatch]
    • namespace head or manifest points at the wrong blob
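
The first pairing (Coordination + accepted too late) can be sketched as a confirm path that applies an explicit boundary rule. The `Hold` class, the `confirm` function, and the choice that a confirm arriving exactly at `expires_at` is rejected are all assumptions of this sketch. Time is passed in as a parameter so the race is easy to reproduce.

```python
from dataclasses import dataclass


@dataclass
class Hold:
    seat_id: str
    expires_at: float      # seconds on whatever clock the owner of the hold uses
    status: str = "active"


def confirm(hold: Hold, now: float) -> bool:
    """Confirm a hold only if it is still active and strictly before expiry."""
    if hold.status != "active":
        return False
    # Explicit boundary rule (assumed here): expiry wins ties.
    if now >= hold.expires_at:
        hold.status = "expired"
        return False
    hold.status = "confirmed"
    return True


on_time = Hold("12A", expires_at=100.0)
late = Hold("12B", expires_at=100.0)

on_time_ok = confirm(on_time, now=99.9)
late_ok = confirm(late, now=100.0)   # arrives at the boundary, rejected
```

In a real system the status check and the status change would need to happen atomically against the authoritative store; this sketch only shows where the boundary rule sits.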

Example Instantiations #

HoldState in A08 #

  • role: Coordination
  • failure phrase: granted twice
  • mitigation phrase: granted conditionally

Instantiated:

  • failure: two users get active holds on one seat
  • mitigation: conditional insert or unique active-hold constraint
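
One way to realize the unique active-hold constraint is a partial unique index, sketched here with SQLite. The `holds` schema, the status values, and the `try_hold` helper are hypothetical; the source does not say which database A08 uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE holds (
        id      INTEGER PRIMARY KEY,
        seat_id TEXT NOT NULL,
        user_id TEXT NOT NULL,
        status  TEXT NOT NULL  -- 'active', 'released', or 'expired'
    );
    -- At most one row per seat may be 'active' at any time.
    CREATE UNIQUE INDEX one_active_hold_per_seat
        ON holds (seat_id) WHERE status = 'active';
    """
)


def try_hold(seat_id, user_id):
    """Attempt to grant a hold; the index rejects a second active hold."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO holds (seat_id, user_id, status) "
                "VALUES (?, ?, 'active')",
                (seat_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


alice_got_it = try_hold("12A", "alice")
bob_got_it = try_hold("12A", "bob")      # rejected while alice's hold is active

with conn:
    conn.execute("UPDATE holds SET status = 'released' WHERE user_id = 'alice'")
bob_retry = try_hold("12A", "bob")       # allowed after release
```

Putting the rule in the store rather than in application code means two concurrent requests cannot both pass a read-then-insert check.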

NamespaceState in A18 #

  • role: Truth
  • failure phrase: concurrently overwritten
  • mitigation phrase: guarded by version

Instantiated:

  • failure: two devices race to advance file head
  • mitigation: CAS on current version id / namespace version
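
A compare-and-set on the head version can be sketched in memory as follows. `NamespaceHead` and its method names are illustrative assumptions; a real store would typically do the comparison inside a conditional write rather than behind a process-local lock.

```python
import threading


class NamespaceHead:
    """Holds the current version id of one file and advances it by CAS."""

    def __init__(self, version_id):
        self._lock = threading.Lock()
        self.version_id = version_id

    def compare_and_set(self, expected, new):
        # The compare and the set happen under one lock, so they are atomic
        # with respect to other callers in this process.
        with self._lock:
            if self.version_id != expected:
                return False
            self.version_id = new
            return True


head = NamespaceHead("v1")

# Both devices read the same head before either writes.
seen_by_a = head.version_id
seen_by_b = head.version_id

a_won = head.compare_and_set(seen_by_a, "v2-from-a")
b_won = head.compare_and_set(seen_by_b, "v2-from-b")   # stale expectation
```

The losing device learns that its view was stale and can re-read the head, then merge or retry, instead of silently overwriting the winner.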

BlobState in A18 #

  • role: Immutable
  • failure phrase: published incomplete
  • mitigation phrase: published after validation

Instantiated:

  • failure: manifest/head points to missing chunk
  • mitigation: publish namespace head only after chunk existence check
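
A minimal sketch of publish-after-validation, assuming chunks are addressed by their SHA-256 hash and stored in a dictionary. `blob_store`, `namespace_heads`, `put_chunk`, `publish_head`, and the `/docs/report` path are hypothetical names chosen for the example.

```python
import hashlib


class MissingChunks(Exception):
    """Raised when a manifest references chunks the blob store does not hold."""


blob_store = {}        # chunk id (sha256 hex) -> bytes
namespace_heads = {}   # path -> tuple of chunk ids


def put_chunk(data: bytes) -> str:
    chunk_id = hashlib.sha256(data).hexdigest()
    blob_store[chunk_id] = data
    return chunk_id


def publish_head(path, manifest):
    """Advance the head only after every referenced chunk is known to exist."""
    missing = [c for c in manifest if c not in blob_store]
    if missing:
        raise MissingChunks(missing)
    namespace_heads[path] = tuple(manifest)


c1 = put_chunk(b"part one")
c2 = hashlib.sha256(b"part two").hexdigest()   # id computed, chunk not uploaded

try:
    publish_head("/docs/report", [c1, c2])
    rejected = False
except MissingChunks:
    rejected = True

put_chunk(b"part two")
publish_head("/docs/report", [c1, c2])          # now passes validation
```

The existence check alone does not stop a chunk from being collected between validation and publish; the table's "collected after reachability" phrase covers that separate failure.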

Projection row in A05 #

  • role: Derived
  • failure phrase: tombstone missed
  • mitigation phrase: tombstones propagated

Instantiated:

  • failure: deleted post still appears in feed
  • mitigation: explicit delete event propagation plus repair sweep
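
Delete propagation plus a repair sweep can be sketched with two dictionaries, one standing in for the truth and one for the feed projection. The event tuple shape and the function names are assumptions made for the example.

```python
posts = {"p1": "hello", "p2": "world"}   # truth
feed = dict(posts)                        # derived projection


def apply_event(event):
    """Apply one change event to the projection; deletes are idempotent."""
    kind, post_id, body = event
    if kind == "upsert":
        feed[post_id] = body
    elif kind == "delete":
        feed.pop(post_id, None)   # a repeated tombstone is harmless


def repair_sweep():
    """Remove projection rows whose source row no longer exists in truth."""
    stale = [pid for pid in feed if pid not in posts]
    for pid in stale:
        del feed[pid]
    return stale


# Normal path: the delete is propagated, and even delivered twice.
del posts["p1"]
apply_event(("delete", "p1", None))
apply_event(("delete", "p1", None))

# Failure path: the delete event for p2 is lost in transit.
del posts["p2"]
still_visible = "p2" in feed

repaired = repair_sweep()
```

The event path keeps the lag small in the common case, and the sweep bounds how long a missed tombstone can stay visible.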

Webhook delivery in I12 #

  • role: External Effect
  • failure phrase: retried after success
  • mitigation phrase: receiver made idempotent

Instantiated:

  • failure: same webhook processed twice after ack ambiguity
  • mitigation: idempotency key at receiver
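
A receiver-side idempotency key can be sketched as a lookup before the side effect. The key `evt_123`, the payload shape, and the in-memory `processed` dictionary are assumptions made for the example. In a durable system the key record and the effect would need to be committed together.

```python
processed = {}        # idempotency key -> result of the first processing
charges_applied = []  # stands in for the real side effect


def handle_webhook(idempotency_key, payload):
    """Process a webhook at most once per key; replays return the prior result."""
    if idempotency_key in processed:
        return processed[idempotency_key]
    charges_applied.append(payload["amount"])
    result = {"status": "applied", "amount": payload["amount"]}
    processed[idempotency_key] = result
    return result


# The sender never saw the first ack, so it retries with the same key.
first = handle_webhook("evt_123", {"amount": 500})
second = handle_webhook("evt_123", {"amount": 500})
```

Returning the stored result on replay also matches the "made retry return the prior result" mitigation in the Ambiguity / Uncertainty row.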

Fast Role Heuristics #

  • Truth

    • authoritative row/object/state machine
    • if corrupted, business truth is corrupted
  • Derived

    • projection, index, cache, dashboard, read model
    • rebuildable from truth
  • Coordination

    • lease, claim, leader, sequencer, assignment, scheduling ownership
    • decides who may act next
  • Immutable

    • version, blob, manifest, artifact, chunk, immutable segment
    • should never change after publish
  • External Effect

    • webhook, email, SMS, payment capture, third-party callback
    • side effect outside the core truth store

Short Form #

  • Truth -> write/commit/read/durability failures
  • Derived -> propagation/rebuild/visibility failures
  • Coordination -> ownership/order/expiry/progress failures
  • Immutable -> publish/reference/reachability failures
  • External Effect -> send/record/ack/retry failures

Usage Rule #

Do not start from the full archetype.

Start from the component role.

Then instantiate:

  • what would this phrase mean here?
  • what would the matching mitigation mean here?

This is faster and more generative than memorizing archetype-specific failure lists.