Role-Based Failure And Mitigation Phrase Sheet
Status: Archive candidate. Keep as historical reference; prefer system-design-core-index.md and the core notes for day-to-day use.
Phrase-sheet derivative; not part of the primary framework.
Use this when you want to generate failure modes systematically instead of memorizing case-by-case lists.
The method:
- classify the component role
- run the role’s failure phrases
- instantiate them in domain language
- pair each with one mitigation phrase
- instantiate the mitigation in domain language
Core Table #
| Role | Failure Phrases | Mitigation Phrases | Example Archetypes |
|---|---|---|---|
| Truth | written twice, partially committed, concurrently overwritten, served stale, acknowledged before durable | deduplicated before write, committed atomically, guarded by version, read from authority, ack after durability | A01, A04, A06, A07, parts of A18 |
| Derived | propagated late, not propagated, applied twice, tombstone missed, rebuilt stale | lag bounded, replayed from truth, applied idempotently, tombstones propagated, rebuilt from authority | A05, A13, A15, caches, dashboards, indexes |
| Coordination | granted twice, stale owner accepted, expiry lagged, release missed, progress stalled | granted conditionally, fenced by epoch, expired by lease, reclaimed by sweeper, re-driven by timeout | A08, A09, A10, A12, I01, I02, I15 |
| Immutable | published incomplete, misreferenced, collected early, lost after publish, mutated after publish | published after validation, addressed by identity, collected after reachability, replicated before publish, made write-once | A18, I14, manifests, blobs, immutable versions |
| External Effect | fired twice, fired unrecorded, recorded unfired, retried after success, dependency stalled | receiver made idempotent, effect recorded durably, delivered from outbox, reconciled after uncertainty, timed out and retried safely | A07, A04, I12, webhooks, emails, payments |
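
The method can also be run mechanically. The following is a minimal sketch, assuming the role table is transcribed into a plain dictionary; `PHRASES` and `worksheet` are hypothetical names, not part of the framework. It lists every mitigation phrase for the role as a candidate rather than pairing by position, because the sheet's own webhook example pairs `retried after success` with `receiver made idempotent`, which is not a positional match.

```python
# Role phrase table transcribed from the sheet.
# Each role maps to (failure phrases, mitigation phrases).
PHRASES = {
    "Truth": (
        ["written twice", "partially committed", "concurrently overwritten",
         "served stale", "acknowledged before durable"],
        ["deduplicated before write", "committed atomically", "guarded by version",
         "read from authority", "ack after durability"],
    ),
    "Derived": (
        ["propagated late", "not propagated", "applied twice",
         "tombstone missed", "rebuilt stale"],
        ["lag bounded", "replayed from truth", "applied idempotently",
         "tombstones propagated", "rebuilt from authority"],
    ),
    "Coordination": (
        ["granted twice", "stale owner accepted", "expiry lagged",
         "release missed", "progress stalled"],
        ["granted conditionally", "fenced by epoch", "expired by lease",
         "reclaimed by sweeper", "re-driven by timeout"],
    ),
    "Immutable": (
        ["published incomplete", "misreferenced", "collected early",
         "lost after publish", "mutated after publish"],
        ["published after validation", "addressed by identity",
         "collected after reachability", "replicated before publish",
         "made write-once"],
    ),
    "External Effect": (
        ["fired twice", "fired unrecorded", "recorded unfired",
         "retried after success", "dependency stalled"],
        ["receiver made idempotent", "effect recorded durably",
         "delivered from outbox", "reconciled after uncertainty",
         "timed out and retried safely"],
    ),
}


def worksheet(role, component):
    """Return one prompt per failure phrase for a named component."""
    failures, mitigations = PHRASES[role]
    candidates = ", ".join(mitigations)
    return [
        f"{component}: what would '{failure}' mean here? "
        f"Candidate mitigations: {candidates}"
        for failure in failures
    ]


for line in worksheet("Coordination", "HoldState"):
    print(line)
```

Instantiating each prompt in domain language is still manual; the sketch only guarantees that no phrase for the role is skipped.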
Cross-Cutting Modifiers #
These are not component roles. They are recurring correctness shapes that cut across roles and supply the failure vocabulary the role table does not capture cleanly on its own.
Use them after the role phrases when the failure is mainly about time boundaries, accounting invariants, uncertainty, or classification rather than one component’s local behavior.
| Modifier | Failure Phrases | Mitigation Phrases |
|---|---|---|
| Ordering / Boundary | accepted too late [expiry/close race], observed in the wrong order [reordering], resumed from the wrong point [watch gap/checkpoint gap], finalized before all prior work landed [premature finalize] | fenced by version or epoch [version/epoch fence], closed by explicit boundary rule [cutoff/finalize rule], resumed from durable cursor [durable cursor], sequenced before publish [ordered commit] |
| Conservation / Accounting | counted twice [double spend/double allocate], failed to release or subtract [leak/drift], admitted more than the budget allows [over-admit/oversell], aggregate no longer matches underlying truth [count drift] | conserved by atomic update [atomic conservation], reconciled against source truth [reconciliation], bounded by guarded decrement [guarded reserve/debit], rebuilt from authoritative ledger or inventory [authoritative recompute] |
| Ambiguity / Uncertainty | may have succeeded but not been observed [ack ambiguity], may have failed but left side effects [uncertain outcome], retried without knowing prior result [retry ambiguity] | moved into explicit uncertain state [pending/uncertain state], reconciled against external truth [provider/status reconciliation], made retry return the prior result [idempotent retry lookup] |
| Classification / Detection | declared alive when it is not [false liveness], declared dead when it is not [false death], conflict or duplicate was not detected [missed conflict], stale actor was treated as current [stale identity accepted] | judged by stronger signal [heartbeat quorum/fencing], delayed irreversible action until confidence rises [suspicion/confirmation window], detected by version/hash/natural key [conflict detector], rejected by current epoch or owner token [owner/epoch validation] |
| Representation Drift | control truth and applied truth disagree [truth-vs-applied drift], source and projection disagree [projection drift], manifest or pointer disagrees with content [pointer mismatch] | compared against source periodically [drift audit], versioned across both sides [version handshake], repaired from authoritative source [repair/rebuild from truth] |
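
As one illustration of a modifier mitigation, `bounded by guarded decrement` can be sketched as a conditional `UPDATE` that only subtracts when enough budget remains. This is a sketch using Python's standard `sqlite3` module; the `inventory` table, the `ticket-ga` SKU, and the `reserve` helper are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (sku TEXT PRIMARY KEY, available INTEGER NOT NULL)"
)
conn.execute("INSERT INTO inventory VALUES ('ticket-ga', 3)")
conn.commit()


def reserve(sku, quantity):
    """Subtract quantity only if the remaining budget covers it."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE inventory SET available = available - ? "
            "WHERE sku = ? AND available >= ?",
            (quantity, sku, quantity),
        )
    # Zero rows updated means the guard rejected the request.
    return cur.rowcount == 1


results = [reserve("ticket-ga", 2), reserve("ticket-ga", 2), reserve("ticket-ga", 1)]
remaining = conn.execute(
    "SELECT available FROM inventory WHERE sku = 'ticket-ga'"
).fetchone()[0]
print(results, remaining)
```

The guard and the subtraction happen in one statement, so there is no gap between checking the budget and spending it.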
How the Role Table and Modifiers Work Together #
Use both dimensions together:
- `role` tells you what kind of component is failing
- `modifier` tells you what correctness shape is being violated
Examples:
- `Coordination` + `accepted too late` `[expiry/close race]`
  - confirm arrives after the hold expired
- `Truth` + `admitted more than the budget allows` `[over-admit/oversell]`
  - inventory or token accounting exceeds the real limit
- `External Effect` + `may have succeeded but not been observed` `[ack ambiguity]`
  - provider captured payment but callback or ack was lost
- `Coordination` + `declared dead when it is not` `[false death]`
  - membership system ejects a healthy node during a network blip
- `Immutable` + `manifest or pointer disagrees with content` `[pointer mismatch]`
  - namespace head or manifest points at the wrong blob
Example Instantiations #
HoldState in A08 #
- role: `Coordination`
- failure phrase: `granted twice`
- mitigation phrase: `granted conditionally`
Instantiated:
- failure: two users get active holds on one seat
- mitigation: conditional insert or unique active-hold constraint
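
One way to picture the unique active-hold constraint is a partial unique index, sketched here with Python's standard `sqlite3` module. The `holds` table, its columns, and the `try_hold` helper are assumptions for illustration; A08 may model holds differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holds ("
    "seat_id TEXT NOT NULL, user_id TEXT NOT NULL, status TEXT NOT NULL)"
)
# At most one row per seat may have status 'active'; expired or released
# rows for the same seat are not restricted.
conn.execute(
    "CREATE UNIQUE INDEX one_active_hold_per_seat "
    "ON holds (seat_id) WHERE status = 'active'"
)


def try_hold(seat_id, user_id):
    """Grant the hold only if no other active hold exists for the seat."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO holds (seat_id, user_id, status) "
                "VALUES (?, ?, 'active')",
                (seat_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_hold("12A", "alice")
second = try_hold("12A", "bob")
print(first, second)
```

The store, not the application, decides the race: the second insert fails regardless of how the two requests interleave.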
NamespaceState in A18 #
- role: `Truth`
- failure phrase: `concurrently overwritten`
- mitigation phrase: `guarded by version`
Instantiated:
- failure: two devices race to advance file head
- mitigation: CAS on current version id / namespace version
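
A compare-and-set on the version can be sketched as a conditional `UPDATE` that succeeds only if the caller's expected version is still current. The `heads` table, the path, and the `advance_head` helper below are hypothetical; the sketch uses Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE heads ("
    "path TEXT PRIMARY KEY, blob_id TEXT NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO heads VALUES ('/notes.txt', 'blob-1', 1)")
conn.commit()


def advance_head(path, expected_version, new_blob_id):
    """Move the head only if nobody else advanced it since we read it."""
    with conn:
        cur = conn.execute(
            "UPDATE heads SET blob_id = ?, version = version + 1 "
            "WHERE path = ? AND version = ?",
            (new_blob_id, path, expected_version),
        )
    return cur.rowcount == 1


# Both devices read version 1, then race to write.
device_a = advance_head("/notes.txt", 1, "blob-2")
device_b = advance_head("/notes.txt", 1, "blob-3")
head = conn.execute(
    "SELECT blob_id, version FROM heads WHERE path = '/notes.txt'"
).fetchone()
print(device_a, device_b, head)
```

The losing device sees a failed CAS and has to re-read the head and decide whether to merge, retry, or surface a conflict.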
BlobState in A18 #
- role: `Immutable`
- failure phrase: `published incomplete`
- mitigation phrase: `published after validation`
Instantiated:
- failure: manifest/head points to missing chunk
- mitigation: publish namespace head only after chunk existence check
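
The existence check before publish might look like the sketch below, which uses in-memory dictionaries in place of a chunk store and a namespace. `publish_head` and `MissingChunksError` are hypothetical names introduced for the example.

```python
class MissingChunksError(Exception):
    """Raised when a manifest references chunks that are not stored yet."""


def publish_head(chunk_store, heads, path, manifest):
    """Point the namespace head at a manifest only if every chunk exists."""
    missing = [chunk_id for chunk_id in manifest if chunk_id not in chunk_store]
    if missing:
        raise MissingChunksError(f"cannot publish {path}: missing {missing}")
    # The head becomes visible only after validation has passed.
    heads[path] = tuple(manifest)


chunk_store = {"c1": b"hello ", "c2": b"world"}
heads = {}

publish_head(chunk_store, heads, "/notes.txt", ["c1", "c2"])

try:
    publish_head(chunk_store, heads, "/draft.txt", ["c1", "c3"])
    rejected = False
except MissingChunksError:
    rejected = True

print(heads, rejected)
```

The check only covers `published incomplete`; it does not protect against chunks being collected after publish, which is a separate phrase in the `Immutable` row.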
Projection row in A05 #
- role: `Derived`
- failure phrase: `tombstone missed`
- mitigation phrase: `tombstones propagated`
Instantiated:
- failure: deleted post still appears in feed
- mitigation: explicit delete event propagation plus repair sweep
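
Delete propagation plus a repair sweep can be sketched with two in-memory structures: `posts` standing in for truth and `feed` for the projection. All names here are assumptions for illustration, and the lost event is simulated by simply not calling `apply_event`.

```python
posts = {"p1": "hello", "p2": "world", "p3": "again"}  # truth
feed = set(posts)  # derived projection: ids currently shown in the feed


def apply_event(kind, post_id):
    """Apply a change event to the projection; safe to apply twice."""
    if kind == "created":
        feed.add(post_id)
    elif kind == "deleted":
        feed.discard(post_id)  # no error if the id is already gone


def repair_sweep():
    """Remove feed entries whose post no longer exists in truth."""
    stale = sorted(post_id for post_id in feed if post_id not in posts)
    feed.difference_update(stale)
    return stale


# Normal path: the delete event arrives (here, twice).
del posts["p1"]
apply_event("deleted", "p1")
apply_event("deleted", "p1")

# Failure path: the post is deleted but its event is lost.
del posts["p2"]

repaired = repair_sweep()
print(repaired, feed)
```

The event path keeps lag low; the sweep bounds how long a missed tombstone can stay visible.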
Webhook delivery in I12 #
- role: `External Effect`
- failure phrase: `retried after success`
- mitigation phrase: `receiver made idempotent`
Instantiated:
- failure: same webhook processed twice after ack ambiguity
- mitigation: idempotency key at receiver
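
A receiver keyed on an idempotency key can be sketched as a lookup before the effect is applied. The `handle_webhook` function, the key `evt-123`, and the payload shape are hypothetical, and the in-memory dictionary stands in for what would need to be a durable store.

```python
processed = {}  # idempotency key -> result of the first successful handling
deliveries_applied = []  # stands in for the real side effect


def handle_webhook(idempotency_key, payload):
    """Apply the webhook once; a retry returns the prior result."""
    if idempotency_key in processed:
        return processed[idempotency_key]
    deliveries_applied.append(payload)
    result = {"status": "applied", "order_id": payload["order_id"]}
    processed[idempotency_key] = result
    return result


# The sender did not see the first ack, so it delivers again.
first = handle_webhook("evt-123", {"order_id": "o-1"})
retry = handle_webhook("evt-123", {"order_id": "o-1"})
print(first == retry, len(deliveries_applied))
```

In a real receiver the key and the effect would have to be recorded in the same transaction; otherwise a crash between the two reintroduces the duplicate.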
Fast Role Heuristics #
- `Truth`
  - authoritative row/object/state machine
  - if corrupted, business truth is corrupted
- `Derived`
  - projection, index, cache, dashboard, read model
  - rebuildable from truth
- `Coordination`
  - lease, claim, leader, sequencer, assignment, scheduling ownership
  - decides who may act next
- `Immutable`
  - version, blob, manifest, artifact, chunk, immutable segment
  - should never change after publish
- `External Effect`
  - webhook, email, SMS, payment capture, third-party callback
  - side effect outside the core truth store
Short Form #
- `Truth` -> write/commit/read/durability failures
- `Derived` -> propagation/rebuild/visibility failures
- `Coordination` -> ownership/order/expiry/progress failures
- `Immutable` -> publish/reference/reachability failures
- `External` -> send/record/ack/retry failures
Usage Rule #
Do not start from the full archetype.
Start from the component role.
Then instantiate:
- what would this phrase mean here?
- what would the matching mitigation mean here?
This is faster and more generative than memorizing archetype-specific failure lists.