A08 Failure Workbook #
This workbook turns the default A08 Time-Bounded Exclusive Allocation failure modes into tiny TLA+ bug models.
Use them to internalize:
- what the failure really is
- whether it is a
safetybug or alivenessbug - what invariant should catch it
- what control prevents it
Baseline #
Checked baseline:
This is the “good” model.
Failure scenarios #
1. Duplicate hold #
Files:
Bug:
- create hold does not check whether another active hold already exists
Invariant:
AtMostOneActiveHold
Insight:
- this is a pure
safetybug - the prevention is a guarded exclusive create / reservation
- if this fails, no amount of later repair can recover cleanly without compensation
2. Expired hold not released #
Files:
Bug:
- there is no expiry transition at all
Invariant:
NoPastDueActiveHold
Insight:
- this looks like a
safetyproperty in the tiny model, but in real systems it is mostly alivenessproblem - the real question is usually:
- expired holds must be
eventuallyreleasable
- expired holds must be
- that means scanners/reapers/checkpoint discipline matter
3. Confirm after expiry #
Files:
Bug:
- confirm ignores expiry
Invariant:
NoLateConfirm
Insight:
- this is the core time-boundary bug in
A08 - wall clock passing is not enough; commit path must check authoritative expiry truth
- prevention is the confirm guard on current hold state and expiry boundary
4. Cancel / confirm race #
Files:
Bug:
- confirm ignores the fact that the hold was already cancelled
Invariant:
CancelledHoldCanNeverConfirm
Insight:
- this is really a stale-transition bug
- prevention is a guarded state machine transition
- once a terminal or conflicting transition happens, later transitions must be rejected
5. Stale availability after hold expiry #
Files:
Bug model:
- truth changes first on expiry
- availability view catches up later
Invariant:
ViewMatchesTruthImmediately
Insight:
- this is intentionally too strong and should fail
- this scenario is not primarily a truth corruption bug
- it is a
derived-state lagbug - the real system rule is:
- browse may be stale
- confirm path must still use truth
Overall lessons #
A08 mixes safety and liveness #
Safety-heavy:
- duplicate hold
- confirm after expiry
- cancel/confirm race
Liveness-heavy:
- expired hold not released
Derived-state/refinement:
- stale availability after expiry
The main prevention controls #
- guarded exclusive hold creation
- guarded confirm transition
- guarded cancel/expire transition
- expiry worker or reaper
- commit path reads source truth, not stale browse view
Why this archetype is so useful in TLA+ #
Because the bugs are easy to explain badly in prose, but small models make them precise fast.