A09 Failure Workbook #
This workbook turns the default A09 Future Constraint + Claimable Run failures into tiny TLA+ models.
Baseline:
Bug scenarios:
- A09DuplicateMaterializationBug.tla
- A09EarlyReleaseBug.tla
- A09StaleCompletionBug.tla
- A09DuplicateClaimBug.tla
- A09CheckpointAdvanceBug.tla
- A09MissingMaterializationBug.tla
Main insights #
1. Due release is not just a timer #
The bug is:
- work becomes runnable before
dueAt
What catches it:
NoEarlyRunnable
Lesson:
- a timer event is only a trigger
- authoritative state still has to check the due boundary
2. Materialization must mutate truth or progress #
The bug is:
- a logical run is made runnable, but the source schedule/progress still says it remains outstanding
What catches it:
MaterializedRunRemovedFromSchedule
Lesson:
- if due release is not coupled to durable progress, duplicate runnable creation is inevitable
3. Completion must be fenced by attempt epoch #
The bug is:
- completion is accepted from a stale attempt generation
What catches it:
CompletedUsesCurrentEpoch
Lesson:
- claimable execution is not safe with just “claimed/running” state
- completion must prove it belongs to the current attempt
4. Claim must remain exclusive #
The bug is:
- more than one worker can believe it owns the same runnable unit
What catches it:
AtMostOneClaimer
Lesson:
- claim is not just a state label
- it is an exclusivity contract
- if a second worker can still attach itself, duplicate execution is already possible
5. Checkpoint must lag durable output, not lead it #
The bug is:
- scanner advances progress before durable runnable creation
What catches it:
CheckpointAfterDurableMaterialization
Lesson:
- this is the core checkpoint discipline in
A09 - a due scanner behaves like a log consumer:
- produce durable output first
- advance progress marker second
6. Missing materialization is a liveness-shaped bug #
The bug is:
- due work remains scheduled even after the due boundary passes
What catches it:
NoOverdueScheduled
Lesson:
- in a tiny bounded model this appears as a safety-style invariant
- conceptually it is standing in for the real liveness requirement:
- overdue work must eventually be materialized