Skip to main content
  1. System Design Components/

A09 Failure Workbook #

This workbook turns the default A09 Future Constraint + Claimable Run failures into tiny TLA+ models.

Baseline:

Bug scenarios:

Main insights #

1. Due release is not just a timer #

The bug is:

  • work becomes runnable before dueAt

What catches it:

  • NoEarlyRunnable

Lesson:

  • a timer event is only a trigger
  • authoritative state still has to check the due boundary

2. Materialization must mutate truth or progress #

The bug is:

  • a logical run is made runnable, but the source schedule/progress still says it remains outstanding

What catches it:

  • MaterializedRunRemovedFromSchedule

Lesson:

  • if due release is not coupled to durable progress, duplicate runnable creation is inevitable

3. Completion must be fenced by attempt epoch #

The bug is:

  • completion is accepted from a stale attempt generation

What catches it:

  • CompletedUsesCurrentEpoch

Lesson:

  • claimable execution is not safe with just “claimed/running” state
  • completion must prove it belongs to the current attempt

4. Claim must remain exclusive #

The bug is:

  • more than one worker can believe it owns the same runnable unit

What catches it:

  • AtMostOneClaimer

Lesson:

  • claim is not just a state label
  • it is an exclusivity contract
  • if a second worker can still attach itself, duplicate execution is already possible

5. Checkpoint must lag durable output, not lead it #

The bug is:

  • scanner advances progress before durable runnable creation

What catches it:

  • CheckpointAfterDurableMaterialization

Lesson:

  • this is the core checkpoint discipline in A09
  • a due scanner behaves like a log consumer:
    • produce durable output first
    • advance progress marker second

6. Missing materialization is a liveness-shaped bug #

The bug is:

  • due work remains scheduled even after the due boundary passes

What catches it:

  • NoOverdueScheduled

Lesson:

  • in a tiny bounded model this appears as a safety-style invariant
  • conceptually it is standing in for the real liveness requirement:
    • overdue work must eventually be materialized