Transaction
transaction = a set of operations treated as one logical change
It answers:
what must happen together, or not at all?
Role in the catalog: the consistency boundary, elaborated — and the atomicity-domain factory. boundary.md listed the consistency boundary as type #10 and never developed it; this file is that development. And retry_idempotency.md CONSUMES atomicity domains (“put the marker inside one”); this block is where domains are MANUFACTURED. The homecomings are heavy:
single-object CAS        → retry_idempotency's conditional rung
idempotent transaction   → retry_idempotency, whole
transactional outbox     → the atomicity domain's flagship, at home there
saga                     → retry axis 4 (compensation ≠ undo) +
                           state_machine's compensation edges
read-only transaction    → snapshot.md, whole (the one-door read view)
ledger transaction       → log.md's ledger seat (append-time invariants)
2PC's Unknown            → state_machine's famous ignorance state;
                           quorum.md already wrote the Spanner cure
                           (make the coordinator immortal)
Central tension:
strong correctness and simple invariants
vs
contention, latency, availability, and coordination cost
Design Axes (the core module)
Axis 1 — The Invariant Scope (the structural cleave)
The doc’s first Big Question, promoted:
THE TRANSACTION BOUNDARY IS THE INVARIANT BOUNDARY.
a transaction exists to protect an invariant that spans multiple
operations; everything else is derived machinery.
This is shard.md’s axis 2 seen from the other side:
sharding asks: does the expensive invariant land inside a slice?
transactions are what you PAY when it doesn't.
Interrogation:
Name the invariant. (no invariant → no transaction needed → don't pay)
Does it span objects? shards? systems? (the answer picks axis 4's scope)
Could the DATA MODEL move the invariant inside one object/slice —
making the transaction trivial? (the cheapest transaction is the one
the schema made unnecessary)
Axis 2 — Conflict Strategy (genuinely native)
optimistic: execute first, validate at commit —
pay in retry storms under contention, plus the
body-effect trap (retry_idempotency: the abort-retry
loop re-executes the body; effects outside or guarded)
pessimistic: lock before mutation —
pay in deadlocks, convoys, priority inversion,
and lock-hold time that scales with the slowest thing
inside the critical section
Governed by one variable — expected contention:
low contention → optimistic (validation almost always passes;
no lock bookkeeping)
hot rows → locks... or neither: the star*'s recipes restructure
the invariant instead
Interrogation:
What is the actual conflict rate — measured, not vibed?
Optimistic: what's in the retry body besides database writes?
Pessimistic: what's the lock ordering discipline, and who audits it?
(deadlock = a cycle in lock acquisition; ordering makes cycles
impossible by construction — one-sided engineering, index_structures.md's lesson)
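The lock-ordering discipline can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: `Account`, `transfer`, and `run_opposing_transfers` are hypothetical names, and the global order (ascending `account_id`) is an assumption chosen for the example.

```python
import threading


class Account:
    def __init__(self, account_id: int, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src: Account, dst: Account, amount: int) -> bool:
    """Move `amount` from src to dst, taking both locks in one global order."""
    if src is dst:
        raise ValueError("source and destination must differ")
    # Always lock the lower account_id first. Every thread acquires in the
    # same order, so no cycle in lock acquisition can form.
    first, second = sorted((src, dst), key=lambda acct: acct.account_id)
    with first.lock, second.lock:
        if src.balance < amount:
            return False  # invariant: no negative balances
        src.balance -= amount
        dst.balance += amount
        return True


def run_opposing_transfers(rounds: int = 1000) -> int:
    """Two threads transfer in opposite directions; returns the total balance."""
    a, b = Account(1, 1000), Account(2, 1000)

    def worker(src: Account, dst: Account) -> None:
        for _ in range(rounds):
            transfer(src, dst, 1)

    threads = [
        threading.Thread(target=worker, args=(a, b)),
        threading.Thread(target=worker, args=(b, a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance + b.balance
```

If each thread instead locked `src` first and `dst` second, the two opposing transfers could each hold one lock and wait on the other. Sorting removes that possibility without any deadlock detector.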
Axis 3 — The Isolation Ladder (the eighth strength ladder)
Each rung buys anomaly immunity and pays for it in concurrency:
read uncommitted     sees the provisional (log.md's tail, leaking)
read committed       per-statement coherence; writes committed between
                     statements are visible
snapshot isolation   one-door view per transaction (snapshot.md),
                     with THE FAMOUS GAP, seated below
serializable         as if one-at-a-time; the anomalies are gone,
                     and so is some of your throughput
The gap: write skew. Two transactions, each preserving the invariant LOCALLY, jointly violating it — because neither wrote what the other read:
SI validates write-write conflicts;
the invariant lived in the READ-WRITE cross.
canonical form: two on-call doctors, each checks "≥2 on call,"
each removes themselves, both commit — zero on call.
"snapshot isolation ≠ serializability" is this gap, and SSI
(serializable SI) is the recipe: track the read-write edges too.
Interrogation:
Which rung does each transaction actually run at — per connection,
checked, or assumed from the default?
Does any invariant live in the read-write cross? (constraints checked
by SELECT then enforced by UPDATE elsewhere = write-skew bait;
materialize the constraint into a row both must write, or go SSI)
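The on-call example can be reproduced with a toy store. The sketch below assumes a deliberately simplified snapshot-isolation model (a private snapshot per transaction, first-committer-wins on write-write conflicts only). `SnapshotStore`, `Txn`, `go_off_call`, `race`, and the `on_call_count` row are hypothetical names for illustration.

```python
class SnapshotStore:
    """Toy snapshot isolation: commit rejects only write-write conflicts."""

    def __init__(self, data):
        self.data = dict(data)
        self.versions = {key: 0 for key in data}  # commit version of last write
        self.clock = 0

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.start = store.clock
        self.snapshot = dict(store.data)  # the one-door read view
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if any key we wrote changed after we began.
        for key in self.writes:
            if self.store.versions.get(key, 0) > self.start:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.clock
        return True


DOCTORS = ("alice", "bob")


def go_off_call(txn, doctor, materialize):
    on_call = [d for d in DOCTORS if txn.read(d)]
    if len(on_call) < 2:
        return  # the check each transaction makes locally
    txn.write(doctor, False)
    if materialize:
        # Materialized constraint: a row BOTH transactions must write.
        txn.write("on_call_count", len(on_call) - 1)


def race(materialize):
    store = SnapshotStore({"alice": True, "bob": True, "on_call_count": 2})
    t1, t2 = store.begin(), store.begin()
    go_off_call(t1, "alice", materialize)
    go_off_call(t2, "bob", materialize)
    outcomes = (t1.commit(), t2.commit())
    remaining = sum(1 for d in DOCTORS if store.data[d])
    return outcomes, remaining
```

Under these assumptions `race(False)` returns `((True, True), 0)`: both commits pass validation because the write sets are disjoint, and nobody is left on call. `race(True)` returns `((True, False), 1)`: the shared row turns the read-write dependency into a write-write conflict that the toy validator can see.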
Axis 4 — Commit Scope (the atomicity-domain factory)
retry_idempotency’s star, PRODUCER-side — where domains are built, in ascending price:
single object   CAS: the domain is the object itself
single node     WAL: the domain is one log (log.md's flush-then-ack)
multi-shard     2PC over consensus groups: Spanner MANUFACTURES a
                domain across shards by making every participant's
                vote and the decision immortal (quorum.md's cure for
                the Unknown state); the price is cross-shard latency
                and the coordinator round-trips
cross-system    NO DOMAIN SPANS IT, and saga is the honest admission:
                a sequence of local domains plus compensation, with
                intermediate states VISIBLE because atomicity was
                never on offer (retry axis 4: compensation is a
                forward action; the world saw the middle)
Interrogation:
Which scope does the invariant actually require? (paying multi-shard
prices for a single-slice invariant is shard.md's axis-2 failure,
inverted)
Bare 2PC: where does the decision live when the coordinator dies?
(nowhere = the Unknown state = the cautionary baseline)
Cross-system: is everyone honest that this is a saga — intermediate
states named, compensations written, idempotent, and tested?
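For the cheapest rung, single-object CAS, a minimal sketch follows. `VersionedCell` and `update` are hypothetical names; the internal lock only stands in for whatever makes the compare-and-set atomic inside a real store, and the monotonic version counter is an assumption chosen to sidestep ABA.

```python
import threading


class VersionedCell:
    """One object as its own atomicity domain."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()  # stand-in for the store's internal atomicity

    def read(self):
        with self._lock:
            return self._value, self._version

    def compare_and_set(self, expected_version, new_value):
        with self._lock:
            if self._version != expected_version:
                return False  # someone else committed first
            self._value = new_value
            self._version += 1  # monotonic, so a stale version never matches again
            return True


def update(cell, fn, max_attempts=10):
    """Read, compute, and conditionally write; retry on a lost race."""
    for _ in range(max_attempts):
        value, version = cell.read()
        if cell.compare_and_set(version, fn(value)):
            return True
    return False
```

A usage example: `update(cell, lambda v: v + 5)` on a cell holding `10` leaves it at `(15, 1)`, and a later `cell.compare_and_set(0, 99)` fails because version `0` is stale.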
Axis 5 — Durability and Visibility (two doors, both owned)
durable: the commit RECORD is one bit in a log — flush-before-ack
(log.md's ladder: "commit request ≠ durable commit" is
rung 1 vs rung 2/3)
visible: new state enters the world ATOMICALLY — snapshot.md's
one-door discipline: a version publish, never piecewise
(readers see pre-state or post-state; a reader seeing the
middle is torn visibility*)
Interrogation:
What is the commit record, and what fsync guards it?
How do readers cross from old to new — one pointer/version, or hope?
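Both doors can be shown together in a small sketch. It assumes a single writer, a newline-delimited JSON log, and an in-memory state that is never mutated after publication; `TinyStore`, `commit`, and `read_view` are hypothetical names.

```python
import json
import os


class TinyStore:
    def __init__(self, log_path):
        self.log_path = log_path
        # Readers take this ONE reference: (version, state). The state dict is
        # treated as immutable once published.
        self.published = (0, {})

    def commit(self, changes):
        version, state = self.published
        new_version = version + 1
        record = json.dumps({"version": new_version, "changes": changes})

        # Door 1, durability: the commit record reaches stable storage
        # before anyone is told the commit happened.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()             # Python buffer -> OS
            os.fsync(log.fileno())  # OS cache -> device

        # Door 2, visibility: build the whole post-state off to the side,
        # then publish it with a single reference assignment.
        new_state = {**state, **changes}
        self.published = (new_version, new_state)
        return new_version  # the ack

    def read_view(self):
        # A reader holds either the pre-state or the post-state, never a mix.
        return self.published
```

A reader that called `read_view()` before the commit keeps its old `(version, state)` pair untouched, which is the pre-state half of the one-door rule.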
Technical Bottleneck: The Hot Invariant*
every transaction system ultimately SERIALIZES conflicting access
to the same invariant. the invariant everyone touches — the balance,
the counter, the inventory count — becomes the serialization point
no isolation scheme fixes:
optimistic retries STORM on it; pessimistic locks CONVOY on it.
Essential, no general solution — and the recipes all restructure the INVARIANT rather than the transaction:
split it                     sharded counters: skew*'s salting recipe, applied
                             to invariants (N sub-counters, read-side sum)
commute it                   increment, not read-modify-write; operations that
                             commute don't conflict (the CRDT algebra's third
                             catalog appearance: replication axis 3,
                             retry's rung 1, here)
escrow it                    pre-partition the quantity into reservations
                             (inventory holds, TigerBeetle-style transfers):
                             the one big invariant becomes many small LOCAL ones
                             (capacity.md's reservation lease, as a concurrency
                             recipe)
serialize it, deliberately   a single writer owns the hot spot; the
                             contention point becomes a LOG, and logs are fast
                             at exactly this (log.md: one appender, total order,
                             no conflict detection needed at all)
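As one concrete instance, the "split it" recipe might look like the sketch below. `ShardedCounter` and `hammer` are hypothetical names, random shard selection is an assumption, and the read-side sum is not a point-in-time snapshot across shards.

```python
import random
import threading


class ShardedCounter:
    """One hot counter split into N sub-counters, summed on read."""

    def __init__(self, shards: int = 8) -> None:
        self._counts = [0] * shards
        self._locks = [threading.Lock() for _ in range(shards)]

    def increment(self, amount: int = 1) -> None:
        # Writers spread across shards, so they rarely contend on one lock.
        i = random.randrange(len(self._counts))
        with self._locks[i]:
            self._counts[i] += amount

    def value(self) -> int:
        # Read-side sum: each shard is read under its own lock, one at a time.
        total = 0
        for i, lock in enumerate(self._locks):
            with lock:
                total += self._counts[i]
        return total


def hammer(counter: ShardedCounter, threads: int, per_thread: int) -> int:
    """Run concurrent increments and return the final total."""

    def work() -> None:
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value()
```

The trade is visible in the shape of the code: writes get cheaper because they commute and land on different locks, while reads get more expensive and slightly weaker. An invariant such as "never below zero" would need escrow rather than plain sharding.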
The one-liner:
contention is not a bug in the transaction system.
it is the invariant's bill — payable in retries, locks, or redesign.
A strong design says explicitly:
the invariant, by name (axis 1),
the conflict strategy the measured contention justifies (axis 2),
the isolation rung, checked not assumed — and whether write skew
can reach the invariant (axis 3),
the commit scope the invariant requires and no more (axis 4),
the commit record and the one visibility door (axis 5),
and for the hot invariant: which restructuring pays its bill.
Transaction As Protocol (the crossing-point spec — keep)
Optimistic instantiation:
read at version (snapshot.md's coordinate)
track read/conflict ranges
buffer writes
commit validates no conflicting writes occurred
valid → assign commit version, publish (one door)
conflict → abort, retry the WHOLE body (effects outside — retry_idempotency)
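The optimistic steps above can be sketched as a toy in-memory store. This is an illustration under simplifying assumptions: reads are not multi-versioned (a stale read is caught at validation instead), conflict tracking is per key rather than per range, and `VersionedStore`, `OptimisticTxn`, and `run_transaction` are hypothetical names.

```python
class ConflictError(Exception):
    pass


class VersionedStore:
    def __init__(self):
        self.data = {}        # key -> value
        self.written_at = {}  # key -> commit version of the last write
        self.version = 0


class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.read_version = store.version  # the coordinate we read at
        self.read_set = set()              # tracked for validation
        self.writes = {}                   # buffered, invisible until commit

    def get(self, key, default=None):
        self.read_set.add(key)
        if key in self.writes:
            return self.writes[key]
        return self.store.data.get(key, default)

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validate: nothing we read was written after our read version.
        for key in self.read_set:
            if self.store.written_at.get(key, 0) > self.read_version:
                raise ConflictError(key)
        # Valid: assign a commit version and publish all writes together.
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.written_at[key] = self.store.version
        return self.store.version


def run_transaction(store, body, max_attempts=5):
    """Retry the WHOLE body on conflict; the body must have no outside effects."""
    for _ in range(max_attempts):
        txn = OptimisticTxn(store)
        result = body(txn)
        try:
            txn.commit()
            return result
        except ConflictError:
            continue
    raise ConflictError("retries exhausted")
```

Because `run_transaction` re-executes `body` from the top, anything the body does besides buffered `put` calls (sending mail, calling another service) happens once per attempt, which is the body-effect trap.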
2PC instantiation (the cautionary baseline):
coordinator: prepare? → participants durably PROMISE (state_machine:
the Prepared state is entered, and with a dead coordinator it is
the Unknown state)
coordinator records decision → participants learn → complete
recovery = consulting the decision's home
(Spanner's cure: the decision's home is a Paxos group — quorum.md)
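A toy version of the 2PC flow makes the role of the decision's home concrete. In this sketch the `decision_log` dict stands in for a durable, highly available decision record, and `crash_point` simulates a coordinator failure; all names (`Participant`, `two_phase_commit`, `recover`) are hypothetical.

```python
from enum import Enum


class State(Enum):
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = {}  # txid -> State (durable in a real system)

    def prepare(self, txid):
        if not self.will_vote_yes:
            self.state[txid] = State.ABORTED
            return False
        self.state[txid] = State.PREPARED  # the promise: can no longer decide alone
        return True

    def finish(self, txid, decision):
        self.state[txid] = State.COMMITTED if decision == "commit" else State.ABORTED

    def recover(self, txid, decision_log):
        # Recovery = consulting the decision's home.
        if self.state[txid] is State.PREPARED:
            decision = decision_log.get(txid)
            if decision is not None:
                self.finish(txid, decision)
            # else: still Prepared with no decision to read -> Unknown, must block
        return self.state[txid]


def two_phase_commit(txid, participants, decision_log, crash_point=None):
    votes = [p.prepare(txid) for p in participants]
    if crash_point == "after_prepare":
        return None  # coordinator died before recording any decision
    decision = "commit" if all(votes) else "abort"
    decision_log[txid] = decision  # record the decision BEFORE telling anyone
    if crash_point == "after_decision":
        return decision  # coordinator died before notifying participants
    for p in participants:
        p.finish(txid, decision)
    return decision
```

With `crash_point="after_prepare"` a participant's `recover` returns `State.PREPARED` indefinitely: nothing was ever written to the decision's home. With `crash_point="after_decision"` recovery completes, because the decision outlived the coordinator.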
Saga instantiation:
execute local transaction (a real domain, small)
record progress (checkpoint_replay's coordinate)
next step...
on failure: compensate completed steps — forward actions, idempotent,
in reverse order (retry axis 4; state_machine's compensation edges)
record final state (an absorbing one — state_machine's terminal test)
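The saga loop is short enough to sketch directly. `run_saga` and the step tuples are hypothetical; the sketch assumes each compensation is idempotent and cannot itself fail, which a real saga would have to handle with retries.

```python
def run_saga(steps, progress_log):
    """Run (name, action, compensation) steps; compensate in reverse on failure.

    Returns the final state: "completed" or "compensated".
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()  # one local transaction: a real domain, but a small one
        except Exception:
            progress_log.append(("failed", name))
            # Compensations are forward actions, applied newest-first.
            for done_name, done_compensate in reversed(completed):
                done_compensate()
                progress_log.append(("compensated", done_name))
            progress_log.append(("final", "compensated"))
            return "compensated"
        completed.append((name, compensate))
        progress_log.append(("done", name))  # checkpoint: progress is recorded
    progress_log.append(("final", "completed"))
    return "completed"


# Illustrative steps over a shared dict standing in for two systems.
world = {"reserved": False, "charged": False}


def reserve():
    world["reserved"] = True


def release():
    world["reserved"] = False  # set-style write: safe to apply twice


def charge():
    raise RuntimeError("card declined")


def refund():
    world["charged"] = False
```

Running `run_saga([("reserve", reserve, release), ("charge", charge, refund)], log)` returns `"compensated"`. Between the `done reserve` and `compensated reserve` log entries the reservation was visible to the rest of the world, which is the intermediate state a saga has to name rather than hide.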
Named Configurations (lookup table)
Vector = {invariant scope, strategy, isolation, commit scope, hot-spot exposure}. Rows marked → are owned elsewhere.
| Name | Vector | Canonical study object | Signature failure |
|---|---|---|---|
| Single-object txn → retry rung 2 | one object, CAS, —, object domain, per-key | etcd Txn; DynamoDB conditional | lost update without the condition; ABA on non-monotonic version |
| Multi-object txn | multi-key invariant, either, SI+, node or shard, the star’s home | FoundationDB model | write skew; conflict aborts; unknown outcome (ignorance*) |
| ACID (single node) | node-local, MVCC+locks, ladder rungs, WAL domain, — | Postgres MVCC + WAL | isolation rung assumed not checked; long txn blocks vacuum (snapshot axis 3); commit record not flushed |
| Optimistic | any, validate-at-commit, SI/SSI, —, retry storms on heat | FoundationDB | contention storms*; effects in the retry body; conflict ranges mis-scoped |
| Pessimistic | any, lock-first, 2PL, —, convoys on heat | SELECT FOR UPDATE; 2PL | deadlock (no ordering discipline); convoy*; priority inversion |
| 2PC (bare) | cross-participant, —, —, manufactured domain with a mortal coordinator, — | XA / 2PC as caution | coordinator dies in Prepared → Unknown; heuristic decisions break atomicity |
| Replicated distributed txn | cross-shard, pessimistic (2PL for read-write txns), serializable, 2PC over Paxos groups, timestamp machinery | Spanner + TrueTime | cross-shard latency; timestamp uncertainty windows; still the star* under heat |
| Saga → retry axis 4 | cross-SYSTEM, local domains + compensation, none globally, no spanning domain, — | Temporal saga | compensation fails/isn’t inverse; intermediate state seen (by design — say so) |
| Idempotent txn → retry_idempotency | operation identity, rung 4/5, —, marker inside the domain, — | Stripe keys around txns | (owned: marker placement, scope, expiry) |
| Read-only txn → snapshot.md | coherent multi-read, none, snapshot rung, no writes = no commit, — | Spanner RO; MVCC reads | (owned: stale-vs-latest expectations, GC’d versions) |
| Ledger txn → log.md ledger seat | balance invariant, append + check, —, the ledger’s own domain, escrow’s home | TigerBeetle transfers | (owned: unbalanced entry rejected at append; double spend; history edits) |
| Transactional outbox → retry’s flagship | state + intent, one DB commit, —, the DB transaction as domain, — | outbox + CDC | (owned: publish-twice means consumer dedupes; the domain ends at the DB) |
Vocabulary
invariant boundary, read set, write set, conflict range
optimistic, pessimistic, validate, lock ordering, deadlock, convoy
isolation, read committed, snapshot isolation, write skew, SSI, serializable
commit record, prepare, promise, decision, Unknown
coordinator, participant, heuristic outcome
atomicity domain, commit scope, cross-shard, cross-system
compensation, saga, intermediate state
hot row, contention, sharded counter, commutative, escrow, single writer
one door, version publish
Deep Lesson
Transaction bugs come from confusing pairs on different axes:
atomicity vs isolation (all-or-nothing ≠ nobody-sees-the-middle: axes 4 vs 3)
local transaction vs distributed (axis 4: domains have prices; know which you're paying)
commit request vs durable commit (axis 5: log.md's rungs — the ack certifies one)
rollback vs compensation (axis 4: undo exists inside a domain; forward actions outside)
retry vs safe retry (→ retry_idempotency: the body-effect trap)
snapshot isolation vs serializability (axis 3: write skew lives in the gap)
event publication vs database commit (→ the outbox: two domains, one honest bridge)
ledger entry vs mutable balance (→ log.md: facts are corrected by new facts)
Design procedure: name the invariant and try to shrink its scope in the schema first, measure contention before choosing a strategy, check the isolation rung and hunt the read-write cross, pay for exactly the commit scope the invariant requires, publish through one door with one durable bit — and when the invariant is hot, restructure it: split, commute, escrow, or hand it to a single writer, because the bill arrives either way. The named types are recognition shortcuts, not the design space.