Snapshot / Read View #
snapshot = a coherent version of state
read view = the rules deciding what a reader may see
It answers:
which state is visible to THIS read, while everything moves?
Role in the catalog: the visibility protocol — the reader’s side of the story. checkpoint_replay.md owns how state is captured; GC owns when it dies; this block owns what a reader is ENTITLED TO SEE in between. Every other file’s pin registry, publication protocol, and freshness ladder silently assumed a reader holding a coherent view; this is the block that defines one. It closes a symmetry the catalog has had half of all along: checkpoint_replay produces the coordinate; this block consumes it.
Central tension:
coherent reads and time travel vs storage, metadata, and GC cost
Design Axes (the core module) #
Axis 1 — The Visibility Coordinate (the structural cleave) #
What SINGLE VALUE determines what this read sees?
read timestamp / version       MVCC, FDB read version, Spanner timestamp
manifest / snapshot pointer    Iceberg, Delta, OCI image manifest
segment set                    Lucene IndexReader over committed segments
log index                      Raft snapshot's last-included index
config version + nonce         xDS (checkpoint_replay's control-plane row)
projection offset              read models (materialized.md)
This is checkpoint_replay.md’s binding coordinate*, READ-SIDE — the same coordinate, consumed instead of produced. And the deep-lesson row lives here:
a read timestamp is a POSITION IN A VERSION ORDER,
not a moment on anyone's wall clock.
Interrogation:
Name the coordinate. One value? (if a view needs two independently-
fetched values, the star* is already loose)
Who assigns it — the reader, a coordinator, a pointer swap?
What version order is it a position in, and who arbitrates that order?
(usually: log.md's ladder, or quorum.md's commit index)
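A minimal sketch of a coordinate as a position in a version order, assuming a single in-process commit counter plays the arbiter. `VersionedStore`, `commit`, `read_version`, and `get` are hypothetical names for illustration, not any real system's API.

```python
from bisect import bisect_right


class VersionedStore:
    """Toy store: every commit gets the next position in one version order."""

    def __init__(self):
        self._commit_seq = 0   # the arbiter of the version order (assumed single-node)
        self._history = {}     # key -> (ascending commit versions, values)

    def commit(self, writes):
        """Apply all writes at one new version and return that version."""
        self._commit_seq += 1
        for key, value in writes.items():
            versions, values = self._history.setdefault(key, ([], []))
            versions.append(self._commit_seq)
            values.append(value)
        return self._commit_seq

    def read_version(self):
        """The single value a reader fetches: a position, not a wall-clock time."""
        return self._commit_seq

    def get(self, key, at):
        """Return the newest value committed at or before position `at`."""
        versions, values = self._history.get(key, ([], []))
        i = bisect_right(versions, at)
        return values[i - 1] if i else None


store = VersionedStore()
store.commit({"a": 1, "b": 1})
view = store.read_version()      # the reader's one coordinate
store.commit({"a": 2})           # later writes do not move the view
print(store.get("a", view), store.get("a", store.read_version()))
```

The reader fetches one integer and derives every lookup from it; no clock appears anywhere in the sketch.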
Axis 2 — Materialization of the View #
metadata-only: the snapshot is a LIST of immutable units
(manifests, segment sets) — the cheapest and most
elegant: immutability makes coherence FREE
logical: versions interleaved in shared storage; visibility
computed per-row against the coordinate
(MVCC: read timestamp vs active-transaction set)
physical: actual blocks preserved (copy-on-write, backups);
the dearest, and the only kind that can survive the
source's destruction, and then only if the preserved blocks
sit on storage the source's failure cannot reach
Cost gradient: physical > logical > metadata. The industry’s drift toward immutable-segment architectures (Iceberg, Lucene, LSM) is the discovery that metadata-only views are nearly free once data units stop mutating — replication.md’s immutable-object lesson, read-side: you can’t tear a view of facts.
Interrogation:
Which materialization — and is the choice priced? (COW's write
amplification and snapshot-chain depth; MVCC's version bloat;
manifest's metadata growth)
Does the view survive the source? (only physical does — snapshot ≠
backup is the deep lesson's row: a manifest pointing at live storage
restores nothing after the storage burns)
Axis 3 — Lifetime and Pinning (the holder’s side of GC’s treaty) #
GC’s pin registry, seen from the pin-HOLDER’s chair. The treaty line: GC adjudicates; this block defines what a well-behaved pin looks like — bounded, declared, released.
The two-sided failure is the axis:
held too long: blocks vacuum, stalls compaction, inflates storage —
the eternal-transaction pin (GC's registry failure,
caused from this side)
released unclearly: the reader's data dies mid-read —
"segment merge deletes data during query,"
"old versions GC'd while reader needs them"
Interrogation:
When is the view acquired, and what EXPLICITLY releases it?
Is the pin declared to the registry, or conventional? (a reader GC
doesn't know about is a resurrection... of the reader's error)
What bounds the hold — a timeout, a query lifetime, a session?
(unbounded pins are how time travel becomes infinite retention)
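A minimal sketch of a well-behaved pin: declared to a registry, bounded by a deadline, released explicitly. `PinRegistry`, `gc_floor`, and `expire_overdue` are hypothetical names, and the injected `clock` is an assumption made so the bound can be exercised without sleeping.

```python
import itertools
import time
from contextlib import contextmanager


class PinRegistry:
    """Toy registry: GC consults it; readers declare and release through it."""

    def __init__(self, max_hold_s, clock=time.monotonic):
        self._max_hold_s = max_hold_s
        self._clock = clock
        self._pins = {}                    # pin_id -> (pinned version, deadline)
        self._ids = itertools.count(1)

    @contextmanager
    def pin(self, version):
        """Declare a pin on `version`; release is guaranteed on block exit."""
        pin_id = next(self._ids)
        self._pins[pin_id] = (version, self._clock() + self._max_hold_s)
        try:
            yield pin_id
        finally:
            self._pins.pop(pin_id, None)   # explicit release, even on error

    def expire_overdue(self):
        """Drop pins held past their bound; return the ids that were revoked."""
        now = self._clock()
        overdue = [p for p, (_, deadline) in self._pins.items() if deadline <= now]
        for pin_id in overdue:
            del self._pins[pin_id]
        return overdue

    def is_live(self, pin_id):
        """A reader checks this before trusting its view after a long pause."""
        return pin_id in self._pins

    def gc_floor(self, latest):
        """Oldest version any live pin needs; GC may reclaim only below it."""
        return min((v for v, _ in self._pins.values()), default=latest)
```

A reader whose pin was revoked learns it through `is_live` and must reacquire a view; that is the sketch's answer to "released unclearly".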
Axis 4 — The Freshness Contract (deliberate staleness) #
The block’s founding trade, and the deep lesson’s first row promoted:
LATEST is not CONSISTENT.
a snapshot is DELIBERATELY stale by a bounded, named amount,
in exchange for coherence.
This is cache.md’s ladder with the direction reversed — the cache apologizes for staleness; the snapshot SELLS it. Sub-questions:
can this read see writes made after acquisition?    never; that is the point
can this SESSION see its own writes?                read-your-writes rides on view refresh policy,
                                                    not on the view itself
what does "latest" mean to the caller?              usually "the newest COHERENT view", which is older
                                                    than the newest write by exactly the publication lag
Interrogation:
Is the staleness bound NAMED to the caller (snapshot age, offset lag)?
When does a session's view refresh — per query, per transaction, never?
Does anyone believe they're reading "now"? (disabuse them in the API)
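One way to name the staleness in the API, sketched under the assumption that lag is counted in published versions. `Source`, `Session`, and `Stamped` are hypothetical names; a real system might report snapshot age or offset lag instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamped:
    value: object
    as_of: int   # the view's coordinate
    lag: int     # versions published after the view was taken


class Source:
    """Toy source: versions[i] is the full, never-mutated state at version i."""

    def __init__(self):
        self.versions = [{}]

    def commit(self, **writes):
        # Publish a new state; earlier states are left untouched.
        self.versions.append({**self.versions[-1], **writes})

    @property
    def latest(self):
        return len(self.versions) - 1


class Session:
    """Holds one view; it moves only when refresh() is called."""

    def __init__(self, source):
        self._source = source
        self._as_of = source.latest

    def refresh(self):
        # The refresh policy (per query, per transaction, never) lives here.
        self._as_of = self._source.latest

    def read(self, key):
        state = self._source.versions[self._as_of]
        lag = self._source.latest - self._as_of
        return Stamped(state.get(key), self._as_of, lag)
```

Every result carries `as_of` and `lag`, so a caller cannot mistake the view for "now" without ignoring a field.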
Axis 5 — What Rides Inside the View #
data only
data + schema    time travel across schema evolution: an old snapshot
                 read under a new schema is misinterpretation, not
                 history ("old schema cannot be interpreted")
data + config    the xDS case: a coherent view must include the RULES
                 for reading it (routes without their clusters is a
                 torn view of config)
And the collision seated here:
time travel vs the right to be forgotten —
GC's legal-hold decree running AGAINST the pin registry.
a snapshot preserving deleted PII is a compliance event
wearing a feature's clothes. (boundary.md's data residency,
meeting axis 3's pins head-on.)
Technical Bottleneck: Torn Visibility* #
The one failure every configuration shares:
a reader observing a MIXTURE of versions that was never a state.
Instances: mixing manifests across snapshots; a query straddling a segment merge; partial config applied; mixed projection generations; a distributed snapshot cutting through an in-flight message. Its epistemics are distinct from staleness:
a STALE view was once true.
a TORN view was NEVER true — an answer describing a world that never
existed, with no timestamp to interrogate, because no timestamp HAS
that state. silent omission*'s sibling: the other lie without a clock.
Known recipes — the catalog’s atomic-publication machinery, consumed read-side:
ONE DOOR (flagship)       acquire the view through exactly one root
                          (the manifest pointer, the searcher, the read
                          version) and derive EVERYTHING from it.
                          never assemble a view from separately-fetched
                          parts. (GC's atomic publish is what makes the
                          single door exist; this block is why it must.)
immutability below        units under the pointer never mutate; tearing
                          requires mutation, so facts can't tear
barriers                  where no single pointer exists, MANUFACTURE the
                          coordinate (Chandy-Lamport, Flink: the
                          distributed cut, checkpoint_replay's scope axis)
epoch-checked assembly    when a view must span fetches, every part
                          carries the generation, and mismatch aborts
                          the read (lease_fencing's token, read-side)
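The epoch-checked assembly recipe can be sketched as follows, assuming each fetch returns a `(generation, payload)` pair. `assemble_view`, `fetch_part`, and `TornViewError` are hypothetical names for illustration.

```python
class TornViewError(Exception):
    """Raised when parts from different generations would be mixed."""


def assemble_view(fetch_part, part_names):
    """Fetch every part; abort unless all parts carry the same generation."""
    generation = None
    parts = {}
    for name in part_names:
        part_generation, payload = fetch_part(name)
        if generation is None:
            generation = part_generation   # the first part fixes the coordinate
        elif part_generation != generation:
            raise TornViewError(
                f"{name} is at generation {part_generation}, expected {generation}"
            )
        parts[name] = payload
    return generation, parts


# Usage: two config parts published together at generation 7.
published = {"routes": (7, ["r1"]), "clusters": (7, ["c1"])}
generation, view = assemble_view(published.__getitem__, ["routes", "clusters"])
print(generation, view)
```

On `TornViewError` the caller discards everything and retries from the first fetch; keeping the parts that matched would rebuild the torn view by hand.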
The one-line discipline:
a coherent view is entered through exactly one door.
A strong design says explicitly:
the visibility coordinate, singular (axis 1),
its materialization and what that costs (axis 2),
the pin's bound, declaration, and release (axis 3),
the staleness sold, by name and amount (axis 4),
what rides inside — schema, config — and what deletion law
collides with retention (axis 5),
and the one door every reader enters through.
Snapshot As Protocol (the crossing-point spec — keep) #
select visibility coordinate (through the one door)
resolve visible state units
pin / declare to the registry
read against that view — ignoring newer state BY DESIGN
release the view, explicitly
GC proceeds once no live view needs the old data (the treaty)
MVCC instantiation:
reader obtains read timestamp
row visibility checked against snapshot's active-transaction set
writers create NEW versions (never mutate visible ones)
old versions retained while any snapshot needs them (the pin)
vacuum removes obsolete versions later (GC, adjudicating)
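A simplified sketch of the per-row visibility check. It assumes every transaction below `xmax` that is not in the active set has committed, so aborts and a transaction's own writes are ignored; the rules in a real engine such as Postgres are richer. `Snapshot`, `RowVersion`, and `visible` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Snapshot:
    xmax: int                # first transaction id not yet assigned at acquisition
    active: FrozenSet[int]   # transaction ids still in flight at acquisition

    def sees(self, xid):
        """True if transaction `xid` had finished before the snapshot was taken."""
        return xid < self.xmax and xid not in self.active


@dataclass(frozen=True)
class RowVersion:
    value: str
    created_by: int
    deleted_by: Optional[int] = None   # None while no transaction has deleted it


def visible(row, snap):
    """Visible if the snapshot sees the creator and does not see a deleter."""
    if not snap.sees(row.created_by):
        return False
    return row.deleted_by is None or not snap.sees(row.deleted_by)


snap = Snapshot(xmax=10, active=frozenset({7}))
rows = [
    RowVersion("old", created_by=3, deleted_by=7),   # deleter still in flight
    RowVersion("new", created_by=7),                 # creator still in flight
]
print([r.value for r in rows if visible(r, snap)])
```

The old version stays readable for this snapshot because its deleter was still active at acquisition, which is the reason it must be retained.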
Manifest instantiation:
reader loads current snapshot POINTER (one door)
reads manifests + data files it references — nothing else
writer creates new files + new metadata; commit swaps the pointer
ATOMICALLY (GC's publication protocol)
old snapshots live until expiration (visibility proof of death)
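The pointer swap can be sketched as a compare-and-swap on one reference, assuming a single-process catalog guarded by a lock. `Catalog`, `Manifest`, and the file names are hypothetical; real table formats delegate the atomic swap to a catalog service or an atomic rename.

```python
import threading
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Manifest:
    snapshot_id: int
    files: Tuple[str, ...]   # immutable list of immutable data files


class Catalog:
    def __init__(self):
        self._lock = threading.Lock()
        self._current = Manifest(0, ())

    def load(self):
        """The one door: a reader takes the pointer once and reads only from it."""
        return self._current

    def commit(self, base, added_files):
        """Swap the pointer only if nobody committed since `base` was loaded."""
        new = Manifest(base.snapshot_id + 1, base.files + tuple(added_files))
        with self._lock:
            if self._current is not base:
                return None   # lost the race; caller reloads and retries
            self._current = new
        return new


catalog = Catalog()
view = catalog.load()                          # reader's view: snapshot 0
catalog.commit(view, ["part-0001.parquet"])    # writer publishes snapshot 1
print(view.files, catalog.load().files)        # the old view is unchanged
```

A failed `commit` leaves the pointer untouched, so a commit race produces a retry rather than torn metadata.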
Searcher instantiation:
reader opens IndexReader over committed segments (one door)
writers create new segments; commit publishes a new SET
merges rewrite old segments — deleted only after readers release
(the pin, per-searcher)
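A minimal sketch of per-searcher pins as reference counts, assuming a single-threaded store. `SegmentStore` and its methods are hypothetical names and do not mirror Lucene's actual classes.

```python
class SegmentStore:
    """Toy store: merged-away segments are deleted only after readers release."""

    def __init__(self):
        self.live = set()      # the currently published segment set
        self.on_disk = set()   # every segment file that still exists
        self._refs = {}        # segment -> number of open readers holding it

    def publish(self, add, drop=()):
        """Commit a new set: add new segments, retire merged ones."""
        self.on_disk |= set(add)
        self.live = (self.live - set(drop)) | set(add)
        self._sweep()

    def open_reader(self):
        """One door: capture the published set and pin each segment in it."""
        segments = frozenset(self.live)
        for seg in segments:
            self._refs[seg] = self._refs.get(seg, 0) + 1
        return segments

    def close_reader(self, segments):
        """Explicit release; retired segments may now be deleted."""
        for seg in segments:
            self._refs[seg] -= 1
        self._sweep()

    def _sweep(self):
        # Delete only segments that are both retired and unpinned.
        for seg in list(self.on_disk):
            if seg not in self.live and self._refs.get(seg, 0) == 0:
                self.on_disk.discard(seg)
```

A merge published while a reader is open changes `live` immediately but leaves the reader's segments in `on_disk` until `close_reader` runs.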
Named Configurations (lookup table) #
Vector = {coordinate, materialization, pin discipline, staleness contract, riders}. Rows marked → are owned elsewhere.
| Name | Vector | Canonical study object | Signature failure |
|---|---|---|---|
| MVCC read view | read timestamp, logical, active-txn set, per-txn coherent, data | Postgres MVCC; FDB read version | long reader blocks vacuum (axis 3); write skew (SI’s known hole); “expected latest, got snapshot” |
| Manifest snapshot | pointer, metadata-only, snapshot refs, per-pointer, data+schema | Iceberg snapshot model | mixed manifests*; GC eats referenced file (registry breach); commit race → torn metadata |
| Copy-on-write | snapshot root, physical, refcounts, —, data | ZFS/EBS snapshots | refcount bug kills live block; chain depth; space surprise (pins are invisible rent) |
| Log snapshot → checkpoint_replay.md | last-included index, state image, log-tail treaty, —, data | Raft snapshot install | snapshot/index mismatch (binding coordinate*); truncation before durable |
| Distributed snapshot → checkpoint_replay.md | barrier ID, per-actor + channels, coordinated, —, data+in-flight | Chandy-Lamport; Flink barriers | cut through a message*; alignment backpressure |
| Backup/restore | restore point + log position, physical, retention policy, —, data+schema | base backup + WAL | not restorable (untested = hope); missing log tail; corrupt-found-late; residency on restore |
| Config snapshot → checkpoint_replay + xDS notes | version+nonce, metadata, last-good, applied-vs-current, config rides | Envoy xDS ACK/NACK | partial config applied*; bad config poisons plane (last-good is the recipe) |
| Query snapshot | segment set, metadata, per-searcher pins, per-query coherent, data+schema | Lucene IndexReader | query straddles a merge*; schema change mid-query; stale routing on one server |
| Time travel | historical pointer, metadata over retained files, long pins, deliberately old, data+schema+law | Iceberg time travel; Git checkout | history expired; old schema unreadable; deletion-law collision (axis 5) |
| Read-model snapshot → materialized.md | projection offset, —, —, lag named, data | KStreams store + changelog offset | mixed generations*; “consumer assumes fresh” (axis 4’s disabusal, skipped) |
Vocabulary #
snapshot    read view    visibility coordinate    version order
read timestamp    active transaction set    manifest pointer    root
segment set    searcher    refcount    copy-on-write chain
pin    declare    release    bounded hold    vacuum
one door    torn view    mixed generations
deliberate staleness    view refresh    read-your-writes
time travel    AS OF    schema-at-time    deletion collision
last-good config    applied vs current
Deep Lesson #
Snapshot bugs come from confusing pairs on different axes:
latest vs consistent (axis 4: the founding trade)
snapshot vs backup (axis 2: only physical survives the source)
manifest pointer vs data durability (axis 2 + GC's treaty: a pointer is a promise the registry must keep)
read timestamp vs wall-clock time (axis 1: a position, not a moment)
cache snapshot vs authoritative state (cache.md: the view is honest about being a copy)
time travel vs infinite retention (axis 3: pins are rent; axis 5: and sometimes illegal)
old version vs garbage (GC's proof of death: a pinned version is neither)
Design procedure: name the one coordinate and the one door, choose the materialization with its bill, bound and declare every pin, sell the staleness by name, list what rides inside the view and what law collides with keeping it — and never let a reader assemble reality from two separately-fetched parts, because the world they’d see never happened. The named types are recognition shortcuts, not the design space.