Dynamo Process And Storage Traces #
Source: dynamo.pdf
Archetype Read #
Dynamo is primarily a storage system.
Dominant archetypes:
I16 Replicated KVI09 Shared Subject + Mergeable Editsfor divergent versions and reconciliation
Supporting process shapes:
membership viewsnapshot/distribution applyfor routing and membership propagationrepair / anti-entropy batchhinted handoffas a deferred transfer workflow
So unlike Borg:
- Borg is mostly process, with storage as control substrate
- Dynamo is mostly storage, with process as replication/repair substrate
Main Judgment #
The main storage truth in Dynamo is:
- a key’s replicated multi-version state
- plus causal metadata
- plus quorum visibility rules
The main process logic around it is:
- coordinator selection
- sloppy quorum execution
- hinted handoff
- gossip membership
- anti-entropy / read repair
1. Storage Trace: Replicated Key Version Under Quorum #
This is the core Dynamo trace.
Core lines #
coordinator_visible_versionreplica_versionswrite_quorum_stateread_visible_versionsclient_trying_put_or_get
Reusable frame mapping #
truth locus->coordinator_visible_versionplusreplica_versionsreplica/derived state-> per-replica object copiesdurability boundary->write_quorum_statevisibility->read_visible_versionsactor trying to write/read/apply->client_trying_put_or_get
Trace #
time →
coordinator_visible_version D1 -------------- D2 pending -------- D2 ---------------
replica_A_version D1 -------------- D2 --------------- D2 ---------------
replica_B_version D1 -------------- D1 --------------- D2 ---------------
write_quorum_state W@D1 ------------ W pending -------- W@D2 -------------
read_visible_versions {D1} ------------ {D1} ------------ {D2} -------------
client_trying_put_or_get none ------------ put(D2) --------- get()->{D2} ------
What it teaches #
- a write can exist on some replicas before it reaches the configured write quorum
- read visibility follows the quorum/read path, not just one replica’s local state
- storage truth is a replicated key-version state, not a single leader value
State loci #
- authoritative:
- replica set for the key under quorum policy
- local/replica:
- each replica node’s local object version
- cached/derived:
- coordinator’s gathered version set, client read result
- repair:
- replication catch-up / read repair
Core invariant #
- a version must not be treated as successfully durable/visible under the configured policy before enough replicas have accepted it
2. Storage Trace: Divergent Versions And Vector Clock Causality #
This is the distinctive Dynamo storage shape.
Core lines #
object_versionsvector_clock_frontierreplica_A_versionsreplica_B_versionsclient_trying_reconcile
Reusable frame mapping #
truth locus->object_versionsreplica/derived state->replica_A_versions,replica_B_versionsdurability/currentness boundary->vector_clock_frontiervisibility-> versions surfaced to the client on readactor trying to write/read/apply->client_trying_reconcile
Trace #
time →
object_versions {D1} ------------ {D2,D3} ---------------- {D4} -----------
vector_clock_frontier [x:1] ----------- [x:2] || [y:1] ---------- [x:2,y:1,z:1]
replica_A_versions {D2} ------------ {D2} -------------------- {D4} -----------
replica_B_versions {D1} ------------ {D3} -------------------- {D4} -----------
client_trying_reconcile none ------------ get()->{D2,D3} ---------- merge put(D4) --
What it teaches #
- currentness is not always a single scalar version
- the truth object may contain concurrent branches
- reads can surface multiple versions that the client must reconcile
State loci #
- authoritative:
- distributed per-key version set with vector clocks
- local/replica:
- each replica’s local branch/head set
- cached/derived:
- client-visible returned versions and reconcile context
- repair:
- client merge followed by writeback
Core invariant #
- versions that are causally unrelated must not be silently discarded as if one subsumed the other
3. Storage Trace: Read Repair #
Dynamo uses reads as an opportunistic repair moment.
Core lines #
coordinator_returned_versionstale_replica_versionfresh_replica_versionrepair_writeback_statecoordinator_trying_read_repair
Reusable frame mapping #
truth locus->coordinator_returned_versionreplica/derived state->stale_replica_version,fresh_replica_versiondurability/freshness boundary-> causally newer version already known by coordinatorvisibility-> read-returned valueactor trying to write/read/apply->coordinator_trying_read_repair
Trace #
time →
coordinator_returned_version D4 ------------- D4 ------------- D4
stale_replica_version D2 ------------- D2 ------------- D4
fresh_replica_version D4 ------------- D4 ------------- D4
repair_writeback_state none ----------- repairing ------ repaired
coordinator_trying_read_repair get()->{D4} --- writeback D4 --- done
What it teaches #
- a read may reveal stale replicas
- coordinator can repair stale copies while serving the read
- visibility and repair are coupled
State loci #
- authoritative:
- freshest causally acceptable replica set
- local/replica:
- stale and fresh replicas holding different versions
- cached/derived:
- coordinator’s assembled read result
- repair:
- read-repair writeback from coordinator
Core invariant #
- stale replicas discovered on a read must not remain authoritative if a causally newer version is already known
4. Storage Trace: Hinted Handoff Replica Durability #
This is a storage-support trace because the system temporarily stores a replica on the wrong node.
Core lines #
intended_replica_setreachable_write_sethinted_replica_statedurable_replica_countcoordinator_trying_handoff
Reusable frame mapping #
truth locus->intended_replica_setreplica/derived state->reachable_write_set,hinted_replica_statedurability boundary->durable_replica_countvisibility-> eventual intended replica durability after handoffactor trying to write/read/apply->coordinator_trying_handoff
Trace #
time →
intended_replica_set {A,B,C} ---------- {A,B,C} ---------- {A,B,C}
reachable_write_set {B,C,D} ---------- {B,C,D} ---------- {A,B,C}
hinted_replica_state none ------------- D holds hint(A) --- delivered to A
durable_replica_count 2+hint ----------- 2+hint ----------- 3 intended
coordinator_trying_handoff put via D -------- retain hint ------ transfer then delete
What it teaches #
- sloppy quorum changes where durability temporarily lives
- the hinted copy is part of the durability story until returned
- replica locality and durability locality can diverge
State loci #
- authoritative:
- intended replica set for the key
- local/replica:
- reachable nodes including hint holder
- cached/derived:
- hint metadata showing intended recipient
- repair:
- hinted handoff transfer path
Core invariant #
- a hinted replica must preserve durability until the intended replica safely receives the object
5. Storage Trace: Merkle Anti-Entropy #
This is Dynamo’s background replica synchronization substrate.
Core lines #
range_merkle_root_Arange_merkle_root_Bdivergence_scoperepair_transfer_stateactor_trying_anti_entropy
Reusable frame mapping #
truth locus-> converged key-range contents summarized by Merkle rootsreplica/derived state->range_merkle_root_A,range_merkle_root_Bdurability/freshness boundary-> anti-entropy convergence over divergent scopevisibility-> synchronized replica rangeactor trying to write/read/apply->actor_trying_anti_entropy
Trace #
time →
range_merkle_root_A h100 ------------ h100 ------------- h101
range_merkle_root_B h100 ------------ h095 mismatch ---- h101
divergence_scope none ------------ subtree/key range - none
repair_transfer_state idle ------------ syncing ---------- synchronized
actor_trying_anti_entropy compare roots ---- descend tree ---- transfer missing keys
What it teaches #
- anti-entropy is a range-level repair process over storage truth
- Merkle roots compress the comparison frontier
- repair targets only divergent subtrees/keys
State loci #
- authoritative:
- actual key-range contents on each replica
- local/replica:
- each node’s Merkle tree for the range
- cached/derived:
- exchanged root/subtree hashes
- repair:
- anti-entropy subtree walk and key transfer
Core invariant #
- replicas for a key range must eventually converge by transferring only the divergent keys or subranges
6. Process Trace: Put Coordination Under Sloppy Quorum #
This is the main Dynamo process trace.
Core lines #
request_statecoordinator_nodereachable_preference_nodeswrite_ack_progressactor_trying_to_coordinate
Reusable frame mapping #
state->request_stateowner/view->coordinator_node,reachable_preference_nodesmonotonic marker->write_ack_progressvalidity boundary-> configuredWthreshold under current reachable setactor trying to advance->actor_trying_to_coordinate
Trace #
time →
request_state RECEIVED -------- FORWARDED -------- ACKED ---------- REPAIR_PENDING
coordinator_node Sx -------------- Sx -------------- Sx ------------- Sx
reachable_preference_nodes {A,B,D} --------- {A,B,D} --------- W satisfied ---- hinted handoff left
write_ack_progress 0/2 ------------- 1/2 ------------ 2/2 ------------ 2/2 + hint
actor_trying_to_coordinate accept put ------ send to replicas - reply success -- schedule handoff
What it teaches #
- the coordinator is a process role, not the long-term truth locus
- write success is tied to
Wacknowledgements, not necessarily the full intended replica set - sloppy quorum leaves deferred repair work behind
State loci #
- authoritative:
- request outcome as determined by coordinator plus replica acknowledgements
- local/execution:
- coordinator node handling the put
- cached/derived:
- reachable preference-list view during the request
- repair:
- hinted handoff / later replica synchronization
Core invariant #
- a put must not be acknowledged as successful before the configured write threshold is met
7. Process Trace: Membership View And Preference List #
Dynamo relies on gossip-based membership and eventually consistent routing views.
Core lines #
member_setmembership_epoch_or_history_tippreference_list_for_keystaleness_boundaryactor_trying_to_route_request
Reusable frame mapping #
state-> local routing/membership state for a keyowner/view->member_set,preference_list_for_keymonotonic marker->membership_epoch_or_history_tipvalidity boundary->staleness_boundaryactor trying to advance->actor_trying_to_route_request
Trace #
time →
member_set {A,B,C,D} -------- {A,B,C,D,E} ------ {A,B,C,D,E}
membership_history_tip m40 -------------- m41 -------------- m41
preference_list_for_key [A,B,C] ---------- [A,B,E] ---------- [A,B,E]
staleness_boundary fresh ------------ stale at client --- refreshed
actor_trying_to_route_request client->A ----- client uses old C -- client refreshes
What it teaches #
- membership and routing views can lag
- clients may briefly route using stale membership
- preference list is a derived control view from membership/ring state
State loci #
- authoritative:
- persisted membership change history across nodes
- local/execution:
- client or node using its current preference-list view
- cached/derived:
- local routing table / preference list cache
- repair:
- membership refresh / gossip reconciliation
Core invariant #
- request routing must eventually converge to the current preference list, and stale routing views must be refreshable
8. Process Trace: Hinted Handoff Workflow #
This is a good process-shaped Dynamo workflow.
Core lines #
hint_stateholder_nodeintended_recipientdelivery_deadline_or_retryactor_trying_to_transfer_hint
Reusable frame mapping #
state->hint_stateowner/view->holder_node,intended_recipientmonotonic marker-> hint lifecycle progressionvalidity boundary->delivery_deadline_or_retryactor trying to advance->actor_trying_to_transfer_hint
Trace #
time →
hint_state CREATED -------- STORED ---------- IN_TRANSFER ----- CLEARED
holder_node D -------------- D --------------- D -------------- none
intended_recipient A -------------- A --------------- A -------------- A
delivery_deadline_or_retry t10 ------------ t20 ------------ retry now ------ -
actor_trying_to_transfer_hint none --------- wait for A ------ send replica ---- delete hint
What it teaches #
- hinted handoff is a deferred repair workflow over storage truth
- the hint has its own lifecycle
- correct deletion happens only after successful transfer
State loci #
- authoritative:
- hint state on the holder until transfer completes
- local/execution:
- node currently storing and sending the hint
- cached/derived:
- intended-recipient metadata and retry state
- repair:
- handoff retry loop until recipient accepts
Core invariant #
- a hint must not be dropped before the intended replica safely receives the data
9. Process Trace: Gossip Membership Reconciliation #
This is the main decentralized process substrate.
Core lines #
local_membership_historypeer_membership_historyreconciled_history_tipnext_gossip_tickactor_trying_to_gossip
Reusable frame mapping #
state-> local reconciliation progress between peersowner/view->local_membership_history,peer_membership_historymonotonic marker->reconciled_history_tipvalidity boundary->next_gossip_tickactor trying to advance->actor_trying_to_gossip
Trace #
time →
local_membership_history m40 ------------- m40,m41 --------- m41
peer_membership_history m40 ------------- m40 ------------ m40,m41
reconciled_history_tip m40 ------------- reconciling ----- m41
next_gossip_tick t1 -------------- t2 ------------- t3
actor_trying_to_gossip A<->B ---------- exchange -------- merge histories
What it teaches #
- membership is maintained as propagated history
- nodes reconcile histories, not just one current scalar bit
- process truth is eventually converged through repeated pairwise sync
State loci #
- authoritative:
- distributed merged membership history
- local/execution:
- each node’s current local history view
- cached/derived:
- peer snapshot exchanged during gossip
- repair:
- periodic gossip reconciliation
Core invariant #
- membership change histories must merge monotonically and must not lose previously observed changes
Minimum Dynamo Trace Set #
If you only want the representative Dynamo traces, drill these:
storage:
Replicated Key Version Under QuorumDivergent Versions And Vector Clock CausalityHinted Handoff Replica DurabilityMerkle Anti-Entropy
process:
Put Coordination Under Sloppy QuorumHinted Handoff WorkflowMembership View And Preference List
That is enough to capture the main Dynamo shapes.