Configuration Management System (Distributed Config Push) #

This note models a distributed configuration management system in which operators update configuration centrally, the control plane computes the effective config, and versioned snapshots are safely pushed to, or pulled by, many serving nodes.

Step 1 - Normalize #

Assume the baseline prompt is:

design a distributed configuration management system
admins update config centrally
many services or nodes consume config
config changes should propagate quickly
serving nodes should use coherent config versions
system scales across many tenants, services, and config scopes

Normalize into state-affecting paths.

| Requirement | Actor | Operation | State touched | Priority |
|---|---|---|---|---|
| Admin creates or updates config | Admin | overwrite state | `S1` `update target` `ConfigDefinition` | C1 |
| Admin rolls back or disables config version | Admin | state transition | `S1` `update target` `ConfigReleaseState` | C1 |
| System computes effective config snapshot | System | state transition | `S1` `update target` `EffectiveConfigState` | C1 |
| System propagates config snapshot to consumers | System | async process | `S1` `hidden write target` `ConsumerConfigSnapshot` | C1 |
| Consumer reads local config for request handling | Client | read source | `S1` `read source target` `ConsumerConfigSnapshot` | C1 |
| Consumer acknowledges applied config version | Client | overwrite state | `S1` `update target` `ConsumerApplyState` | R1 |
| User reads config inventory / rollout status | Client | read projection | `S1` `read projection target` `ConfigStatusView` | R2 |
| System routes config scope/shard to current owner | System | read source | `S1` `read source target` `PartitionMap` | C1 |
| System reassigns shard ownership after node failure | System | state transition | `S1` `update target` `PartitionOwnership` | C1 |

Notes on normalization #

Important choices:

- raw config edits are overwrite state
  - current desired config is current-value truth
- release/rollback is a lifecycle transition
  - active version changes over time
- effective config is a computed current view
- propagation is async
- consumer request handling reads local applied snapshots, not control-plane source state on every request

This system is fundamentally:

control plane + data plane

with:

versioned config
monotonic rollout

Step 2 - Critical Path Selection #

| Requirement | Priority class | Why |
|---|---|---|
| Create / update config | C1 | config truth changes future behavior system-wide |
| Roll back / disable config version | C1 | rollback correctness affects safety and recovery |
| Compute effective config | C1 | consumers depend on coherent derived config, not arbitrary fragments |
| Propagate config snapshot | C1 | stale or mixed snapshots can break serving behavior |
| Consumer reads local config | C1 | this is the hot serving path |
| Consumer acknowledges applied version | R1 | useful for rollout control and debugging |
| Read config inventory / rollout status | R2 | operational only |
| Route to shard owner | C1 | wrong routing can split config truth |
| Reassign shard ownership | C1 | failover must preserve config correctness |

Baseline critical paths #

Main C1 paths:

P1 create/update config
P2 roll back / disable config version
P3 compute effective config snapshot
P4 propagate snapshot
P5 consumer local read
P6 route to shard owner
P7 reassign shard ownership

Main R1 path:

P8 consumer apply acknowledgment

This design is driven by:

one authoritative current config definition per scope
coherent effective config versions
monotonic consumer rollout

Step 3 - Primary State Extraction #

For a distributed config-push system, the minimal primary state is the current config definition, current release lifecycle, effective config state, consumer-applied snapshot state, and routing/ownership state.

| Object | Source | Needed for C1/R1? | Decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
|---|---|---|---|---|---|---|---|---|---|
| ConfigDefinition | direct noun | Yes | keep as candidate | entity | Yes | service | overwrite | instance | config_scope |
| ConfigReleaseState | lifecycle object | Yes | keep as candidate | process | Yes | service | state machine | instance | config_scope |
| EffectiveConfigState | hidden write target | Yes | keep as candidate | process | Yes | service | overwrite | instance | consumer_scope |
| ConsumerConfigSnapshot | hidden write target | Yes | keep as candidate | projection | Yes | service | overwrite | instance | consumer_id or consumer_scope |
| ConsumerApplyState | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | instance | consumer_id + config_scope |
| PartitionOwnership | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | shard_id |
| PartitionMap | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | collection | config shards |
| ConfigStatusView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | tenant or service |

Important modeling choices #

`ConfigDefinition` #

Primary because:

raw desired config is the central source of truth

`ConfigReleaseState` #

Primary because:

active, staged, rolled-back, disabled versions are lifecycle state

`EffectiveConfigState` #

Primary because:

consumers often need a resolved or merged config, not raw documents only

`ConsumerConfigSnapshot` #

Primary because:

hot-path serving reads this local or assigned snapshot

`ConsumerApplyState` #

Primary because:

rollout control depends on knowing what version each consumer has applied

Minimal strict primary set #

The strongest minimal set is:

ConfigDefinition
ConfigReleaseState
EffectiveConfigState
ConsumerConfigSnapshot
ConsumerApplyState
PartitionOwnership
PartitionMap
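
As a rough illustration, the primary set can be written down as plain immutable records. This is a sketch in Python, chosen for brevity; all field names, and the `ReleasePhase` values, are assumptions made for the example rather than part of the model above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class ReleasePhase(Enum):
    # Hypothetical phases, loosely following "active, staged,
    # rolled-back, disabled" from the modeling notes.
    STAGED = "staged"
    ACTIVE = "active"
    ROLLED_BACK = "rolled_back"
    DISABLED = "disabled"


@dataclass(frozen=True)
class ConfigDefinition:
    config_scope: str
    version: int                      # monotonic within config_scope
    document: Mapping[str, object]    # raw desired config


@dataclass(frozen=True)
class ConfigReleaseState:
    config_scope: str
    release_version: int              # bumped on every lifecycle transition
    target_config_version: int
    phase: ReleasePhase


@dataclass(frozen=True)
class EffectiveConfigState:
    consumer_scope: str
    version: int
    resolved: Mapping[str, object]    # merged view consumers actually need


@dataclass(frozen=True)
class ConsumerConfigSnapshot:
    consumer_id: str
    version: int
    resolved: Mapping[str, object]


@dataclass(frozen=True)
class ConsumerApplyState:
    consumer_id: str
    config_scope: str
    applied_version: int              # highest version actually applied


@dataclass(frozen=True)
class PartitionOwnership:
    shard_id: str
    owner_node: str
    fencing_token: int                # increases on every ownership change


# PartitionMap: shard_id -> current owner node (routing metadata).
PartitionMap = Mapping[str, str]
```

Keeping the records frozen makes the "overwrite" evolution explicit: a change produces a new record with a new version instead of mutating the old one in place.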

Step 4 - Hard Invariants #

For a distributed config-push system, the hard invariants are about one authoritative current config per scope, coherent effective config generation, and monotonic consumer snapshot application.

| Path | Tier | Type | Invariant statement |
|---|---|---|---|
| `P1` create/update config | HARD | ordering | Config-definition revisions are ordered by a monotonic version within each config scope. |
| `P2` roll back / disable config version | HARD | eligibility | Action `advance_release_state` is valid only if the current `ConfigReleaseState` allows the transition at decision time. |
| `P3` compute effective config | HARD | accounting | `EffectiveConfigState(scope, version)` equals the deterministic function of current authoritative config inputs and release state for that scope. |
| `P4` propagate snapshot | HARD | freshness | `ConsumerConfigSnapshot` reflects an authoritative `EffectiveConfigState` within configured propagation bounds and moves monotonically forward by version unless an explicit rollback transition is active. |
| `P5` consumer local read | HARD | freshness | Serving-node config reads reflect the currently applied `ConsumerConfigSnapshot` for that node/scope. |
| `P8` consumer apply acknowledgment | HARD | accounting | `ConsumerApplyState` reflects the highest config version actually applied by the consumer for that scope. |
| `P6` route to shard owner | HARD | uniqueness | Each `shard_id` maps to at most one current authoritative owner. |
| `P7` reassign shard ownership | HARD | eligibility | Action `reassign_shard` is valid only if, at decision time, the current owner of `shard_id` has failed or relinquished ownership and the candidate owner is eligible and sufficiently current. |

What matters most #

1. Config versions must be coherent #

Serving nodes should not read partially mixed config fragments for the same version.

2. Consumer snapshots must move monotonically #

Absent explicit rollback, a node must not regress to an older version.

3. Effective config must be deterministic #

Recomputing from the same inputs should yield the same effective snapshot.
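
One way to make determinism checkable is to treat the effective config as a pure function of its ordered inputs and to fingerprint the result. The sketch below assumes a simple layered-override merge (later layers win) and a SHA-256 digest over canonical JSON; neither the merge rule nor the function names come from the model above.

```python
import hashlib
import json
from typing import Any, Dict, List


def compute_effective_config(layers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge layers in order; keys in later layers override earlier ones."""
    effective: Dict[str, Any] = {}
    for layer in layers:
        effective.update(layer)
    return effective


def snapshot_digest(effective: Dict[str, Any]) -> str:
    """Hash a canonical JSON encoding so equal content yields equal digests."""
    canonical = json.dumps(effective, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical inputs: global defaults overridden by a service-level layer.
global_defaults = {"timeout_ms": 500, "retries": 3}
service_override = {"timeout_ms": 250}

first = compute_effective_config([global_defaults, service_override])
second = compute_effective_config([global_defaults, service_override])
assert first == second == {"timeout_ms": 250, "retries": 3}
assert snapshot_digest(first) == snapshot_digest(second)
```

Comparing digests gives the control plane a cheap way to detect a recompute that disagrees with a previously published snapshot for the same version.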

4. Applied version is separate from published version #

Publishing a config is not the same as proving consumers are actually serving it.

Step 5 - Execution Context #

For the baseline distributed config-push system:

| Field | Value | Why |
|---|---|---|
| Topology | single service distributed | one logical config-control system with many config consumers |
| Write coordination scope | per object scope | correctness is per config scope, consumer snapshot, and shard ownership scope |
| Read consistency target | bounded stale allowed | serving nodes usually read local snapshots with explicit freshness/version discipline |
| Holder model | none | consumers do not hold exclusive mutable business ownership |
| Compensation acceptable? | No | wrong or mixed config can cause production impact and is not safely compensable afterward |

Derived implications #

- holder_may_crash = false
  - consumers may fail, but they do not own shared lock-like state
- cross_service_write = false
  - baseline keeps config truth, release state, and snapshot distribution in one logical system
- bounded_staleness_allowed = true
  - local snapshot reads can tolerate bounded lag if the bound is explicit
- cross_service_atomicity_required = false
  - no multi-service transaction across unrelated services in baseline
- exclusive_claim_required = true
  - shard ownership must be exclusive
- guarded_by_current_state = true
  - release and rollback transitions depend on current release state

What this implies #

This pushes us toward:

one authoritative owner per config shard
current-value config and release state
derived effective snapshots
monotonic local snapshot application on consumers

Step 6 - Deterministic Mechanism Selection #

| Path | Write shape | Base mechanism | Required companions |
|---|---|---|---|
| `P1` create/update config | overwrite current value | CAS on version | config version |
| `P2` roll back / disable config version | guarded state transition | CAS on `(state, version)` | release version |
| `P3` compute effective config | overwrite current value | single writer control-plane recompute | effective-config version |
| `P4` propagate snapshot | overwrite current value | single writer snapshot publication | snapshot version |
| `P5` consumer local read | read source | local snapshot read | applied version |
| `P8` consumer apply acknowledgment | overwrite current value | monotonic overwrite | applied version |
| `P6` route to shard owner | exclusive claim | lease | fencing token, heartbeat |
| `P7` reassign shard ownership | guarded state transition | CAS on `(state, version)` | fencing token, shard catch-up check |

Why these fit #

Config definitions #

Current desired config is current-value state, so overwrite fits.
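
A minimal sketch of the overwrite-with-CAS shape, assuming an in-memory store guarded by a lock. `ConfigStore`, `VersionConflict`, and the convention that a missing scope has version 0 are illustrative choices, not a prescribed interface.

```python
import threading
from typing import Any, Dict, Optional, Tuple


class VersionConflict(Exception):
    """Raised when expected_version does not match the stored version."""


class ConfigStore:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        # scope -> (version, document); a missing scope counts as version 0.
        self._data: Dict[str, Tuple[int, Dict[str, Any]]] = {}

    def put_config(
        self,
        scope: str,
        config_doc: Dict[str, Any],
        expected_version: Optional[int] = None,
    ) -> int:
        """Overwrite the current value; fail if the caller's view is stale."""
        with self._lock:
            current_version = self._data.get(scope, (0, {}))[0]
            if expected_version is not None and expected_version != current_version:
                raise VersionConflict(
                    f"{scope}: expected {expected_version}, found {current_version}"
                )
            new_version = current_version + 1
            self._data[scope] = (new_version, dict(config_doc))
            return new_version

    def get(self, scope: str) -> Optional[Tuple[int, Dict[str, Any]]]:
        with self._lock:
            return self._data.get(scope)


store = ConfigStore()
v1 = store.put_config("checkout", {"timeout_ms": 500}, expected_version=0)
v2 = store.put_config("checkout", {"timeout_ms": 250}, expected_version=v1)
try:
    # A second writer still holding version 1 loses the CAS.
    store.put_config("checkout", {"timeout_ms": 900}, expected_version=v1)
except VersionConflict:
    pass
```

The losing writer is expected to re-read the current definition and decide whether its edit still makes sense before retrying.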

Release lifecycle #

Promote, disable, and rollback depend on current state, so guarded transition fits.
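
The guarded transition can be sketched as a lookup in an allowed-transition table combined with a version check. The specific phases and actions in `ALLOWED` are assumptions for illustration; the real lifecycle may permit different moves.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical transition table: (current phase, action) -> next phase.
ALLOWED: Dict[Tuple[str, str], str] = {
    ("staged", "promote"): "active",
    ("active", "rollback"): "rolled_back",
    ("active", "disable"): "disabled",
    ("rolled_back", "promote"): "active",
}


@dataclass(frozen=True)
class ReleaseState:
    phase: str
    release_version: int


class TransitionRejected(Exception):
    """Raised for stale callers or transitions the current phase forbids."""


def advance_release_state(
    current: ReleaseState, action: str, expected_release_version: int
) -> ReleaseState:
    # CAS half of the guard: the caller must have seen the latest state.
    if current.release_version != expected_release_version:
        raise TransitionRejected("stale release version")
    # State half of the guard: the current phase must allow this action.
    next_phase = ALLOWED.get((current.phase, action))
    if next_phase is None:
        raise TransitionRejected(f"{action!r} not allowed from {current.phase!r}")
    return ReleaseState(next_phase, current.release_version + 1)


staged = ReleaseState("staged", 1)
active = advance_release_state(staged, "promote", expected_release_version=1)
rolled_back = advance_release_state(active, "rollback", expected_release_version=2)
```

Both checks happen at decision time on the shard owner, so two admins racing to promote and disable the same release cannot both succeed.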

Effective config and consumer snapshots #

These are current resolved views, so overwrite fits.

Apply acknowledgment #

Consumers report their highest applied version, so monotonic overwrite fits.
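
A monotonic overwrite only ever moves the stored value forward, which makes duplicate or reordered acknowledgments harmless. The sketch below assumes an in-memory table keyed by `(consumer_id, scope)`; the class name is hypothetical and rollback handling is deliberately left out.

```python
from typing import Dict, Tuple


class ApplyStateTable:
    def __init__(self) -> None:
        # (consumer_id, scope) -> highest applied version reported so far.
        self._applied: Dict[Tuple[str, str], int] = {}

    def ack_applied_config(self, scope: str, version: int, consumer_id: str) -> int:
        """Record an ack; lower or duplicate versions leave the state unchanged."""
        key = (consumer_id, scope)
        current = self._applied.get(key, 0)
        if version > current:
            self._applied[key] = version
            current = version
        return current


table = ApplyStateTable()
table.ack_applied_config("checkout", 3, "node-1")   # stored: 3
table.ack_applied_config("checkout", 2, "node-1")   # late retry, still 3
table.ack_applied_config("checkout", 5, "node-1")   # stored: 5
```

Because the operation is idempotent, consumers can retry acknowledgments freely after timeouts.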

Canonical substrate implied #

The baseline now points to:

sharded config-control service
one owner per config scope
current-value config and release state
derived effective config
local consumer snapshots with monotonic rollout

Step 7 - Read Model / Source of Truth #

For a distributed config-push system, truth is mostly direct source state plus consumer snapshots. Status UIs are derived.

| Concept | Truth | Read path | Rebuild path |
|---|---|---|---|
| `C1` desired config | `ConfigDefinition` | read source directly | authoritative config store |
| `C2` active release lifecycle | `ConfigReleaseState` | read source directly | authoritative release-state store |
| `C3` resolved effective config | `EffectiveConfigState` | read source directly | recompute from config definitions and release state |
| `C4` consumer local snapshot | `ConsumerConfigSnapshot` | materialized view | rebuild from latest effective config |
| `C5` consumer applied version | `ConsumerApplyState` | read source directly | authoritative apply-state store |
| `C6` shard ownership | `PartitionOwnership` | read source directly | authoritative ownership store |
| `C7` shard routing map | `PartitionMap` | read source directly | authoritative routing metadata |
| `C8` config rollout status | derived from definitions, releases, and apply state | materialized view | recompute from authoritative state |

Important point #

For the core semantics:

control-plane truth lives in config definitions, release state, and effective config
serving nodes read local snapshots
rollout status is derived from definitions, release state, and consumer apply state

Step 8 - Failure Handling #

| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
|---|---|---|---|---|---|
| `P1` create/update config | retry with config version | stale update loses CAS | committed config survives crash if persisted | snapshot recompute may lag | stale shard owner blocked by fencing token |
| `P2` release/rollback transition | retry with release version | stale transition loses guarded update | committed release state survives crash if persisted | consumer propagation may lag | stale shard owner blocked by fencing token |
| `P3` effective-config recompute | recompute retry safe from source inputs | single recompute/version wins | recompute reruns after crash | consumer propagation may lag | n/a |
| `P4` snapshot propagation | retry with versioned snapshot | older snapshot loses to newer version unless explicit rollback version is active | consumer keeps last good snapshot until refresh | failed push retried or pulled | n/a |
| `P5` consumer local read | request retry safe | many consumers read concurrently from same local snapshot | consumer crash drops requests only | n/a | stale snapshot bounded by configured propagation freshness |
| `P8` apply acknowledgment | retry with highest applied version | stale/lower version report loses monotonic overwrite | applied version survives crash if persisted | rollout UI may lag | n/a |
| `P6` route to shard owner | retry after refreshing shard map | only one valid owner should exist | if owner changed, refreshed map points to new owner | n/a | stale owner rejected by fencing token |
| `P7` reassign shard ownership | retry failover transition safely | only one reassignment wins current ownership state | promoted owner crash triggers later reassignment | n/a | old owner fenced and must not continue serving |
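
The "stale holder" column relies on fencing: the storage layer remembers the highest ownership token it has seen and rejects writes carrying an older one. A minimal sketch, assuming tokens are integers that increase on every ownership change; `FencedShardStore` and `StaleOwner` are hypothetical names.

```python
from typing import List


class StaleOwner(Exception):
    """Raised when a write carries a fencing token older than one already seen."""


class FencedShardStore:
    def __init__(self) -> None:
        self._highest_token = 0
        self._writes: List[str] = []

    def write(self, fencing_token: int, payload: str) -> None:
        # Reject any writer whose lease generation has been superseded.
        if fencing_token < self._highest_token:
            raise StaleOwner(
                f"token {fencing_token} is older than {self._highest_token}"
            )
        self._highest_token = fencing_token
        self._writes.append(payload)

    def writes(self) -> List[str]:
        return list(self._writes)


shard = FencedShardStore()
shard.write(1, "config v10 from owner A")
shard.write(2, "config v11 from owner B")       # B took over with a newer token
try:
    shard.write(1, "config v10b from owner A")  # A wakes up late and is fenced
except StaleOwner:
    pass
```

The check lives in the store rather than in the owner, since a paused or partitioned owner cannot be trusted to notice that its lease expired.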

What matters most #

1. Monotonic snapshot movement #

Consumers must not accidentally apply older config after a newer one, except under an explicit rollback model.

2. Coherent version boundaries #

Consumers should apply config snapshots atomically at version boundaries, not field by field.
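
On the consumer, applying at a version boundary can be as simple as building the complete new snapshot off to the side and then replacing a single reference. The sketch below assumes CPython, where rebinding an attribute is atomic, and uses a hypothetical `LocalConfig` holder; readers take one `(version, values)` pair and never see a mix of two versions.

```python
import threading
from types import MappingProxyType
from typing import Any, Dict, Mapping, Tuple


class LocalConfig:
    def __init__(self) -> None:
        self._write_lock = threading.Lock()  # serializes appliers only
        self._current: Tuple[int, Mapping[str, Any]] = (0, MappingProxyType({}))

    def apply(self, version: int, resolved: Dict[str, Any]) -> bool:
        """Install a full snapshot; ignore versions that are not newer."""
        frozen = MappingProxyType(dict(resolved))  # read-only private copy
        with self._write_lock:
            if version <= self._current[0]:
                return False
            self._current = (version, frozen)      # single reference swap
            return True

    def read(self) -> Tuple[int, Mapping[str, Any]]:
        """Hot path: no lock, returns one coherent (version, values) pair."""
        return self._current


cfg = LocalConfig()
cfg.apply(1, {"timeout_ms": 500, "retries": 3})
version, values = cfg.read()    # a request captures one snapshot...
cfg.apply(2, {"timeout_ms": 250, "retries": 5})
# ...and keeps seeing version 1 values until it reads again.
assert (version, values["timeout_ms"], values["retries"]) == (1, 500, 3)
```

A request that needs several keys should call `read()` once and use the returned mapping throughout, rather than calling `read()` per key.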

3. Published versus applied version #

Operators need both:

latest published version
latest actually applied version per consumer
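
A rollout view that keeps the two apart can be derived from the published version plus the per-consumer applied versions. This is a sketch with an assumed input shape (a dict from consumer id to applied version) and an assumed output layout.

```python
from collections import Counter
from typing import Any, Dict


def rollout_status(published_version: int, applied: Dict[str, int]) -> Dict[str, Any]:
    """Summarize how far the fleet is behind the published version."""
    lagging = sorted(c for c, v in applied.items() if v < published_version)
    return {
        "published_version": published_version,
        "applied_counts": dict(Counter(applied.values())),
        "lagging_consumers": lagging,
        # An empty fleet is not reported as fully rolled out.
        "fully_rolled_out": bool(applied) and not lagging,
    }


status = rollout_status(7, {"node-1": 7, "node-2": 7, "node-3": 6})
assert status["lagging_consumers"] == ["node-3"]
assert status["fully_rolled_out"] is False
```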

4. Rollback is a first-class lifecycle transition #

Rollback is not “stale propagation”; it is a deliberate new release-state transition.
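
One way to keep the two cases apart is to give every publication, including a rollback, a new and strictly increasing release sequence number, and let consumers order by that number instead of by config version. The `Release` record and `should_apply` helper below are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    release_seq: int          # always increases, even for a rollback
    config_version: int       # may point back at an older config version
    is_rollback: bool = False


def should_apply(current: Release, incoming: Release) -> bool:
    """Apply only releases that are newer in release order."""
    return incoming.release_seq > current.release_seq


current = Release(release_seq=5, config_version=12)
delayed_push = Release(release_seq=4, config_version=11)
explicit_rollback = Release(release_seq=6, config_version=11, is_rollback=True)

assert should_apply(current, delayed_push) is False      # stale propagation
assert should_apply(current, explicit_rollback) is True  # deliberate rollback
```

Both incoming releases carry config version 11, but only the one created by a release-state transition is accepted.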

Step 9 - Scale Adjustments #

| Hotspot | Type | First response |
|---|---|---|
| very large consumer fleets | fan-out hotspot | hierarchical config distribution or pull-after-notify |
| high config churn | contention hotspot | batch edits and incremental recompute by affected scope |
| large config snapshots | read/memory hotspot | shard config by service/scope and compress snapshots |
| rollout-status queries | read hotspot | serve from derived views over `ConsumerApplyState` |
| reconnect storms | contention hotspot | stagger re-syncs and use version-based delta fetch |
| mixed-version safety checks | control-plane hotspot | validate rollout invariants before publish and during progressive rollout |
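
For the reconnect-storm row, staggering can be done without coordination by deriving each consumer's delay from a hash of its id. This is a sketch of one possible scheme; the function name and the choice of SHA-256 are assumptions, and random jitter would serve equally well.

```python
import hashlib


def resync_delay_seconds(consumer_id: str, window_seconds: float) -> float:
    """Map a consumer id to a stable delay in [0, window_seconds]."""
    digest = hashlib.sha256(consumer_id.encode("utf-8")).digest()
    # Use the first 8 bytes as a fraction of the full 64-bit range.
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction * window_seconds


# Each consumer waits its own delay before re-syncing after a reconnect.
delays = {cid: resync_delay_seconds(cid, 30.0) for cid in ("node-1", "node-2", "node-3")}
```

A hash-derived delay is stable per consumer, which makes the spread reproducible when debugging; the trade-off is that the same consumers always land early or late in the window.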

What scales well #

This system scales by:

sharding config scopes
pushing/pulling versioned snapshots rather than source reads on every request
incrementally recomputing only affected effective config
separating status reporting from the hot serving path

What fails first #

Usually:

fleet-wide fanout storms
large snapshots
rapid config churn
rollout status reads hitting primary state

Canonical design conclusion #

The mechanical outcome is:

primary state:
- ConfigDefinition
- ConfigReleaseState
- EffectiveConfigState
- ConsumerConfigSnapshot
- ConsumerApplyState
- PartitionOwnership
- PartitionMap
critical invariants:
- one authoritative current config per scope
- deterministic effective config generation
- monotonic consumer snapshot application unless explicit rollback
- exclusive shard ownership for config truth
mechanisms:
- overwrite current value for config definitions
- guarded release/rollback transitions
- overwrite effective snapshots
- monotonic applied-version tracking
- fenced shard ownership
reads:
- hot path from local applied snapshots
- status and rollout views from derived projections

Polished interview answer #

I’d build the config system as a control-plane/data-plane service. The control plane owns authoritative config definitions and release lifecycle, computes a deterministic effective config for each scope, and publishes versioned snapshots to consumers. Serving nodes never fetch raw control-plane config on every request; they read local snapshots atomically and move forward monotonically by config version. Rollback is modeled as a first-class release-state transition, not as accidental stale propagation, and consumers report their applied versions back so rollout status is observable. The main scaling levers are sharding config scopes, hierarchical or delta-based distribution, incremental recompute, and keeping rollout dashboards off the serving hot path.

Concrete Substrate #

I’ll choose a control-plane/data-plane config system with authoritative config shards plus local consumer snapshots as the concrete baseline, because it matches the mechanics we derived:

current-value config and release state
derived effective snapshots
monotonic snapshot publication
one owner per shard

Concrete tech family:

control plane in Go or Java
authoritative state store:
- replicated DB or RocksDB-backed service state
metadata/control:
- etcd or internal metadata quorum for shard ownership/routing
distribution layer:
- watch stream, long poll, or push channel to consumers

Each shard owner stores:

current config definitions
current release state
current effective config per consumer scope
rollout/apply status from consumers

Consumers store:

local ConsumerConfigSnapshot
current applied version

Operation Layer #

1. Update config #

API

PutConfig(scope, config_doc, expected_version?)

Initiator

admin

Entry point

config API

Authoritative decider

shard owner for config scope

Precondition

config version matches if optimistic concurrency used

Transition

overwrite ConfigDefinition
trigger EffectiveConfigState recompute

2. Promote or roll back config #

API

UpdateReleaseState(scope, action, target_version, expected_release_version?)

Initiator

admin

Entry point

release API

Authoritative decider

shard owner for config scope

Precondition

current release state allows requested transition

Transition

guarded update of ConfigReleaseState
trigger snapshot propagation

3. Propagate snapshot #

API

internal push or pull-after-notify flow

Initiator

system

Entry point

control plane / consumer

Authoritative decider

snapshot publisher

Precondition

newer effective config version exists

Transition

overwrite ConsumerConfigSnapshot

4. Consumer apply and ack #

API

AckAppliedConfig(scope, version, consumer_id)

Initiator

consumer

Entry point

rollout status endpoint

Authoritative decider

shard owner for config scope

Precondition

version is applied locally

Transition

monotonic overwrite ConsumerApplyState

5. Consumer read on hot path #

API

internal local read

Initiator

consumer/service

Entry point

local process

Authoritative decider

local applied ConsumerConfigSnapshot

Precondition

snapshot loaded and valid

Transition

none

Entry Point vs Decider vs Responder #

| Path | Entry point | Authoritative decider | Physical responder | Logical responder |
|---|---|---|---|---|
| config update / release update | config API | config shard owner | API node | config system |
| snapshot propagation | control plane / consumer | snapshot publisher | control/data-plane node | config system |
| consumer local read | local process | local applied snapshot | local process | config system |
| apply ack | rollout endpoint | config shard owner | API node | config system |
| shard failover | follower / coordination layer | shard quorum / lease store | new leader / control plane | config system |

Concrete HLD #

Main components:

- config control-plane API
  - receives config and release updates
- config shard owners
  - authoritative owners for config truth and effective snapshot recompute
- distribution layer
  - pushes or serves versioned snapshots to consumers
- consumer fleet
  - reads local applied snapshots on hot path
- metadata/control service
  - tracks shard ownership and routing
- rollout status pipeline
  - serves config inventory and rollout status

