Distributed Counter Service Analysis Note
Table of Contents
Distributed Counter Service Analysis Note #
This note captures the full step-by-step analysis from the infra interview recipe for an exact distributed counter service.
Step 1 — Normalize #
Assume the baseline prompt is:
- design a distributed counter service
- clients can increment, decrement, and read counters
- counters should scale across nodes
- assume exact values unless told otherwise
Normalize into state-affecting paths.
| Requirement | Actor | Operation | State touched | Priority |
|---|---|---|---|---|
| Client increments counter | Client | state transition | S1update targetCounterState | C1 |
| Client decrements counter | Client | state transition | S1update targetCounterState | C1 |
| Client reads current counter value | Client | read source | S1read source targetCounterState | R1 |
| System routes counter key to shard owner | System | read source | S1read source targetPartitionMap | C1 |
| System reassigns shard ownership after node failure | System | state transition | S1update targetPartitionOwnership | C1 |
| System replicates committed counter updates | System | async process | S1hidden write targetReplicaState | C1 |
| Client reads usage/metrics dashboard | Client | read projection | S1read projection targetCounterStatsView | R2 |
Notes on normalization:
Important choices:
- increment/decrement are
state transition- because current counter state changes under arithmetic rules
- read current value is
read source - routing, ownership, and replication are explicit because it is distributed infra
Likely C1:
- increment
- decrement
- routing
- shard ownership
- replication
Likely R1:
- read current counter value
Step 2 — Critical Path Selection #
| Requirement | Priority class | Why |
|---|---|---|
| Increment counter | C1 | wrong count directly breaks product truth |
| Decrement counter | C1 | same as increment; current value must remain exact |
| Read current counter value | R1 | core serving path |
| Route counter key to shard owner | C1 | wrong routing can split updates and corrupt exactness |
| Reassign shard ownership after node failure | C1 | failover must preserve exact count and authority |
| Replicate committed counter updates | C1 | durability and replica convergence depend on it |
| Read dashboard/stats | R2 | secondary operational view |
Baseline critical paths:
Main C1 paths:
P1incrementP2decrementP3route to shard ownerP4reassign shard ownershipP5replicate committed updates
Main R1 path:
P6read counter value
For an exact distributed counter, the core issue is:
- one authoritative update order per counter key
- no lost increments/decrements
- correct failover and replica catch-up
So this is closer to:
- distributed KV / register-with-arithmetic transitions
than to:
- approximate telemetry or CRDT counters
Step 3 — Primary State Extraction #
For an exact distributed counter service, the minimal primary state is the current counter value plus ownership/routing/replica state.
| Candidate object label | Candidate source | Candidate needed for C1/R1? | Candidate decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
|---|---|---|---|---|---|---|---|---|---|
| CounterState | direct noun | Yes | keep as candidate | process | Yes | service | state machine | instance | counter_key |
| PartitionOwnership | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | shard_id |
| PartitionMap | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | collection | counter keyspace shards |
| ReplicaState | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | relation | shard_id + replica_id |
| CounterStatsView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | shard or cluster |
| MutationLog | hidden write target | No | reject as implementation choice | event | No | derived | append-only | collection | counter_key or shard |
Important modeling choices:
CounterState #
This is the main truth object.
Likely fields:
valueversion- maybe
updated_at
I’m marking it as a process because:
- increments/decrements are state transitions on current value
- if you prefer,
entitywith overwrite evolution is also defensible - the key point is that current value is authoritative
PartitionOwnership #
Needed because:
- one owner must serialize exact updates for counters in that shard
PartitionMap #
Needed because:
- exactness requires consistent routing of a counter key to one authority
ReplicaState #
Needed because:
- failover must know which replicas are caught up enough to promote
Minimal strict primary set:
CounterStatePartitionOwnershipPartitionMapReplicaState
Step 4 — Hard Invariants #
For an exact distributed counter service, the hard invariants are about one authoritative update order per counter, exact arithmetic transitions, and valid ownership/failover.
| Path | Tier | Type | Invariant template | Invariant statement |
|---|---|---|---|---|
P1write pathIncrement counter | HARD | eligibility | eligibility template | Action increment_counter is valid only if request is routed to current authoritative owner for counter_key and current counter state/version allows the transition at decision time. |
P1write pathIncrement counter | HARD | ordering | ordering template | Instances counter mutations are ordered by authoritative commit order within counter_key. |
P2write pathDecrement counter | HARD | eligibility | eligibility template | Action decrement_counter is valid only if request is routed to current authoritative owner for counter_key and current counter state/version allows the transition at decision time. |
P2write pathDecrement counter | HARD | ordering | ordering template | Instances counter mutations are ordered by authoritative commit order within counter_key. |
P3write pathRoute to shard owner | HARD | uniqueness | uniqueness template | Key shard_id maps to at most one logical outcome current authoritative owner within shard_id. |
P4write pathReassign shard ownership | HARD | eligibility | eligibility template | Action reassign_shard is valid only if current owner is failed or relinquished and candidate replica is eligible and sufficiently current on shard_id at decision time. |
P5write pathReplicate committed updates | HARD | accounting | accounting template | Replica state for shard_id equals authoritative committed counter mutation stream modulo bounded replication lag. |
P6read pathRead counter value | HARD or SOFT | freshness | freshness template | Read path read_counter reflects authoritative counter state within configured consistency bound. |
What matters most:
1. One authoritative order per counter key #
This is the core of exact counting. If two unsynchronized owners both accept increments for the same key, exactness breaks.
2. Exact arithmetic transitions #
No lost increment No duplicate decrement No stale overwrite of current value
3. Valid failover #
A stale replica must not become new owner if it would lose committed updates.
4. Read consistency choice #
If strong reads are required:
- reads must come from authoritative state If stale reads are allowed:
- follower reads become possible
Step 5 — Execution Context #
For the exact distributed counter baseline:
| Field | Value | Why |
|---|---|---|
| Topology | single service distributed | one logical counter service across many nodes |
| Write coordination scope | per object scope | correctness is per counter_key and per shard ownership scope |
| Read consistency target | strong only | safest baseline for exact counter reads |
| Holder model | node | shard ownership is held by nodes |
| Compensation acceptable? | No | lost or duplicated counter updates cannot be repaired cleanly after the fact in the exact model |
Derived implications:
holder_may_crash = true- shard owners can fail while serving updates
cross_service_write = false- baseline keeps counter state, routing, and replication inside one logical service
bounded_staleness_allowed = false- exact baseline disallows stale reads on the main read path
cross_service_atomicity_required = false- no multi-service transaction is needed in the baseline
exclusive_claim_required = true- shard ownership must be exclusive
guarded_by_current_state = true- arithmetic transitions and failover depend on current state/version
This pushes us toward:
- one authoritative writer per shard
- ordered mutation stream for exact updates
- lease-backed shard ownership
- strong reads from authoritative source by default
So the likely design is very similar to a leader-based partitioned KV store, except the value semantics are arithmetic rather than arbitrary overwrite.
Step 6 — Deterministic Mechanism Selection #
6A. Write Shape #
| Path | Why | Write shape |
|---|---|---|
P1 increment counter | valid only against current counter state/version | guarded state transition |
P2 decrement counter | same as increment; current state matters | guarded state transition |
P3 route to shard owner | one current authoritative owner per shard | exclusive claim |
P4 reassign shard ownership | valid only if current owner failed/relinquished and candidate is current enough | guarded state transition |
P5 replicate committed updates | followers consume authoritative mutation stream | append-only event |
6B. Base Mechanism #
| Path | Write shape | Base mechanism | Required companions |
|---|---|---|---|
P1 increment counter | guarded state transition | single writer per partition | version, commit index |
P2 decrement counter | guarded state transition | single writer per partition | version, commit index |
P3 route to shard owner | exclusive claim | lease | fencing token, heartbeat |
P4 reassign shard ownership | guarded state transition | CAS on (state, version) | fencing token, replica catch-up check |
P5 replicate committed updates | append-only event | append log | replica progress / ack index |
Why these fit:
Increment / decrement #
Even though arithmetic feels special, for an exact distributed system the simplest safe realization is:
- one authoritative shard owner serializes updates for keys in that shard
So the hot path is:
- single writer per partition
- not multi-writer merge
Routing / ownership #
Still classic exclusive-claim semantics:
- one shard owner at a time
Replication #
Exact counter updates are best replicated as an ordered mutation stream.
Canonical substrate implied:
- partitioned leader-based counter service
- one leader per shard
- updates appended to shard mutation log
- replicas follow the committed stream
- reads from leader for strong exact value
Step 7 — Read Model / Source of Truth #
For an exact distributed counter service, truth is mostly direct source state plus replica progress.
| Concept | Truth | Read path | Rebuild path |
|---|---|---|---|
C1source conceptCurrent counter value | CounterState on authoritative shard owner | read source directly | authoritative shard state / replay from committed mutation log |
C2source conceptShard ownership | PartitionOwnership | read source directly | authoritative ownership store |
C3source conceptShard routing map | PartitionMap | read source directly | authoritative routing metadata |
C4status conceptReplica catch-up state | ReplicaState | read source directly | recompute from commit index and replica progress |
C5projection conceptCounter analytics / dashboard | derived from counter mutations or snapshots | materialized view | recompute from snapshots/logs |
Important point:
For the exact baseline:
- reads of current counter value should come from authoritative source state
- ownership and routing metadata should be authoritative
- operational/analytics views are projections only
Optional variant:
- if interviewer later allows stale reads:
- follower reads become possible
- but source-of-truth is still authoritative shard state
Step 8 — Failure Handling #
| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
|---|---|---|---|---|---|
P1 increment counter | retry must be idempotent by request id or may double-increment | shard owner serializes competing updates for same key | committed increment survives leader crash if replicated past commit point | replica lag acceptable behind commit point | stale leader blocked by fencing token |
P2 decrement counter | same as increment; retries need idempotency or explicit semantics | shard owner serializes competing updates | committed decrement survives crash if replicated past commit point | replica lag acceptable behind commit point | stale leader blocked by fencing token |
P3 route to shard owner | retry after refreshing shard map | only one valid owner should exist | if ownership changed, refreshed map points to new owner | n/a | stale owner rejected by epoch/fencing |
P4 reassign shard ownership | retry failover transition safely | only one reassignment wins | promoted owner crash triggers later reassignment | n/a | old owner fenced and must not serve updates |
P5 replicate committed updates | replica retries from last known index | followers independently catch up from committed stream | leader crash stops new replication until failover; committed log remains truth | missing replica append retried from commit index | stale replica cannot become owner unless caught up |
What matters most:
1. Idempotency for increment/decrement retries #
This is the key difference from ordinary overwrite systems.
Bad case:
- client sends
Increment - leader commits but response is lost
- client retries
- same logical operation increments twice
So for an interview-grade exact counter:
- you should call out
request_idor idempotency token support for mutation APIs
2. Stale leader protection #
Same as the KV store:
- old owner must not continue accepting counter updates after failover
3. Failover only to sufficiently current replica #
Otherwise exact value can go backward or lose committed updates.
Failure summary:
Exact distributed counters stay correct if:
- one shard owner serializes updates
- mutation retries are idempotent
- committed updates survive failover
- stale owners are fenced
Step 9 — Scale Adjustments #
| Hotspot | Type | First response |
|---|---|---|
| hot counter key | contention hotspot | isolate that key, shard it specially, or move to partial-counter variant if exact single-owner becomes bottleneck |
| hot shard leader | write throughput hotspot | increase shard count and rebalance keys across shards |
| replication lag | write throughput hotspot | batch mutation log replication, tune follower catch-up, and separate hot shards |
| strong-read pressure on leaders | read hotspot | add follower reads only if relaxed consistency is acceptable, otherwise add more shards |
| failover churn / leader movement | contention hotspot | stabilize leases/elections and avoid aggressive reassignment |
| analytics/dashboard reads | read hotspot | move to materialized views from snapshots/logs |
What scales well:
The exact baseline scales by:
- partitioning counter keys across shards
- keeping one authoritative writer per shard
- replicating per shard independently
So throughput grows mainly with:
- number of shards
- number of healthy leaders
- efficiency of log replication
What fails first:
Usually:
- one very hot counter
- one very hot shard
- retry/idempotency overhead on mutation path
- leader bottleneck for strong reads and writes
Canonical design conclusion:
The mechanical outcome is:
- primary state:
CounterStatePartitionOwnershipPartitionMapReplicaState
- critical invariants:
- one authoritative update order per counter key
- no lost or double-applied increments/decrements
- failover only to sufficiently current replica
- mechanisms:
single writer per partitionappend loglease- idempotency key for mutation retries
- reads:
- direct authoritative reads for exact values
- projections only for dashboards/analytics
Polished interview answer:
“I’d design the distributed counter service as a partitioned leader-based system. Each counter key maps to one shard, and that shard leader is the only authority allowed to serialize increments and decrements for that key. Mutations are appended to a replicated per-shard log and applied to current CounterState after commit, which gives exact arithmetic and safe failover. Reads for exact values go to the leader, and mutation APIs carry idempotency keys so retries don’t double-apply. The main scaling levers are more shards, hot-key isolation, and efficient replication.”
Canonical Interview Answer #
For a generic distributed counter service interview question, the canonical answer is usually:
- exact counter
- one authoritative owner per counter key or shard
- increments/decrements serialized by that owner
- replication via ordered mutation log
- strong reads from the owner
- idempotency key for retry-safe increment/decrement
So the canonical shape is:
- archetype:
Key-Scoped Mutable State - write path:
- increment/decrement ->
guarded state transition
- increment/decrement ->
- mechanism:
single writer per partitionappend logfor replicationleasefor shard ownership
A compact HLD answer:
- hash
counter_keyto a shard - shard leader is the only writer for counters in that shard
- leader appends increments/decrements to a replicated log
- committed updates are applied to
CounterState - reads go to leader for exact value
- failover promotes only a sufficiently caught-up replica
- client mutations carry
request_idto avoid double-apply on retry
Why this is canonical:
- simplest exact semantics
- no scatter-gather on reads
- no lost updates under concurrency
- clean failover story
If the interviewer pushes for very hot counters or higher availability, then mention the main variant:
- shard one logical counter into partial counters
- local increments on multiple owners
- read by summing partials with
scatter-gather - this is higher-throughput but no longer the simplest exact baseline
Scatter-Gather Clarification #
Scatter-gather is not needed in the exact single-owner baseline.
In this design, each counter_key has one authoritative shard owner, so a read for one counter is:
- route to shard
- read from that owner
No scatter-gather needed.
Scatter-gather appears only in different variants.
When you do not need scatter-gather #
- one key maps to one shard
- one shard has one authoritative owner
- read one counter value
That is the current model.
When you would need scatter-gather #
- a logical counter is sharded across many nodes
- example: hot counter split into 100 partial counters
- read requires summing partials from multiple shards
- or you allow local increments on many replicas and merge on read
Then the read path becomes:
scatter-gather
Examples:
- approximate high-throughput counters
- CRDT G-counter / PN-counter style systems
- intentionally sharded hot counters
So the distinction is:
exact single-owner counter
- no scatter-gather
- direct source read
distributed partial counter
- yes, scatter-gather or aggregation read
- often eventual or merge-based