Table of Contents

Distributed Rate Limiter Analysis Note #

This note captures the full step-by-step analysis from the infra interview recipe for a distributed rate limiter, including the concrete request-flow layer and the comparison between fixed-window and token-bucket limiter state.

Step 1 — Normalize #

Assume the baseline prompt is:

design a distributed rate limiter
clients/services call Allow(key)
enforce limits per key, for example user/IP/API key
scale across nodes
remain correct under concurrency

Normalize into state-affecting paths.

Requirement	Actor	Operation	State touched	Priority
Client checks whether request is allowed	Client	state transition	`S1` `update target` `RateLimitState`	C1
Client reads current limit policy	Client	read source	`S1` `read source target` `LimitPolicy`	R1
Admin updates limit policy	Admin	overwrite state	`S1` `update target` `LimitPolicy`	C1
System expires old window/bucket state	System	async process	`S1` `hidden write target` `RateLimitState`	C1
System routes key to rate-limit owner/shard	System	read source	`S1` `read source target` `PartitionMap`	C1
System reassigns shard/ownership after node failure	System	state transition	`S1` `update target` `PartitionOwnership`	C1
Client reads usage/status dashboard	Client	read projection	`S1` `read projection target` `UsageView`	R2

Notes on normalization:

Allow(key) is a state transition
- because it changes current limiter state:
  - counter
  - token balance
  - window/bucket cursor
policy update is overwrite state
expiration/cleanup is async process
shard routing and ownership are explicit because this is distributed infra

Likely C1:

allow/deny decision
policy updates
shard routing
ownership reassignment
state expiry/rotation

Likely R1:

read current policy

Step 2 — Critical Path Selection #

Requirement	Priority class	Why
Check whether request is allowed	C1	core correctness path; wrong decision breaks rate-limit contract
Read current limit policy	R1	important serving path for evaluation
Update limit policy	C1	wrong policy state changes all future decisions
Expire old window/bucket state	C1	stale buckets can incorrectly allow or deny
Route key to owner/shard	C1	wrong routing can split counters and break enforcement
Reassign shard after node failure	C1	failover must preserve correct ownership and state continuity
Read usage/status dashboard	R2	operational only

Baseline critical paths:

Main C1 paths:

P1 allow/deny decision
P2 update limit policy
P3 expire/rotate limiter state
P4 route key to shard owner
P5 reassign shard ownership

Main R1 path:

P6 read policy

This system is not primarily about dashboards or analytics. The correctness center is:

one authoritative limiter state per key/scope
one authoritative policy per key/scope
correct state transitions under concurrency
correct routing/ownership in a distributed deployment

Step 3 — Primary State Extraction #

For a distributed rate limiter, the minimal primary state is the current limiter state, policy state, and ownership/routing state.

Candidate object label	Candidate source	Candidate needed for C1/R1?	Candidate decomposition action	Class	Primary?	Owner	Evolution	Scope kind	Scope value
RateLimitState	hidden write target	Yes	keep as candidate	process	Yes	service	state machine	instance	limit_key
LimitPolicy	direct noun	Yes	keep as candidate	entity	Yes	service	overwrite	instance	limit_key or policy_scope
PartitionOwnership	hidden write target	Yes	keep as candidate	process	Yes	service	state machine	instance	shard_id
PartitionMap	hidden write target	Yes	keep as candidate	entity	Yes	service	overwrite	collection	keyspace shards
UsageView	derived read model	No	reject as UI artifact	projection	No	derived	overwrite	collection	policy_scope
CleanupCursor	hidden write target	No	split candidate	process	No	service	overwrite	collection	shard_id
RequestLog	hidden write target	No	reject as implementation choice	event	No	derived	append-only	collection	limit_key

Important modeling choices:

`RateLimitState` #

This is the main correctness object.

Depending on algorithm, it may contain:

fixed window:
- current window id
- count
sliding window:
- bucket counts / rolling cursor
token bucket:
- token count
- last refill timestamp

So this is naturally a process object with state transitions.

`LimitPolicy` #

Needed because:

allowed rate/period/burst is authoritative configuration
updates change future decisions

`PartitionOwnership` #

Needed because:

distributed correctness requires one authoritative owner per shard or scope

`PartitionMap` #

Needed because:

routing must consistently send the same key to the same shard/owner

Minimal strict primary set:

RateLimitState
LimitPolicy
PartitionOwnership
PartitionMap

Step 4 — Hard Invariants #

For a distributed rate limiter, the hard invariants are about one authoritative limiter state per key, correct state transition under policy, and exclusive shard ownership.

Path	Tier	Type	Invariant template	Invariant statement
`P1` `write path` `Allow/deny request`	HARD	eligibility	eligibility template	Action `allow_request` is valid only if `request is evaluated against current authoritative RateLimitState and current LimitPolicy for limit_key` at decision time.
`P1` `write path` `Allow/deny request`	HARD	ordering	ordering template	Instances `rate-limit state transitions for limit_key` are ordered by `authoritative evaluation order` within `limit_key`.
`P2` `write path` `Update policy`	HARD	ordering	ordering template	Instances `policy revisions` are ordered by `monotonic policy version` within `policy_scope`.
`P3` `write path` `Expire/rotate limiter state`	HARD	eligibility	eligibility template	Action `expire_or_rotate_state` is valid only if `current bucket/window/token state is stale relative to current time and current algorithm rules` on `limit_key` at decision time.
`P4` `write path` `Route to shard owner`	HARD	uniqueness	uniqueness template	Key `shard_id` maps to at most one logical outcome `current authoritative owner` within `shard_id`.
`P5` `write path` `Reassign shard ownership`	HARD	eligibility	eligibility template	Action `reassign_shard` is valid only if `current owner is failed or relinquished and new owner is eligible and sufficiently current` on `shard_id` at decision time.
`P6` `read path` `Read policy`	HARD	freshness	freshness template	Read path `policy lookup` reflects authoritative policy state within `configured policy consistency bound`.

What matters most:

1. One authoritative limiter state per key #

You cannot safely evaluate the same limit_key concurrently on multiple unsynchronized owners if you want strict enforcement.

2. Ordered evaluation per key #

For a given key, the limiter state must evolve in one authoritative order.

3. Policy/version correctness #

A request should be evaluated against the intended policy revision.

4. Exclusive ownership per shard #

This avoids split-brain counters or token balances.

Step 5 — Execution Context #

For the strict baseline distributed rate limiter:

Field	Value	Why
Topology	single service distributed	one logical rate-limit service spread across nodes
Write coordination scope	per object scope	correctness is per `limit_key` and per shard ownership scope
Read consistency target	strong only	safest baseline so `Allow` decisions use authoritative state and policy
Holder model	node	shard ownership is held by nodes
Compensation acceptable?	No	a wrongly allowed request cannot be undone for correctness purposes

Derived implications:

holder_may_crash = true
- nodes can fail while owning shards
cross_service_write = false
- baseline keeps policy, state, and ownership within one logical service
bounded_staleness_allowed = false
- strict baseline disallows stale policy/state on the hot decision path
cross_service_atomicity_required = false
- no multi-service transaction required in baseline
exclusive_claim_required = true
- shard ownership must be exclusive
guarded_by_current_state = true
- allow/deny and cleanup depend on current limiter state and time

This pushes us toward:

one authoritative writer per shard or per key partition
local ordered state transitions for RateLimitState
policy reads from authoritative or tightly synchronized state
lease-backed ownership for shards

Step 6 — Deterministic Mechanism Selection #

6A. Write Shape #

Path	Why	Write shape
`P1` allow/deny request	decision is valid only against current limiter state and policy	`guarded state transition`
`P2` update policy	replace current policy revision	`overwrite current value`
`P3` expire/rotate limiter state	valid only when time/state predicate says current bucket is stale	`guarded state transition`
`P4` route to shard owner	one current authoritative owner per shard	`exclusive claim`
`P5` reassign shard ownership	valid only if current owner failed/relinquished and candidate is eligible	`guarded state transition`

6B. Base Mechanism #

Path	Write shape	Base mechanism	Required companions
`P1` allow/deny request	`guarded state transition`	`single writer per partition`	monotonic time policy, policy version
`P2` update policy	`overwrite current value`	`CAS on version`	policy version
`P3` expire/rotate limiter state	`guarded state transition`	leader-applied guarded transition	monotonic time policy
`P4` route to shard owner	`exclusive claim`	`lease`	fencing token, heartbeat
`P5` reassign shard ownership	`guarded state transition`	`CAS on (state, version)`	fencing token, shard catch-up check

Why these fit:

Allow/deny #

If the system is strict, the key insight is:

do not evaluate one limit_key on multiple concurrent unsynchronized writers
send that key to one authoritative shard owner
that owner applies the limiter state machine in order

So the mechanism is best thought of as:

single writer per partition for the hot path

Policy update #

Policy is current configuration:

versioned overwrite is enough

Expire/rotate state #

This is time- and state-dependent:

cleanup/refill/window-rotation is a guarded transition on current state

Shard ownership #

Classic exclusive claim:

only one current owner should enforce keys in that shard

Reassignment #

Promotion/failover is valid only under guarded ownership conditions.

Canonical substrate implied:

partitioned rate limiter service
one leader/owner per shard
each owner evaluates and mutates limiter state for keys in that shard
policy is versioned config
shard ownership is lease-backed

Step 7 — Read Model / Source of Truth #

For a strict distributed rate limiter, truth is mostly direct source state.

Concept	Truth	Read path	Rebuild path
`C1` `source concept` `Current limiter state for key`	`RateLimitState` on authoritative shard owner	`read source directly`	authoritative shard state / replay from mutation log if used
`C2` `source concept` `Current policy`	`LimitPolicy`	`read source directly`	authoritative policy store
`C3` `source concept` `Shard ownership`	`PartitionOwnership`	`read source directly`	authoritative ownership store
`C4` `source concept` `Shard routing map`	`PartitionMap`	`read source directly`	authoritative routing metadata
`C5` `projection concept` `Usage / dashboard view`	derived from state transitions	`materialized view`	recompute from state snapshots and logs

Important point:

For strict enforcement:

Allow(key) should read and update authoritative limiter state on the current shard owner
policy lookup should be authoritative or very tightly synchronized
ownership/routing must be authoritative

Only secondary views like:

usage dashboards
analytics
top abusers
per-tenant reports

should be projections.

Optional variation:

If the interviewer allows approximate or eventually consistent limiting:

policy might still remain authoritative
but limiter state may be partially local/cached/merged

That would be a different design branch.

For the strict baseline, source-of-truth reads dominate.

Step 8 — Failure Handling #

Path	Retry	Competing writers	Crash after commit	Publish failure	Stale holder
`P1` allow/deny request	client retries may produce a new evaluation unless request-id semantics are added	shard owner serializes competing requests for same key/scope	committed limiter-state transition survives if durably replicated/logged; otherwise request may need fail-safe deny policy	n/a	stale shard owner fenced by lease epoch
`P2` update policy	retry with version precondition	stale policy write loses CAS	committed policy survives crash if persisted	policy propagation may lag to dashboards, not to authoritative reads	n/a
`P3` expire/rotate limiter state	cleanup retry is safe if current time/state predicate still holds	duplicate cleanup transitions collapse under guarded state check	crash delays cleanup, next run retries	n/a	stale owner cannot apply rotation if fenced
`P4` route to shard owner	retry after refreshing partition map	only one valid owner should exist	if ownership changed, refreshed map points to new owner	n/a	stale owner rejected by fencing token
`P5` reassign shard ownership	retry failover transition safely	only one reassignment wins current ownership state	promoted owner crash triggers later reassignment	n/a	old owner must stop serving after fencing

What matters most:

1. Stale owner protection #

This is the key safety issue.

Bad case:

old shard owner still accepts requests
new owner also accepts requests
same key gets evaluated on both sides
limit contract breaks

So:

ownership must be lease-backed
stale owners must be fenced by epoch/token

2. Retry semantics on `Allow` #

If a client retries Allow(key):

the system may evaluate it twice unless request-id semantics are explicit
whether that is acceptable depends on API contract

Strict baseline options:

either document that Allow is not idempotent by default
or add request-id dedup if interviewer asks

3. Crash after decision #

If the node decides allow and crashes before durable state is preserved:

you risk undercounting
a safe design may prefer:
- durable commit before positive allow
- or fail-closed on uncertainty

This is an important tradeoff to say out loud.

Failure summary:

The rate limiter stays correct if:

one shard owner evaluates each key
policy and limiter state are authoritative on that owner
stale owners are fenced
retries and crash behavior are explicitly defined

Step 9 — Scale Adjustments #

Hotspot	Type	First response
hot keys or tenants	contention hotspot	isolate hot keys, increase shard count, or special-case dedicated partitions
shard-owner CPU saturation	write throughput hotspot	add more shards and rebalance ownership
policy lookup overhead	read hotspot	cache policy per shard owner with authoritative version checks
cleanup / refill scans	write throughput hotspot	bucketize expiry/refill state and process incrementally
ownership churn during failures	contention hotspot	stabilize leases and avoid over-eager reassignment
dashboard/reporting load	read hotspot	move usage/reporting to derived views only

What scales well:

A strict distributed rate limiter scales if:

keys are partitioned well
each key has one authoritative shard owner
the hot path stays local to the shard owner
policy is cached with version discipline

Throughput grows with:

number of shards
balance of hot keys across shards
efficiency of local limiter-state evaluation

What fails first if designed badly:

one global counter service
no partitioning by key
multiple owners evaluating the same key concurrently
policy fetched remotely on every request
full scans for expiry/rotation on every decision

Canonical design conclusion:

primary state:
- RateLimitState
- LimitPolicy
- PartitionOwnership
- PartitionMap
critical invariants:
- one authoritative limiter state per key
- one authoritative owner per shard
- ordered state transitions per key
- policy version correctness
mechanisms:
- single writer per partition for the hot Allow path
- CAS on version for policy updates
- lease for shard ownership
- guarded transitions for refill/rotation/failover
reads:
- direct source reads for policy/state on the hot path
- projections only for dashboards

Polished interview answer:

“I’d design the distributed rate limiter as a partitioned service where each key is routed to one authoritative shard owner. That owner holds the current limiter state for the key and applies allow/deny transitions in a single ordered sequence, which avoids split counters under concurrency. Policy is versioned configuration, shard ownership is lease-backed and fenced, and failover is a guarded ownership transition. The hot path is local to the shard owner, while dashboards and usage views are derived projections. The main scaling levers are more shards, hot-key isolation, and keeping policy lookups versioned but local to the owner.”

Concrete Substrate #

I’ll choose a service-owned shard-leader rate limiter rather than “just Redis,” because it matches the mechanics we derived:

single writer per partition
lease for shard ownership
guarded state transitions for Allow
versioned policy

Concrete substrate:

clients call a rate limiter service
keys are hashed to shards
each shard has one leader/owner
that leader keeps RateLimitState for its keys in memory for low latency
state is durably replicated or logged depending on strictness requirement
policy is stored in a small config store and cached on shard leaders
shard ownership is tracked by a small coordination layer

Concrete tech family:

service in Go or Java
shard ownership via etcd leases or built-in Raft metadata
local state in memory
optional durability/replication:
- local WAL or append log
- or replicated state machine per shard if strict crash correctness is required
policy store:
- etcd / config DB / internal metadata store

This is more infra-native than “Redis INCR,” while still concrete.

Operation Layer #

1. Allow / deny request #

API

Allow(limit_key, request_id?, now?)

Initiator

client/service

Entry point

gateway or any rate-limiter node

Authoritative decider

current shard leader for shard(limit_key)

Precondition

request routed to current shard owner
current LimitPolicy version available
limiter state valid for limit_key

Transition Depends on algorithm, but in general:

read LimitPolicy
read current RateLimitState
rotate/refill if needed
if budget available:
- decrement token / increment counter / consume slot
- return allowed
else:
- leave state at denied boundary or update denial metadata
- return denied

Response

{allowed, remaining, reset_at, policy_version}

Failure cases

stale routing -> retry with updated shard map
shard leader unavailable -> retry or fail closed/open based on product policy
duplicate retries may double-consume unless request-id dedup is added

Sequence

client sends Allow(limit_key)
entry node computes shard
request forwarded to shard leader
leader loads cached policy for that key scope
leader loads in-memory RateLimitState
leader applies refill/window-rotation if needed
leader evaluates allow/deny
leader updates state
leader replies with decision

2. Read policy #

API

GetPolicy(policy_scope)

Initiator

client/admin/internal shard leader

Entry point

any node or config endpoint

Authoritative decider

policy/config store

Precondition

none

Transition

none

Response

{limit, window, burst, version}

Failure cases

stale cache must be rejected or refreshed if version mismatch detected

Sequence

caller requests policy
config service serves current policy version
shard leader caches it with version

3. Update policy #

API

PutPolicy(policy_scope, config, expected_version?)

Initiator

admin

Entry point

admin/config endpoint

Authoritative decider

policy/config store leader

Precondition

if optimistic update, expected version matches

Transition

overwrite LimitPolicy
bump version
trigger invalidation/refresh to shard leaders

Response

{policy_version}

Failure cases

version conflict
propagation lag to non-authoritative caches

Sequence

admin sends policy update
config leader validates version
commits new policy
policy version increments
shard leaders refresh or receive invalidation

4. Expire / rotate limiter state #

API

internal, usually not exposed
RotateOrRefill(limit_key, now) or batched sweep

Initiator

system / shard leader

Entry point

shard leader

Authoritative decider

shard leader

Precondition

current time and state indicate refill/window boundary crossed

Transition

refill tokens
advance window bucket
clear stale bucket state

Response

internal success

Failure cases

duplicate rotation attempts collapse under current-state checks

Sequence

leader handles request or periodic sweep
compares now with current state timestamps
applies refill/window advancement
continues with allow decision or cleanup

5. Route key to shard owner #

API

internal lookup:
- ResolveShard(limit_key)

Initiator

entry node / gateway

Entry point

local routing layer

Authoritative decider

authoritative PartitionMap

Precondition

partition map reasonably current

Transition

none

Response

{shard_id, leader_node}

Failure cases

stale map leads to redirect/retry

Sequence

entry node hashes key
looks up shard in partition map
forwards request to current leader

6. Reassign shard ownership #

API

internal failover flow:
- AcquireShardLease(shard_id)
- or leader election RPCs

Initiator

system

Entry point

coordination layer / candidate shard node

Authoritative decider

lease/coordination store or shard quorum

Precondition

current owner failed/relinquished
new owner eligible and current enough

Transition

PartitionOwnership(shard_id) moves to new leader with new epoch
old owner fenced

Response

updated shard owner in routing metadata

Failure cases

split ownership attempt
stale previous owner resumes and must be rejected by epoch

Sequence

leader fails or loses lease
candidate acquires shard lease
routing metadata updates with new leader/epoch
new leader serves requests
stale old leader is fenced

Entry Point vs Decider vs Responder #

Path	Entry point	Authoritative decider	Physical responder	Logical responder
`Allow`	gateway / any node	shard leader	shard leader or front node	rate limiter service
`GetPolicy`	config endpoint / any node	policy store	config node	rate limiter service
`PutPolicy`	admin endpoint	policy store leader	config leader	rate limiter service
`ResolveShard`	entry node	partition map source	entry node	rate limiter service
shard failover	candidate node / coordination layer	lease store or shard quorum	new owner / coordination layer	rate limiter service

Concrete HLD #

Main components:

client gateway / router
- hashes limit_key -> shard
- forwards to current shard leader
shard leader
- authoritative evaluator for keys in that shard
- holds RateLimitState in memory
- applies allow/deny transitions
shard followers / replicas (if strict durability required)
- receive replicated state/log
policy/config service
- stores LimitPolicy
- versions policy updates
coordination layer
- stores shard ownership lease
- fences stale leaders

Concrete Technology Realizations #

Stronger infra-native answer #

Go or Java rate limiter service
etcd for shard leases and routing metadata
in-memory per-shard limiter state
policy in etcd or internal config DB
optional WAL / replicated log for crash-safe strictness

Simpler pragmatic answer #

Redis Cluster
shard by key
use Lua script on authoritative Redis shard for atomic allow/deny evaluation
policy in Redis hash or config service
still need external shard ownership / routing discipline if you truly want distributed failover semantics

That second answer is practical, but the first one is better aligned with the mechanical derivation.

Short interview version:

“I’d implement the strict distributed rate limiter as a sharded service. Each key hashes to one shard leader, and that leader is the only authority allowed to evaluate and mutate limiter state for that key. Policy is versioned config stored separately, shard ownership is lease-backed with fencing, and the hot path runs against in-memory state on the shard leader for low latency. If stronger crash safety is required, I’d add a per-shard replicated log or WAL before returning allow decisions.”

Why There Is No Mutation Log By Default #

A rate limiter often does not need an authoritative append-only history of every decision.

The main truth is usually:

current limiter state for a key
not the full sequence of all past allows/denies

So mechanically, this system is centered on:

RateLimitState
LimitPolicy

not on an append-only event stream.

Why no mutation log by default #

A mutation log is useful when you need one or more of:

durable replay as primary recovery mechanism
audit trail
downstream consumers of every event
replication via ordered event shipping
exact historical reconstruction

For a basic distributed rate limiter, you often do not need all that. You only need the current state to answer:

allow or deny now?

So the baseline can be:

in-memory authoritative state per shard
maybe periodic checkpointing
maybe replication of current state
no full decision log

When you would add a mutation log #

You would add one if the prompt or requirements demand:

strict crash recovery with no undercount after node failure
exact replay/reconstruction of limiter state
durable replicated state-machine semantics
auditability of rate-limit decisions
analytics derived from exact decision stream

Then the design shifts toward:

append log of ALLOW_CONSUME, DENY, REFILL, ROTATE
rebuild RateLimitState from the log or snapshots

Replication without log-first modeling #

Replication can still exist. It is just not necessarily log-based replication in the baseline answer.

Three levels:

No replication
- acceptable only for a weak/best-effort limiter
- simple, but loses state on node failure
State replication
- replicate current RateLimitState or shard snapshots to a standby/follower
- enough for many practical limiters
Log replication
- replicate every limiter mutation in order
- strongest, but more expensive

So:

replication? often yes
mutation log as primary modeling object? not always necessary unless the prompt pushes for strong durability/replay semantics

Token Bucket vs Fixed Window #

This mostly changes the shape of RateLimitState and the exact allow_request transition. The rest of the distributed architecture can stay similar.

Fixed Window #

`RateLimitState` #

For key K:

window_id
count

Example:

100 requests per minute
window_id = floor(now / 60s)
count = 37

`Allow` transition #

compute current window_id
if window_id != stored window_id
- reset state to new window
- count = 0
if count < limit
- increment count
- allow
else
- deny

Pros #

very simple
cheap state
easy to shard

Cons #

boundary burst problem
user can consume near end of one window and start of next

Write-shape interpretation #

Still:

guarded state transition because allow is valid only if count < limit in current window

Sliding Window Log #

`RateLimitState` #

For key K:

timestamps of recent requests
or bucketed counts across rolling subwindows

Example:

deque/ring of timestamps
or 12 buckets for last 60s

`Allow` transition #

drop entries outside rolling window
count recent requests
if count < limit
- add current request
- allow
else
- deny

Pros #

more accurate than fixed window

Cons #

larger state
more expensive cleanup

Write-shape #

Still:

guarded state transition

Token Bucket #

`RateLimitState` #

For key K:

tokens
last_refill_ts

Example:

capacity = 100
refill_rate = 10 tokens/sec

`Allow` transition #

compute elapsed time since last_refill_ts
refill:
- tokens = min(capacity, tokens + elapsed * refill_rate)
set last_refill_ts = now
if tokens >= cost
- tokens -= cost
- allow
else
- deny

Pros #

smooths traffic
supports burst + steady refill naturally
common practical choice

Cons #

needs careful time handling
clock monotonicity matters more

Write-shape #

Still:

guarded state transition

The guard is:

refilled_tokens >= request_cost

Leaky Bucket #

`RateLimitState` #

queue depth or backlog
last drain timestamp

This is similar to token bucket but models drainage instead of refill.

Main Difference In The Framework #

The distributed system structure mostly stays the same:

shard key to one owner
authoritative RateLimitState
versioned LimitPolicy
lease-backed shard ownership

What changes is:

exact fields in RateLimitState
exact guard in allow_request
cleanup/rotation behavior

Comparison #

Algorithm	State	Guard for allow	Cleanup behavior
Fixed window	`window_id, count`	`count < limit`	reset on new window
Sliding window	timestamps or rolling buckets	`recent_count < limit`	evict old entries/buckets
Token bucket	`tokens, last_refill_ts`	`refilled_tokens >= cost`	refill on read/write
Leaky bucket	backlog, last_drain_ts	backlog below threshold	drain over time

Interview guidance:

If prompt does not specify, I’d usually choose:

token bucket if:
want burst support
smoother behavior
realistic API/infra answer

Choose:

fixed window if:
interviewer wants simplest baseline
focus is distributed ownership/routing, not algorithm nuance

A short interview line:

“The distributed architecture doesn’t change much across algorithms; what changes is the state machine stored in RateLimitState. Fixed window stores (window_id, count), while token bucket stores (tokens, last_refill_ts). In both cases, Allow is still a guarded state transition applied by the authoritative shard owner.”

Distributed Rate Limiter Analysis Note #

Step 1 — Normalize #

Step 2 — Critical Path Selection #

Step 3 — Primary State Extraction #

RateLimitState #

LimitPolicy #

PartitionOwnership #

PartitionMap #

Step 4 — Hard Invariants #

1. One authoritative limiter state per key #

2. Ordered evaluation per key #

3. Policy/version correctness #

4. Exclusive ownership per shard #

Step 5 — Execution Context #

Step 6 — Deterministic Mechanism Selection #

6A. Write Shape #

6B. Base Mechanism #

Allow/deny #

Policy update #

Expire/rotate state #

Shard ownership #

Reassignment #

Step 7 — Read Model / Source of Truth #

Step 8 — Failure Handling #

1. Stale owner protection #

2. Retry semantics on Allow #

3. Crash after decision #

Step 9 — Scale Adjustments #

Concrete Substrate #

Operation Layer #

1. Allow / deny request #

2. Read policy #

3. Update policy #

4. Expire / rotate limiter state #

5. Route key to shard owner #

6. Reassign shard ownership #

Entry Point vs Decider vs Responder #

Concrete HLD #

Concrete Technology Realizations #

Stronger infra-native answer #

Simpler pragmatic answer #

Why There Is No Mutation Log By Default #

Why no mutation log by default #

When you would add a mutation log #

Replication without log-first modeling #

Token Bucket vs Fixed Window #

Fixed Window #

RateLimitState #

Allow transition #

Pros #

Cons #

Write-shape interpretation #

Sliding Window Log #

RateLimitState #

Allow transition #

Pros #

Cons #

Write-shape #

Token Bucket #

RateLimitState #

Allow transition #

Pros #

Cons #

Write-shape #

Leaky Bucket #

RateLimitState #

Main Difference In The Framework #

Comparison #

`RateLimitState` #

`LimitPolicy` #

`PartitionOwnership` #

`PartitionMap` #

2. Retry semantics on `Allow` #

`RateLimitState` #

`Allow` transition #

`RateLimitState` #

`Allow` transition #

`RateLimitState` #

`Allow` transition #

`RateLimitState` #