Distributed Rate Limiter Analysis Note
Table of Contents
Distributed Rate Limiter Analysis Note #
This note captures the full step-by-step analysis from the infra interview recipe for a distributed rate limiter, including the concrete request-flow layer and the comparison between fixed-window and token-bucket limiter state.
Step 1 — Normalize #
Assume the baseline prompt is:
- design a distributed rate limiter
- clients/services call
Allow(key) - enforce limits per key, for example user/IP/API key
- scale across nodes
- remain correct under concurrency
Normalize into state-affecting paths.
| Requirement | Actor | Operation | State touched | Priority |
|---|---|---|---|---|
| Client checks whether request is allowed | Client | state transition | S1update targetRateLimitState | C1 |
| Client reads current limit policy | Client | read source | S1read source targetLimitPolicy | R1 |
| Admin updates limit policy | Admin | overwrite state | S1update targetLimitPolicy | C1 |
| System expires old window/bucket state | System | async process | S1hidden write targetRateLimitState | C1 |
| System routes key to rate-limit owner/shard | System | read source | S1read source targetPartitionMap | C1 |
| System reassigns shard/ownership after node failure | System | state transition | S1update targetPartitionOwnership | C1 |
| Client reads usage/status dashboard | Client | read projection | S1read projection targetUsageView | R2 |
Notes on normalization:
Allow(key)is astate transition- because it changes current limiter state:
- counter
- token balance
- window/bucket cursor
- because it changes current limiter state:
- policy update is
overwrite state - expiration/cleanup is
async process - shard routing and ownership are explicit because this is distributed infra
Likely C1:
- allow/deny decision
- policy updates
- shard routing
- ownership reassignment
- state expiry/rotation
Likely R1:
- read current policy
Step 2 — Critical Path Selection #
| Requirement | Priority class | Why |
|---|---|---|
| Check whether request is allowed | C1 | core correctness path; wrong decision breaks rate-limit contract |
| Read current limit policy | R1 | important serving path for evaluation |
| Update limit policy | C1 | wrong policy state changes all future decisions |
| Expire old window/bucket state | C1 | stale buckets can incorrectly allow or deny |
| Route key to owner/shard | C1 | wrong routing can split counters and break enforcement |
| Reassign shard after node failure | C1 | failover must preserve correct ownership and state continuity |
| Read usage/status dashboard | R2 | operational only |
Baseline critical paths:
Main C1 paths:
P1allow/deny decisionP2update limit policyP3expire/rotate limiter stateP4route key to shard ownerP5reassign shard ownership
Main R1 path:
P6read policy
This system is not primarily about dashboards or analytics. The correctness center is:
- one authoritative limiter state per key/scope
- one authoritative policy per key/scope
- correct state transitions under concurrency
- correct routing/ownership in a distributed deployment
Step 3 — Primary State Extraction #
For a distributed rate limiter, the minimal primary state is the current limiter state, policy state, and ownership/routing state.
| Candidate object label | Candidate source | Candidate needed for C1/R1? | Candidate decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
|---|---|---|---|---|---|---|---|---|---|
| RateLimitState | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | limit_key |
| LimitPolicy | direct noun | Yes | keep as candidate | entity | Yes | service | overwrite | instance | limit_key or policy_scope |
| PartitionOwnership | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | shard_id |
| PartitionMap | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | collection | keyspace shards |
| UsageView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | policy_scope |
| CleanupCursor | hidden write target | No | split candidate | process | No | service | overwrite | collection | shard_id |
| RequestLog | hidden write target | No | reject as implementation choice | event | No | derived | append-only | collection | limit_key |
Important modeling choices:
RateLimitState #
This is the main correctness object.
Depending on algorithm, it may contain:
- fixed window:
- current window id
- count
- sliding window:
- bucket counts / rolling cursor
- token bucket:
- token count
- last refill timestamp
So this is naturally a process object with state transitions.
LimitPolicy #
Needed because:
- allowed rate/period/burst is authoritative configuration
- updates change future decisions
PartitionOwnership #
Needed because:
- distributed correctness requires one authoritative owner per shard or scope
PartitionMap #
Needed because:
- routing must consistently send the same key to the same shard/owner
Minimal strict primary set:
RateLimitStateLimitPolicyPartitionOwnershipPartitionMap
Step 4 — Hard Invariants #
For a distributed rate limiter, the hard invariants are about one authoritative limiter state per key, correct state transition under policy, and exclusive shard ownership.
| Path | Tier | Type | Invariant template | Invariant statement |
|---|---|---|---|---|
P1write pathAllow/deny request | HARD | eligibility | eligibility template | Action allow_request is valid only if request is evaluated against current authoritative RateLimitState and current LimitPolicy for limit_key at decision time. |
P1write pathAllow/deny request | HARD | ordering | ordering template | Instances rate-limit state transitions for limit_key are ordered by authoritative evaluation order within limit_key. |
P2write pathUpdate policy | HARD | ordering | ordering template | Instances policy revisions are ordered by monotonic policy version within policy_scope. |
P3write pathExpire/rotate limiter state | HARD | eligibility | eligibility template | Action expire_or_rotate_state is valid only if current bucket/window/token state is stale relative to current time and current algorithm rules on limit_key at decision time. |
P4write pathRoute to shard owner | HARD | uniqueness | uniqueness template | Key shard_id maps to at most one logical outcome current authoritative owner within shard_id. |
P5write pathReassign shard ownership | HARD | eligibility | eligibility template | Action reassign_shard is valid only if current owner is failed or relinquished and new owner is eligible and sufficiently current on shard_id at decision time. |
P6read pathRead policy | HARD | freshness | freshness template | Read path policy lookup reflects authoritative policy state within configured policy consistency bound. |
What matters most:
1. One authoritative limiter state per key #
You cannot safely evaluate the same limit_key concurrently on multiple unsynchronized owners if you want strict enforcement.
2. Ordered evaluation per key #
For a given key, the limiter state must evolve in one authoritative order.
3. Policy/version correctness #
A request should be evaluated against the intended policy revision.
4. Exclusive ownership per shard #
This avoids split-brain counters or token balances.
Step 5 — Execution Context #
For the strict baseline distributed rate limiter:
| Field | Value | Why |
|---|---|---|
| Topology | single service distributed | one logical rate-limit service spread across nodes |
| Write coordination scope | per object scope | correctness is per limit_key and per shard ownership scope |
| Read consistency target | strong only | safest baseline so Allow decisions use authoritative state and policy |
| Holder model | node | shard ownership is held by nodes |
| Compensation acceptable? | No | a wrongly allowed request cannot be undone for correctness purposes |
Derived implications:
holder_may_crash = true- nodes can fail while owning shards
cross_service_write = false- baseline keeps policy, state, and ownership within one logical service
bounded_staleness_allowed = false- strict baseline disallows stale policy/state on the hot decision path
cross_service_atomicity_required = false- no multi-service transaction required in baseline
exclusive_claim_required = true- shard ownership must be exclusive
guarded_by_current_state = true- allow/deny and cleanup depend on current limiter state and time
This pushes us toward:
- one authoritative writer per shard or per key partition
- local ordered state transitions for
RateLimitState - policy reads from authoritative or tightly synchronized state
- lease-backed ownership for shards
Step 6 — Deterministic Mechanism Selection #
6A. Write Shape #
| Path | Why | Write shape |
|---|---|---|
P1 allow/deny request | decision is valid only against current limiter state and policy | guarded state transition |
P2 update policy | replace current policy revision | overwrite current value |
P3 expire/rotate limiter state | valid only when time/state predicate says current bucket is stale | guarded state transition |
P4 route to shard owner | one current authoritative owner per shard | exclusive claim |
P5 reassign shard ownership | valid only if current owner failed/relinquished and candidate is eligible | guarded state transition |
6B. Base Mechanism #
| Path | Write shape | Base mechanism | Required companions |
|---|---|---|---|
P1 allow/deny request | guarded state transition | single writer per partition | monotonic time policy, policy version |
P2 update policy | overwrite current value | CAS on version | policy version |
P3 expire/rotate limiter state | guarded state transition | leader-applied guarded transition | monotonic time policy |
P4 route to shard owner | exclusive claim | lease | fencing token, heartbeat |
P5 reassign shard ownership | guarded state transition | CAS on (state, version) | fencing token, shard catch-up check |
Why these fit:
Allow/deny #
If the system is strict, the key insight is:
- do not evaluate one
limit_keyon multiple concurrent unsynchronized writers - send that key to one authoritative shard owner
- that owner applies the limiter state machine in order
So the mechanism is best thought of as:
single writer per partitionfor the hot path
Policy update #
Policy is current configuration:
- versioned overwrite is enough
Expire/rotate state #
This is time- and state-dependent:
- cleanup/refill/window-rotation is a guarded transition on current state
Shard ownership #
Classic exclusive claim:
- only one current owner should enforce keys in that shard
Reassignment #
Promotion/failover is valid only under guarded ownership conditions.
Canonical substrate implied:
- partitioned rate limiter service
- one leader/owner per shard
- each owner evaluates and mutates limiter state for keys in that shard
- policy is versioned config
- shard ownership is lease-backed
Step 7 — Read Model / Source of Truth #
For a strict distributed rate limiter, truth is mostly direct source state.
| Concept | Truth | Read path | Rebuild path |
|---|---|---|---|
C1source conceptCurrent limiter state for key | RateLimitState on authoritative shard owner | read source directly | authoritative shard state / replay from mutation log if used |
C2source conceptCurrent policy | LimitPolicy | read source directly | authoritative policy store |
C3source conceptShard ownership | PartitionOwnership | read source directly | authoritative ownership store |
C4source conceptShard routing map | PartitionMap | read source directly | authoritative routing metadata |
C5projection conceptUsage / dashboard view | derived from state transitions | materialized view | recompute from state snapshots and logs |
Important point:
For strict enforcement:
Allow(key)should read and update authoritative limiter state on the current shard owner- policy lookup should be authoritative or very tightly synchronized
- ownership/routing must be authoritative
Only secondary views like:
- usage dashboards
- analytics
- top abusers
- per-tenant reports
should be projections.
Optional variation:
If the interviewer allows approximate or eventually consistent limiting:
- policy might still remain authoritative
- but limiter state may be partially local/cached/merged
That would be a different design branch.
For the strict baseline, source-of-truth reads dominate.
Step 8 — Failure Handling #
| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
|---|---|---|---|---|---|
P1 allow/deny request | client retries may produce a new evaluation unless request-id semantics are added | shard owner serializes competing requests for same key/scope | committed limiter-state transition survives if durably replicated/logged; otherwise request may need fail-safe deny policy | n/a | stale shard owner fenced by lease epoch |
P2 update policy | retry with version precondition | stale policy write loses CAS | committed policy survives crash if persisted | policy propagation may lag to dashboards, not to authoritative reads | n/a |
P3 expire/rotate limiter state | cleanup retry is safe if current time/state predicate still holds | duplicate cleanup transitions collapse under guarded state check | crash delays cleanup, next run retries | n/a | stale owner cannot apply rotation if fenced |
P4 route to shard owner | retry after refreshing partition map | only one valid owner should exist | if ownership changed, refreshed map points to new owner | n/a | stale owner rejected by fencing token |
P5 reassign shard ownership | retry failover transition safely | only one reassignment wins current ownership state | promoted owner crash triggers later reassignment | n/a | old owner must stop serving after fencing |
What matters most:
1. Stale owner protection #
This is the key safety issue.
Bad case:
- old shard owner still accepts requests
- new owner also accepts requests
- same key gets evaluated on both sides
- limit contract breaks
So:
- ownership must be lease-backed
- stale owners must be fenced by epoch/token
2. Retry semantics on Allow #
If a client retries Allow(key):
- the system may evaluate it twice unless request-id semantics are explicit
- whether that is acceptable depends on API contract
Strict baseline options:
- either document that
Allowis not idempotent by default - or add request-id dedup if interviewer asks
3. Crash after decision #
If the node decides allow and crashes before durable state is preserved:
- you risk undercounting
- a safe design may prefer:
- durable commit before positive allow
- or fail-closed on uncertainty
This is an important tradeoff to say out loud.
Failure summary:
The rate limiter stays correct if:
- one shard owner evaluates each key
- policy and limiter state are authoritative on that owner
- stale owners are fenced
- retries and crash behavior are explicitly defined
Step 9 — Scale Adjustments #
| Hotspot | Type | First response |
|---|---|---|
| hot keys or tenants | contention hotspot | isolate hot keys, increase shard count, or special-case dedicated partitions |
| shard-owner CPU saturation | write throughput hotspot | add more shards and rebalance ownership |
| policy lookup overhead | read hotspot | cache policy per shard owner with authoritative version checks |
| cleanup / refill scans | write throughput hotspot | bucketize expiry/refill state and process incrementally |
| ownership churn during failures | contention hotspot | stabilize leases and avoid over-eager reassignment |
| dashboard/reporting load | read hotspot | move usage/reporting to derived views only |
What scales well:
A strict distributed rate limiter scales if:
- keys are partitioned well
- each key has one authoritative shard owner
- the hot path stays local to the shard owner
- policy is cached with version discipline
Throughput grows with:
- number of shards
- balance of hot keys across shards
- efficiency of local limiter-state evaluation
What fails first if designed badly:
- one global counter service
- no partitioning by key
- multiple owners evaluating the same key concurrently
- policy fetched remotely on every request
- full scans for expiry/rotation on every decision
Canonical design conclusion:
- primary state:
RateLimitStateLimitPolicyPartitionOwnershipPartitionMap
- critical invariants:
- one authoritative limiter state per key
- one authoritative owner per shard
- ordered state transitions per key
- policy version correctness
- mechanisms:
single writer per partitionfor the hotAllowpathCAS on versionfor policy updatesleasefor shard ownership- guarded transitions for refill/rotation/failover
- reads:
- direct source reads for policy/state on the hot path
- projections only for dashboards
Polished interview answer:
“I’d design the distributed rate limiter as a partitioned service where each key is routed to one authoritative shard owner. That owner holds the current limiter state for the key and applies allow/deny transitions in a single ordered sequence, which avoids split counters under concurrency. Policy is versioned configuration, shard ownership is lease-backed and fenced, and failover is a guarded ownership transition. The hot path is local to the shard owner, while dashboards and usage views are derived projections. The main scaling levers are more shards, hot-key isolation, and keeping policy lookups versioned but local to the owner.”
Concrete Substrate #
I’ll choose a service-owned shard-leader rate limiter rather than “just Redis,” because it matches the mechanics we derived:
single writer per partitionleasefor shard ownership- guarded state transitions for
Allow - versioned policy
Concrete substrate:
- clients call a rate limiter service
- keys are hashed to shards
- each shard has one leader/owner
- that leader keeps
RateLimitStatefor its keys in memory for low latency - state is durably replicated or logged depending on strictness requirement
- policy is stored in a small config store and cached on shard leaders
- shard ownership is tracked by a small coordination layer
Concrete tech family:
- service in Go or Java
- shard ownership via etcd leases or built-in Raft metadata
- local state in memory
- optional durability/replication:
- local WAL or append log
- or replicated state machine per shard if strict crash correctness is required
- policy store:
- etcd / config DB / internal metadata store
This is more infra-native than “Redis INCR,” while still concrete.
Operation Layer #
1. Allow / deny request #
API
Allow(limit_key, request_id?, now?)
Initiator
- client/service
Entry point
- gateway or any rate-limiter node
Authoritative decider
- current shard leader for
shard(limit_key)
Precondition
- request routed to current shard owner
- current
LimitPolicyversion available - limiter state valid for
limit_key
Transition Depends on algorithm, but in general:
- read
LimitPolicy - read current
RateLimitState - rotate/refill if needed
- if budget available:
- decrement token / increment counter / consume slot
- return
allowed
- else:
- leave state at denied boundary or update denial metadata
- return
denied
Response
{allowed, remaining, reset_at, policy_version}
Failure cases
- stale routing -> retry with updated shard map
- shard leader unavailable -> retry or fail closed/open based on product policy
- duplicate retries may double-consume unless request-id dedup is added
Sequence
- client sends
Allow(limit_key) - entry node computes shard
- request forwarded to shard leader
- leader loads cached policy for that key scope
- leader loads in-memory
RateLimitState - leader applies refill/window-rotation if needed
- leader evaluates allow/deny
- leader updates state
- leader replies with decision
2. Read policy #
API
GetPolicy(policy_scope)
Initiator
- client/admin/internal shard leader
Entry point
- any node or config endpoint
Authoritative decider
- policy/config store
Precondition
- none
Transition
- none
Response
{limit, window, burst, version}
Failure cases
- stale cache must be rejected or refreshed if version mismatch detected
Sequence
- caller requests policy
- config service serves current policy version
- shard leader caches it with version
3. Update policy #
API
PutPolicy(policy_scope, config, expected_version?)
Initiator
- admin
Entry point
- admin/config endpoint
Authoritative decider
- policy/config store leader
Precondition
- if optimistic update, expected version matches
Transition
- overwrite
LimitPolicy - bump version
- trigger invalidation/refresh to shard leaders
Response
{policy_version}
Failure cases
- version conflict
- propagation lag to non-authoritative caches
Sequence
- admin sends policy update
- config leader validates version
- commits new policy
- policy version increments
- shard leaders refresh or receive invalidation
4. Expire / rotate limiter state #
API
- internal, usually not exposed
RotateOrRefill(limit_key, now)or batched sweep
Initiator
- system / shard leader
Entry point
- shard leader
Authoritative decider
- shard leader
Precondition
- current time and state indicate refill/window boundary crossed
Transition
- refill tokens
- advance window bucket
- clear stale bucket state
Response
- internal success
Failure cases
- duplicate rotation attempts collapse under current-state checks
Sequence
- leader handles request or periodic sweep
- compares
nowwith current state timestamps - applies refill/window advancement
- continues with allow decision or cleanup
5. Route key to shard owner #
API
- internal lookup:
ResolveShard(limit_key)
Initiator
- entry node / gateway
Entry point
- local routing layer
Authoritative decider
- authoritative
PartitionMap
Precondition
- partition map reasonably current
Transition
- none
Response
{shard_id, leader_node}
Failure cases
- stale map leads to redirect/retry
Sequence
- entry node hashes key
- looks up shard in partition map
- forwards request to current leader
6. Reassign shard ownership #
API
- internal failover flow:
AcquireShardLease(shard_id)- or leader election RPCs
Initiator
- system
Entry point
- coordination layer / candidate shard node
Authoritative decider
- lease/coordination store or shard quorum
Precondition
- current owner failed/relinquished
- new owner eligible and current enough
Transition
PartitionOwnership(shard_id)moves to new leader with new epoch- old owner fenced
Response
- updated shard owner in routing metadata
Failure cases
- split ownership attempt
- stale previous owner resumes and must be rejected by epoch
Sequence
- leader fails or loses lease
- candidate acquires shard lease
- routing metadata updates with new leader/epoch
- new leader serves requests
- stale old leader is fenced
Entry Point vs Decider vs Responder #
| Path | Entry point | Authoritative decider | Physical responder | Logical responder |
|---|---|---|---|---|
Allow | gateway / any node | shard leader | shard leader or front node | rate limiter service |
GetPolicy | config endpoint / any node | policy store | config node | rate limiter service |
PutPolicy | admin endpoint | policy store leader | config leader | rate limiter service |
ResolveShard | entry node | partition map source | entry node | rate limiter service |
| shard failover | candidate node / coordination layer | lease store or shard quorum | new owner / coordination layer | rate limiter service |
Concrete HLD #
Main components:
- client gateway / router
- hashes
limit_key -> shard - forwards to current shard leader
- hashes
- shard leader
- authoritative evaluator for keys in that shard
- holds
RateLimitStatein memory - applies allow/deny transitions
- shard followers / replicas (if strict durability required)
- receive replicated state/log
- policy/config service
- stores
LimitPolicy - versions policy updates
- stores
- coordination layer
- stores shard ownership lease
- fences stale leaders
Concrete Technology Realizations #
Stronger infra-native answer #
- Go or Java rate limiter service
- etcd for shard leases and routing metadata
- in-memory per-shard limiter state
- policy in etcd or internal config DB
- optional WAL / replicated log for crash-safe strictness
Simpler pragmatic answer #
- Redis Cluster
- shard by key
- use Lua script on authoritative Redis shard for atomic allow/deny evaluation
- policy in Redis hash or config service
- still need external shard ownership / routing discipline if you truly want distributed failover semantics
That second answer is practical, but the first one is better aligned with the mechanical derivation.
Short interview version:
“I’d implement the strict distributed rate limiter as a sharded service. Each key hashes to one shard leader, and that leader is the only authority allowed to evaluate and mutate limiter state for that key. Policy is versioned config stored separately, shard ownership is lease-backed with fencing, and the hot path runs against in-memory state on the shard leader for low latency. If stronger crash safety is required, I’d add a per-shard replicated log or WAL before returning allow decisions.”
Why There Is No Mutation Log By Default #
A rate limiter often does not need an authoritative append-only history of every decision.
The main truth is usually:
- current limiter state for a key
- not the full sequence of all past allows/denies
So mechanically, this system is centered on:
RateLimitStateLimitPolicy
not on an append-only event stream.
Why no mutation log by default #
A mutation log is useful when you need one or more of:
- durable replay as primary recovery mechanism
- audit trail
- downstream consumers of every event
- replication via ordered event shipping
- exact historical reconstruction
For a basic distributed rate limiter, you often do not need all that. You only need the current state to answer:
- allow or deny now?
So the baseline can be:
- in-memory authoritative state per shard
- maybe periodic checkpointing
- maybe replication of current state
- no full decision log
When you would add a mutation log #
You would add one if the prompt or requirements demand:
- strict crash recovery with no undercount after node failure
- exact replay/reconstruction of limiter state
- durable replicated state-machine semantics
- auditability of rate-limit decisions
- analytics derived from exact decision stream
Then the design shifts toward:
- append log of
ALLOW_CONSUME,DENY,REFILL,ROTATE - rebuild
RateLimitStatefrom the log or snapshots
Replication without log-first modeling #
Replication can still exist. It is just not necessarily log-based replication in the baseline answer.
Three levels:
No replication
- acceptable only for a weak/best-effort limiter
- simple, but loses state on node failure
State replication
- replicate current
RateLimitStateor shard snapshots to a standby/follower - enough for many practical limiters
- replicate current
Log replication
- replicate every limiter mutation in order
- strongest, but more expensive
So:
- replication? often yes
- mutation log as primary modeling object? not always necessary unless the prompt pushes for strong durability/replay semantics
Token Bucket vs Fixed Window #
This mostly changes the shape of RateLimitState and the exact allow_request transition. The rest of the distributed architecture can stay similar.
Fixed Window #
RateLimitState #
For key K:
window_idcount
Example:
- 100 requests per minute
window_id = floor(now / 60s)count = 37
Allow transition #
- compute current
window_id - if
window_id != stored window_id- reset state to new window
count = 0
- if
count < limit- increment
count - allow
- increment
- else
- deny
Pros #
- very simple
- cheap state
- easy to shard
Cons #
- boundary burst problem
- user can consume near end of one window and start of next
Write-shape interpretation #
Still:
guarded state transitionbecause allow is valid only ifcount < limitin current window
Sliding Window Log #
RateLimitState #
For key K:
- timestamps of recent requests
- or bucketed counts across rolling subwindows
Example:
- deque/ring of timestamps
- or 12 buckets for last 60s
Allow transition #
- drop entries outside rolling window
- count recent requests
- if count < limit
- add current request
- allow
- else
- deny
Pros #
- more accurate than fixed window
Cons #
- larger state
- more expensive cleanup
Write-shape #
Still:
guarded state transition
Token Bucket #
RateLimitState #
For key K:
tokenslast_refill_ts
Example:
- capacity = 100
- refill_rate = 10 tokens/sec
Allow transition #
- compute elapsed time since
last_refill_ts - refill:
tokens = min(capacity, tokens + elapsed * refill_rate)
- set
last_refill_ts = now - if
tokens >= costtokens -= cost- allow
- else
- deny
Pros #
- smooths traffic
- supports burst + steady refill naturally
- common practical choice
Cons #
- needs careful time handling
- clock monotonicity matters more
Write-shape #
Still:
guarded state transition
The guard is:
refilled_tokens >= request_cost
Leaky Bucket #
RateLimitState #
- queue depth or backlog
- last drain timestamp
This is similar to token bucket but models drainage instead of refill.
Main Difference In The Framework #
The distributed system structure mostly stays the same:
- shard key to one owner
- authoritative
RateLimitState - versioned
LimitPolicy - lease-backed shard ownership
What changes is:
- exact fields in
RateLimitState - exact guard in
allow_request - cleanup/rotation behavior
Comparison #
| Algorithm | State | Guard for allow | Cleanup behavior |
|---|---|---|---|
| Fixed window | window_id, count | count < limit | reset on new window |
| Sliding window | timestamps or rolling buckets | recent_count < limit | evict old entries/buckets |
| Token bucket | tokens, last_refill_ts | refilled_tokens >= cost | refill on read/write |
| Leaky bucket | backlog, last_drain_ts | backlog below threshold | drain over time |
Interview guidance:
If prompt does not specify, I’d usually choose:
- token bucket if:
- want burst support
- smoother behavior
- realistic API/infra answer
Choose:
- fixed window if:
- interviewer wants simplest baseline
- focus is distributed ownership/routing, not algorithm nuance
A short interview line:
“The distributed architecture doesn’t change much across algorithms; what changes is the state machine stored in
RateLimitState. Fixed window stores(window_id, count), while token bucket stores(tokens, last_refill_ts). In both cases,Allowis still a guarded state transition applied by the authoritative shard owner.”