SQS-Class Message Queue Analysis Note
This note captures the full step-by-step analysis for an SQS-class message queue: at-least-once delivery, consumer pull, visibility timeout, and delete-based acknowledgment.
Step 1 — Normalize #
Assume the baseline prompt is:
- design an SQS-class message queue
- producers send messages
- consumers pull messages
- delivery is at-least-once
- consumers delete/ack after processing
- messages become visible again after timeout if not deleted
Normalize into state-affecting paths.
| Requirement | Actor | Operation | State touched | Priority |
|---|---|---|---|---|
| Producer sends message | Client | append event | S1 create target: `QueueMessage` | C1 |
| Consumer receives next available message | Client | state transition | S1 update target: `DeliveryState` | C1 |
| Consumer deletes acknowledged message | Client | state transition | S1 update target: `DeliveryState` | C1 |
| System makes timed-out in-flight message visible again | System | async process | S1 hidden write target: `DeliveryState` | C1 |
| Client reads approximate queue depth/status | Client | read projection | S1 read projection target: `QueueStatusView` | R2 |
| System routes queue/shard to current owner | System | read source | S1 read source target: `PartitionMap` | C1 |
| System reassigns shard ownership after node failure | System | state transition | S1 update target: `PartitionOwnership` | C1 |
Notes on normalization:
Important choices:
- send is `append event`: message existence is an immutable enqueue fact
- receive is `state transition`: the message moves visible -> in-flight
- delete/ack is `state transition`: message lifecycle changes to terminal/deleted
- timeout re-visibility is `async process`: internal lifecycle transition
- routing/ownership are explicit because this is distributed infra
Likely C1:
- enqueue
- receive/claim
- delete/ack
- timeout re-visibility
- shard routing
- ownership reassignment
This system is already clearly:
- queue delivery with in-flight claim

not:
- replayable messaging log
Step 2 — Critical Path Selection #
| Requirement | Priority class | Why |
|---|---|---|
| Send message | C1 | enqueue truth must not be lost or duplicated incorrectly |
| Receive next available message | C1 | queue delivery correctness depends on valid claim/in-flight transition |
| Delete acknowledged message | C1 | ack/delete determines completion and redelivery behavior |
| Make timed-out message visible again | C1 | at-least-once semantics depend on correct timeout recovery |
| Read queue depth/status | R2 | operational and approximate in many real systems |
| Route queue/shard to current owner | C1 | wrong routing can break ownership and visibility semantics |
| Reassign shard ownership after node failure | C1 | failover must preserve in-flight and visible message correctness |
Baseline critical paths:
Main C1 paths:
- P1 send message
- P2 receive/claim message
- P3 delete acknowledged message
- P4 timeout re-visibility
- P5 route to shard owner
- P6 reassign shard ownership
Operational/non-core:
- queue depth/status is not correctness-critical in the baseline
The core truth here is not just message storage. It is:
- message existence
- delivery lifecycle
- exclusive in-flight ownership during visibility timeout
- safe reappearance after timeout
So the system is driven by:
- enqueue semantics
- in-flight claim lifecycle
- ack/delete semantics
- ownership/failover
Step 3 — Primary State Extraction #
For an SQS-class queue, the minimal primary state is the message itself, its delivery lifecycle, and shard ownership/routing state.
| Candidate object label | Candidate source | Candidate needed for C1/R1? | Candidate decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
|---|---|---|---|---|---|---|---|---|---|
| QueueMessage | direct noun | Yes | keep as candidate | event | Yes | service | append-only | instance | message_id |
| DeliveryState | lifecycle object | Yes | keep as candidate | process | Yes | service | state machine | instance | message_id |
| PartitionOwnership | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | shard_id |
| PartitionMap | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | collection | queue shards |
| QueueStatusView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | queue_id |
| ReceiveAttempt | hidden write target | No | reject as implementation choice | event | No | derived | append-only | collection | message_id |
| VisibilityIndex | hidden write target | No | reject as implementation choice | projection | No | derived | overwrite | collection | shard_id |
Important modeling choices:
QueueMessage #
Primary because:
- enqueue is an immutable fact
- payload and metadata live here
DeliveryState #
This is the key queue-specific object. It captures lifecycle like:
- `VISIBLE`
- `IN_FLIGHT(owner, expiry, receipt_handle/epoch)`
- `DELETED`
This is what makes the system queue-like rather than log-like.
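As a minimal sketch of this lifecycle, the legal moves can be written as a small transition table. The names `State`, `ALLOWED`, and `can_transition` are illustrative assumptions, not part of any real queue implementation.

```python
from enum import Enum


class State(Enum):
    VISIBLE = "VISIBLE"
    IN_FLIGHT = "IN_FLIGHT"
    DELETED = "DELETED"


# Hypothetical transition table; DELETED is terminal and has no outgoing edges.
ALLOWED = {
    State.VISIBLE: {State.IN_FLIGHT},                 # receive/claim
    State.IN_FLIGHT: {State.VISIBLE, State.DELETED},  # timeout or delete/ack
    State.DELETED: set(),
}


def can_transition(src: State, dst: State) -> bool:
    """Return True if the lifecycle allows moving from src to dst."""
    return dst in ALLOWED[src]
```

Note that `VISIBLE -> DELETED` is deliberately absent: in this model a message can only be deleted by a consumer that currently holds it in flight.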
PartitionOwnership #
Needed because:
- one owner should control delivery transitions for a shard at a time
PartitionMap #
Needed because:
- all send/receive/delete flows must route consistently to the right authority
Minimal strict primary set:
- `QueueMessage`
- `DeliveryState`
- `PartitionOwnership`
- `PartitionMap`
That is enough to explain at-least-once pull delivery.
Step 4 — Hard Invariants #
For an SQS-class queue, the hard invariants are about message existence, exclusive in-flight ownership during visibility timeout, valid delete, and safe timeout re-visibility.
| Path | Tier | Type | Invariant template | Invariant statement |
|---|---|---|---|---|
| P1 (write path): Send message | HARD | uniqueness | uniqueness template | Key send `request_id` maps to at most one logical outcome (enqueued message) within queue scope. |
| P2 (write path): Receive/claim message | HARD | eligibility | eligibility template | Action `claim_message` is valid only if `DeliveryState(message_id)` is `VISIBLE` at decision time. |
| P2 (write path): Receive/claim message | HARD | uniqueness | uniqueness template | Key `message_id` maps to at most one logical outcome (active in-flight holder) within visibility-timeout scope. |
| P3 (write path): Delete acknowledged message | HARD | eligibility | eligibility template | Action `delete_message` is valid only if `DeliveryState(message_id)` is `IN_FLIGHT` and receipt_handle/epoch matches current holder at decision time. |
| P4 (write path): Timeout re-visibility | HARD | eligibility | eligibility template | Action `make_visible_again` is valid only if `DeliveryState(message_id)` is `IN_FLIGHT` and visibility timeout has expired and receipt_handle/epoch is still current at decision time. |
| P5 (write path): Route to shard owner | HARD | uniqueness | uniqueness template | Key `shard_id` maps to at most one logical outcome (current authoritative owner) within `shard_id`. |
| P6 (write path): Reassign shard ownership | HARD | eligibility | eligibility template | Action `reassign_shard` is valid only if current owner is failed or relinquished and candidate owner is eligible and sufficiently current on `shard_id` at decision time. |
What matters most:
1. One active in-flight holder per message #
This is the central queue invariant.
2. Delete is guarded by current receipt/epoch #
A stale consumer must not be able to delete a message after timeout and reassignment.
3. Timeout recovery is guarded #
Only the currently active in-flight state may be returned to visible.
4. At-least-once means duplicates are allowed #
Notice what is not an invariant:
- “message is delivered exactly once”
This system intentionally allows:
- redelivery after timeout
- duplicates after consumer failure
Step 5 — Execution Context #
For the strict baseline SQS-class queue:
| Field | Value | Why |
|---|---|---|
| Topology | single service distributed | one logical queue service spread across many nodes |
| Write coordination scope | per object scope | correctness is per message_id and per shard ownership scope |
| Read consistency target | strong only | receive/delete must use authoritative delivery state |
| Holder model | client | consumers temporarily hold in-flight messages |
| Compensation acceptable? | No | wrong delete or split in-flight ownership cannot be repaired safely afterward |
Derived implications:
- `holder_may_crash = true`: consumers can fail while holding in-flight work
- `cross_service_write = false`: baseline keeps message state, routing, and ownership within one logical queue service
- `bounded_staleness_allowed = false`: receive/delete paths must use authoritative delivery state
- `cross_service_atomicity_required = false`: no multi-service transaction required in baseline
- `exclusive_claim_required = true`: in-flight ownership must be exclusive for a message at one time
- `guarded_by_current_state = true`: receive/delete/re-visibility all depend on current lifecycle state
This pushes us toward:
- one authoritative owner per shard
- explicit lifecycle state machine for each message
- visibility timeout modeled as lease-like in-flight ownership
- strong reads on delivery-state hot path
Step 6 — Deterministic Mechanism Selection #
6A. Write Shape #
| Path | Why | Write shape |
|---|---|---|
| P1 send message | immutable enqueue fact | append-only event |
| P2 receive/claim message | one active in-flight holder per message | exclusive claim |
| P3 delete acknowledged message | valid only for current in-flight holder/receipt | guarded state transition |
| P4 timeout re-visibility | valid only if current in-flight lease expired and still current | guarded state transition |
| P5 route to shard owner | one current authoritative owner per shard | exclusive claim |
| P6 reassign shard ownership | valid only if ownership/failover state allows it | guarded state transition |
6B. Base Mechanism #
| Path | Write shape | Base mechanism | Required companions |
|---|---|---|---|
| P1 send message | append-only event | append log | idempotency key |
| P2 receive/claim message | exclusive claim | lease | receipt handle or epoch, visibility timeout |
| P3 delete acknowledged message | guarded state transition | CAS on (state, version) or leader-applied guarded transition | receipt handle or epoch |
| P4 timeout re-visibility | guarded state transition | leader-applied guarded transition | receipt handle or epoch, timeout scan |
| P5 route to shard owner | exclusive claim | lease | fencing token, heartbeat |
| P6 reassign shard ownership | guarded state transition | CAS on (state, version) | fencing token, shard catch-up check |
Why these fit:
Send #
Enqueue is immutable fact recording:
- very naturally append-only
Receive/claim #
This is the core SQS-specific mechanism:
- message becomes invisible for a visibility timeout
- one consumer owns it temporarily
That is effectively a lease.
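A minimal sketch of that lease-like claim, assuming the shard leader serializes all calls (so no locking is shown). The `Delivery` record and `claim` function are hypothetical names for illustration, and the receipt handle is simply a random hex string here.

```python
from dataclasses import dataclass
from typing import Optional
import uuid


@dataclass
class Delivery:
    # Hypothetical per-message delivery record.
    state: str = "VISIBLE"               # VISIBLE | IN_FLIGHT | DELETED
    receipt_handle: Optional[str] = None
    expiry: float = 0.0                  # lease deadline, in seconds


def claim(d: Delivery, now: float, visibility_timeout: float) -> Optional[str]:
    """Try to lease the message; return a fresh receipt handle, or None."""
    if d.state != "VISIBLE":             # eligibility: only VISIBLE is claimable
        return None
    d.state = "IN_FLIGHT"
    d.receipt_handle = uuid.uuid4().hex  # new handle per claim invalidates old ones
    d.expiry = now + visibility_timeout
    return d.receipt_handle
```

A second `claim` on the same record returns `None` until the message is made visible again, which is the exclusivity the invariant asks for.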
Delete #
Delete is not just a blind delete. It must verify:
- message is still in-flight
- receipt handle / epoch still matches

So it is a guarded transition.
Timeout re-visibility #
Also guarded:
- only expired current in-flight state can be moved back to visible
Canonical substrate implied:
- sharded queue service
- one owner per shard
- append-only message storage
- per-message delivery lifecycle state
- receipt-handle/epoch-based guarded delete
- lease-like in-flight visibility timeout
Step 7 — Read Model / Source of Truth #
For an SQS-class queue, truth is mostly direct source state. Operational status is derived.
| Concept | Truth | Read path | Rebuild path |
|---|---|---|---|
| C1 (source concept): Enqueued message payload | `QueueMessage` | read source directly | authoritative message store |
| C2 (source concept): Current delivery lifecycle | `DeliveryState` | read source directly | authoritative delivery-state store or replay from committed log |
| C3 (source concept): Shard ownership | `PartitionOwnership` | read source directly | authoritative ownership store |
| C4 (source concept): Shard routing map | `PartitionMap` | read source directly | authoritative routing metadata |
| C5 (projection concept): Approximate queue depth/status | derived from message and delivery state | materialized view | recompute from authoritative state |
Important point:
For the core queue semantics:
- send writes authoritative message state
- receive/delete/re-visibility all operate on authoritative `DeliveryState`
- queue depth and status are often approximate projections in real systems
So the correctness-critical read path is:
- direct read of source delivery state and shard ownership
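To illustrate why the depth/status view is only a projection, here is a small sketch that recomputes it from authoritative per-message states. The function name `queue_status` and the dictionary shapes are assumptions made for the example.

```python
from collections import Counter


def queue_status(delivery_states):
    """Recompute a depth/status view from authoritative per-message states.

    delivery_states is a hypothetical mapping of message_id -> state name.
    """
    counts = Counter(delivery_states.values())
    return {
        "visible": counts.get("VISIBLE", 0),
        "in_flight": counts.get("IN_FLIGHT", 0),
    }
```

Because the view can always be rebuilt this way, it can lag or be approximate without affecting receive/delete correctness.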
Step 8 — Failure Handling #
| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
|---|---|---|---|---|---|
| P1 send message | retry with send request_id to avoid duplicate enqueue | competing sends coexist; dedup only matters for same logical request | committed enqueue survives owner crash if replicated past commit point | producer may retry safely with idempotency key | n/a |
| P2 receive/claim message | retry receive safely; may return another message or later same message | only one active in-flight claim should win for a message at one time | if claim committed and consumer crashes, message stays invisible until timeout | n/a | stale consumer fenced by receipt handle/epoch |
| P3 delete acknowledged message | retry delete with same receipt handle | stale/wrong receipt handle loses guarded transition | committed delete survives crash if replicated past commit point | n/a | stale holder cannot delete after timeout and re-claim |
| P4 timeout re-visibility | timeout scan/retry safe | only one re-visibility transition should win for current expired in-flight state | scanner crash delays recovery; next scan retries | n/a | old receipt handle becomes invalid after re-visibility |
| P5 route to shard owner | retry after refreshing shard map | only one valid owner should exist | if owner changed, refreshed map points to new owner | n/a | stale owner rejected by fencing token |
| P6 reassign shard ownership | retry failover transition safely | only one reassignment wins current ownership state | promoted owner crash triggers later reassignment | n/a | old owner fenced and must not continue serving |
What matters most:
1. Receipt handle / epoch fencing #
This is the key queue safety mechanism.
Bad case:
- consumer A receives message
- times out
- consumer B later receives same message
- consumer A finally sends delete
Without fencing:
- A could incorrectly delete B’s in-flight message
So:
- delete must be tied to current receipt handle/epoch
- stale receipt handles must fail
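The bad case above can be replayed in a few lines. This is a sketch under simplifying assumptions: a single message, an integer epoch standing in for the receipt handle, and hypothetical helper names `receive`, `expire`, and `delete`.

```python
import itertools

_epochs = itertools.count(1)                 # monotonically increasing epochs
msg = {"state": "VISIBLE", "epoch": None}    # one message's delivery state


def receive(m):
    """Claim a VISIBLE message and return its new epoch (receipt handle)."""
    if m["state"] != "VISIBLE":
        raise ValueError("message is not visible")
    m["state"], m["epoch"] = "IN_FLIGHT", next(_epochs)
    return m["epoch"]


def expire(m):
    """Timeout path: return the message to VISIBLE and void the old epoch."""
    m["state"], m["epoch"] = "VISIBLE", None


def delete(m, epoch):
    """Guarded delete: succeeds only for the current in-flight epoch."""
    if m["state"] != "IN_FLIGHT" or m["epoch"] != epoch:
        return False
    m["state"] = "DELETED"
    return True


a = receive(msg)          # consumer A claims the message
expire(msg)               # A's visibility timeout lapses
b = receive(msg)          # consumer B claims the same message
stale = delete(msg, a)    # A's late delete is rejected
fresh = delete(msg, b)    # B's delete succeeds
```

With the guard in place, A's late delete returns `False` and B's in-flight claim is untouched until B itself deletes.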
2. At-least-once means duplicates are okay #
A consumer may process the same logical message more than once if:
- it crashes after processing but before delete
- timeout expires and message becomes visible again
So application consumers must be idempotent or dedup-aware.
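One way a consumer might tolerate redelivery is to record which message ids it has already applied. The sketch below assumes an in-memory set purely for illustration; a real consumer would need a durable store, and `handle`, `processed`, and `side_effects` are hypothetical names.

```python
processed = set()     # assumption: stands in for a durable dedup store
side_effects = []     # stands in for the real work, e.g. a database write


def handle(message_id, payload):
    """Apply the side effect at most once per message_id.

    Returns True if the effect was applied, False on a redelivery.
    """
    if message_id in processed:
        return False              # duplicate delivery: skip the work
    side_effects.append(payload)
    processed.add(message_id)
    return True
```

Either way the consumer should still send the delete afterward, so that a redelivered message does not keep coming back.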
3. Send retry dedup #
If the send API promises producer idempotency:
- add request-id dedup on enqueue

Otherwise:
- retries can create multiple valid messages
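A minimal sketch of request-id dedup on enqueue, assuming the dedup table lives next to the append-only log on the shard leader. The `Shard` class and its fields are hypothetical, and expiry of old request ids is left out.

```python
import itertools


class Shard:
    """Hypothetical shard-leader state: an append-only log plus a dedup table."""

    def __init__(self):
        self.log = []                   # append-only (message_id, payload) records
        self.by_request = {}            # request_id -> message_id
        self._ids = itertools.count(1)

    def send(self, payload, request_id=None):
        """Append a message; a repeated request_id returns the original id."""
        if request_id is not None and request_id in self.by_request:
            return self.by_request[request_id]
        message_id = f"m{next(self._ids)}"
        self.log.append((message_id, payload))
        if request_id is not None:
            self.by_request[request_id] = message_id
        return message_id
```

Sends without a `request_id` are always appended, which matches the "retries can create multiple valid messages" case.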
Failure summary:
This queue stays correct if:
- enqueue is durable/idempotent as needed
- in-flight claims are exclusive and time-bounded
- deletes are guarded by current receipt handle/epoch
- re-visibility happens only after current timeout expiry
- stale shard owners and stale consumers are fenced
Step 9 — Scale Adjustments #
| Hotspot | Type | First response |
|---|---|---|
| hot shard with many sends/receives | contention hotspot | increase shard count and rebalance queues across shards |
| in-flight visibility scans | write throughput hotspot | bucket expiries by time and scan incrementally instead of full scans |
| delete/receive pressure on same shard | write throughput hotspot | isolate hot queues, batch receives/deletes where possible |
| ownership churn during failures | contention hotspot | stabilize leases/elections and avoid aggressive reassignment |
| approximate depth/status queries | read hotspot | keep them as derived views only, not source-path reads |
| large message backlog growth | storage growth hotspot | segment storage, compaction/retention for deleted messages, archive/DLQ paths |
What scales well:
This queue scales by:
- sharding queues/messages
- giving each shard one authoritative owner
- keeping send/receive/delete local to that owner
- using time-bucketed visibility indexes for timeout recovery
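A time-bucketed expiry index could look roughly like the sketch below. It assumes fixed-width buckets and only drains buckets that lie entirely in the past, so re-visibility may lag by up to one bucket width. `ExpiryBuckets` and its methods are illustrative names, and each returned candidate still has to be validated against current delivery state before it is made visible.

```python
from collections import defaultdict


class ExpiryBuckets:
    """Hypothetical index grouping in-flight expiries into fixed-width buckets."""

    def __init__(self, width, start):
        self.width = width
        self.buckets = defaultdict(set)         # bucket number -> {(message_id, epoch)}
        self.cursor = int(start // width)       # lowest bucket not yet drained

    def add(self, message_id, epoch, expiry):
        # Clamp to the cursor so late additions are never placed behind the scan.
        bucket = max(int(expiry // self.width), self.cursor)
        self.buckets[bucket].add((message_id, epoch))

    def due(self, now):
        """Pop candidates from every bucket that ended at or before now."""
        last = int(now // self.width)           # buckets below this are fully past
        out = []
        while self.cursor < last:
            out.extend(sorted(self.buckets.pop(self.cursor, ())))
            self.cursor += 1
        return out
```

Each call touches only the buckets between the cursor and the current time, rather than scanning every in-flight message.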
Throughput grows mainly with:
- number of shards
- balance of queue traffic across shards
- efficiency of visibility-timeout management
What fails first:
Usually:
- hot queues collapsing onto one shard
- timeout scan inefficiency
- stale-owner fencing mistakes
- delete/receive contention on delivery-state hot path
Canonical design conclusion:
The mechanical outcome is:
- primary state:
- `QueueMessage`
- `DeliveryState`
- `PartitionOwnership`
- `PartitionMap`
- critical invariants:
- enqueue durability/idempotency as required
- one active in-flight holder per message
- delete valid only for current receipt handle/epoch
- timed-out messages become visible again safely
- mechanisms:
- append log
- lease
- guarded delete/re-visibility transitions
- fenced shard ownership
- reads:
- direct authoritative reads for message/delivery state
- projections only for approximate queue depth/status
Polished interview answer:
“I’d design the SQS-class queue as a sharded service with one authoritative owner per shard. Sending a message is an append-only enqueue fact. Receiving a message is an exclusive claim that moves its delivery state from visible to in-flight under a visibility timeout, and delete is a guarded transition that is valid only for the current receipt handle or epoch. If a consumer crashes or never deletes, the timeout path safely returns the message to visible, which gives at-least-once delivery. The main scaling levers are more shards, hot-queue isolation, and efficient time-bucketed visibility scanning.”
Concrete Substrate #
I’ll choose a service-owned sharded queue with shard leaders as the concrete baseline, because that fits the mechanics we derived:
- append-only enqueue
- lease-like in-flight claim
- guarded delete/re-visibility
- one owner per shard
Concrete substrate:
- queue service cluster
- queues split into shards
- one leader/owner per shard
- shard leader stores:
- append-only message records
- current `DeliveryState` per message
- ready queue / visible index
- in-flight expiry index
- followers replicate shard mutations
- metadata layer tracks:
- shard ownership
- routing map
Concrete tech family:
- service in Go or Java
- local durable storage:
- RocksDB or log-segment files
- shard replication:
- Raft or leader-follower replication with commit index
- metadata/control:
- etcd or internal metadata quorum
This is stronger and cleaner than “just use SQS,” while still concrete.
Operation Layer #
1. Send message #
API
SendMessage(queue_id, payload, request_id?)
Initiator
- producer/client
Entry point
- gateway or any queue node
Authoritative decider
- current leader for target shard
Precondition
- valid shard routing
- optional dedup request id if producer idempotency is required
Transition
- append `QueueMessage`
- initialize `DeliveryState = VISIBLE`
Response
{message_id}
Failure cases
- stale routing -> retry with updated shard map
- response loss -> retry may duplicate unless `request_id` dedup is used
Sequence
- client sends `SendMessage`
- entry node picks shard for queue
- forwards to shard leader
- leader appends message record
- leader sets `DeliveryState = VISIBLE`
- mutation commits to replicas
- leader replies with `message_id`
2. Receive message #
API
ReceiveMessage(queue_id, max_messages, visibility_timeout)
Initiator
- consumer/client
Entry point
- gateway or any queue node
Authoritative decider
- current shard leader(s) serving that queue
Precondition
- message chosen must currently be `VISIBLE`
Transition
- selected message: `VISIBLE -> IN_FLIGHT(receipt_handle, consumer_id?, expiry)`
- remove from visible index
- insert into expiry index
Response
{messages: [{message_id, payload, receipt_handle, visibility_timeout_expiry}]}
Failure cases
- no visible message -> empty response / long poll timeout
- stale owner -> retry after redirect
- duplicate receive later is allowed if timeout expires
Sequence
- consumer sends `ReceiveMessage`
- entry node resolves shard(s)
- shard leader selects visible messages
- for each selected message, leader creates new receipt handle/epoch
- leader commits delivery-state transitions
- leader returns messages + receipt handles
3. Delete message #
API
DeleteMessage(queue_id, receipt_handle)
Initiator
- consumer/client
Entry point
- gateway or any queue node
Authoritative decider
- shard leader owning the message
Precondition
- current `DeliveryState` is `IN_FLIGHT`
- `receipt_handle` matches current in-flight holder/epoch
Transition
IN_FLIGHT -> DELETED
Response
- success, or a rejection that leaves state unchanged (no-op)
Failure cases
- stale receipt handle -> reject
- message already re-visible and re-claimed -> reject stale delete
Sequence
- consumer sends `DeleteMessage`
- entry node routes to owning shard
- leader validates current receipt handle
- leader commits `DELETED` transition
- leader removes message from expiry tracking
- response returned
4. Timeout re-visibility #
API
- internal background process
Initiator
- system
Entry point
- shard leader
Authoritative decider
- shard leader
Precondition
- message is still `IN_FLIGHT`
- `expiry <= now`
- receipt handle/epoch still current
Transition
- `IN_FLIGHT -> VISIBLE`
- remove from expiry index
- add back to visible index
Response
- internal success
Failure cases
- message already deleted -> no-op
- message handle changed -> stale timeout worker no-op
Sequence
- leader scans nearest expiry bucket / min-heap
- finds expired in-flight messages
- validates state still matches current receipt handle/epoch
- commits `VISIBLE` transition
- reinserts into visible queue
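The scan-and-validate steps in this sequence can be sketched with a min-heap keyed by expiry. This assumes heap entries are never removed eagerly on delete or re-claim, so stale entries are simply skipped by the guard. The function name `requeue_expired` and the record shapes are hypothetical.

```python
import heapq


def requeue_expired(heap, states, now):
    """Move expired in-flight messages back to VISIBLE; return their ids.

    heap holds (expiry, message_id, epoch) tuples; states maps
    message_id -> {"state": ..., "epoch": ...}. Both shapes are assumptions.
    """
    restored = []
    while heap and heap[0][0] <= now:
        _, message_id, epoch = heapq.heappop(heap)
        s = states.get(message_id)
        # Guard: skip entries made stale by a delete or by a newer claim.
        if s is None or s["state"] != "IN_FLIGHT" or s["epoch"] != epoch:
            continue
        s["state"], s["epoch"] = "VISIBLE", None
        restored.append(message_id)
    return restored
```

The guard makes the worker a no-op for already-deleted messages and for messages whose handle has changed, matching the failure cases listed above.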
5. Route to shard owner #
API
- internal lookup: `ResolveShard(queue_id, message_key?)`
Initiator
- entry node
Entry point
- local router
Authoritative decider
- authoritative shard map
Precondition
- routing table reasonably current
Transition
- none
Response
{shard_id, leader_node}
Failure cases
- stale map -> redirect/retry
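A sketch of the lookup plus the stale-map retry, under stated assumptions: the shard map is a plain list indexed by shard id, queues are hashed to shards with SHA-256, and `fetch_map` and `is_leader` are hypothetical callbacks standing in for the metadata service and the node's own leadership check.

```python
import hashlib


def resolve_shard(queue_id, shard_map):
    """Map a queue to (shard_id, leader_node) using a stable hash."""
    digest = hashlib.sha256(queue_id.encode()).digest()
    shard_id = int.from_bytes(digest[:8], "big") % len(shard_map)
    return shard_id, shard_map[shard_id]


def route_with_refresh(queue_id, cached_map, fetch_map, is_leader):
    """Route from the cached map; on a stale answer, refresh and retry once."""
    shard_id, node = resolve_shard(queue_id, cached_map)
    if is_leader(shard_id, node):
        return node
    shard_id, node = resolve_shard(queue_id, fetch_map())
    if not is_leader(shard_id, node):
        raise RuntimeError("no current leader found after refresh")
    return node
```

A stable hash (rather than Python's per-process `hash`) keeps every entry node mapping the same queue to the same shard.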
6. Reassign shard ownership #
API
- internal failover flow:
- `AcquireShardLease(shard_id)`
- or Raft leader election
Initiator
- system
Entry point
- follower/candidate node or coordination layer
Authoritative decider
- shard quorum or metadata lease store
Precondition
- current owner failed or relinquished
- candidate replica sufficiently current
Transition
- new owner/leader epoch established
- old owner fenced
Response
- updated ownership metadata
Failure cases
- split election
- stale owner resumes and must be rejected
Sequence
- leader fails or loses lease
- follower/quorum elects new leader
- ownership metadata updated
- new leader resumes receive/delete/send
- old leader fenced by higher epoch/term
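The last step, fencing by a higher epoch/term, can be illustrated from the replica's side. This is a simplified sketch: `ShardReplica` is a hypothetical name, and a real replication protocol such as Raft carries more state than a single epoch counter.

```python
class ShardReplica:
    """Hypothetical follower: accepts writes only from the highest epoch seen."""

    def __init__(self):
        self.highest_epoch = 0
        self.applied = []

    def apply(self, leader_epoch, mutation):
        """Apply a mutation unless it comes from a superseded leader."""
        if leader_epoch < self.highest_epoch:
            return False                 # fenced: sender is a stale leader
        self.highest_epoch = leader_epoch
        self.applied.append(mutation)
        return True
```

Once any write from the new leader's epoch has been seen, a resumed old leader can no longer get its mutations accepted.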
Entry Point vs Decider vs Responder #
| Path | Entry point | Authoritative decider | Physical responder | Logical responder |
|---|---|---|---|---|
| `SendMessage` | gateway / any node | shard leader | leader or front node | queue service |
| `ReceiveMessage` | gateway / any node | shard leader | leader or front node | queue service |
| `DeleteMessage` | gateway / any node | shard leader | leader or front node | queue service |
| timeout recovery | shard leader | shard leader | internal | queue service |
| shard failover | follower / coordination layer | shard quorum / lease store | new leader / control plane | queue service |
Concrete HLD #
Main components:
- gateway/router
- resolves queue/shard
- forwards client requests
- shard leader
- authoritative owner of message lifecycle for shard
- manages visible and in-flight indexes
- shard followers
- replicate shard mutations
- metadata/control service
- tracks shard ownership and routing
- background timeout worker
- usually runs on shard leader using expiry index
Concrete Technology Realizations #
Stronger infra-native answer #
- Go or Java queue service
- RocksDB or log-segment files for durable message and state storage
- Raft or leader-follower replication per shard
- etcd or internal metadata quorum for shard ownership/routing
Simpler pragmatic variant #
- broker service plus persistent store
- internal receipt-handle state in DB/LSM
- still require leader/shard ownership and timeout scanning
Short interview version:
“I’d build the SQS-class queue as a sharded service with one leader per shard. Enqueue appends an immutable message record and marks it visible. Receive is a lease-like claim that makes the message invisible and assigns a receipt handle with an expiry. Delete is a guarded transition that succeeds only for the current receipt handle, and if the timeout expires first, the shard leader safely moves the message back to visible. Replication happens per shard, and failover promotes only a sufficiently current replica.”