SQS-Class Message Queue Analysis Note #

This note captures the full step-by-step analysis for an SQS-class message queue: at-least-once delivery, consumer pull, visibility timeout, and delete-based acknowledgment.

Step 1 — Normalize #

Assume the baseline prompt is:

  • design an SQS-class message queue
  • producers send messages
  • consumers pull messages
  • delivery is at-least-once
  • consumers delete/ack after processing
  • messages become visible again after timeout if not deleted

Normalize into state-affecting paths.

| Requirement | Actor | Operation | State touched | Priority |
|---|---|---|---|---|
| Producer sends message | Client | append event | S1 (create target): QueueMessage | C1 |
| Consumer receives next available message | Client | state transition | S1 (update target): DeliveryState | C1 |
| Consumer deletes acknowledged message | Client | state transition | S1 (update target): DeliveryState | C1 |
| System makes timed-out in-flight message visible again | System | async process | S1 (hidden write target): DeliveryState | C1 |
| Client reads approximate queue depth/status | Client | read projection | S1 (read projection target): QueueStatusView | R2 |
| System routes queue/shard to current owner | System | read source | S1 (read source target): PartitionMap | C1 |
| System reassigns shard ownership after node failure | System | state transition | S1 (update target): PartitionOwnership | C1 |

Notes on normalization:

Important choices:

  • send is append event
    • message existence is an immutable enqueue fact
  • receive is state transition
    • because message moves:
      • visible -> in-flight
  • delete/ack is state transition
    • message lifecycle changes to terminal/deleted
  • timeout re-visibility is async process
    • internal lifecycle transition
  • routing/ownership are explicit because this is distributed infra

Likely C1:

  • enqueue
  • receive/claim
  • delete/ack
  • timeout re-visibility
  • shard routing
  • ownership reassignment

This system is already clearly:

  • queue delivery with in-flight claim

and not:

  • replayable messaging log

Step 2 — Critical Path Selection #

| Requirement | Priority class | Why |
|---|---|---|
| Send message | C1 | enqueue truth must not be lost or duplicated incorrectly |
| Receive next available message | C1 | queue delivery correctness depends on valid claim/in-flight transition |
| Delete acknowledged message | C1 | ack/delete determines completion and redelivery behavior |
| Make timed-out message visible again | C1 | at-least-once semantics depend on correct timeout recovery |
| Read queue depth/status | R2 | operational and approximate in many real systems |
| Route queue/shard to current owner | C1 | wrong routing can break ownership and visibility semantics |
| Reassign shard ownership after node failure | C1 | failover must preserve in-flight and visible message correctness |

Baseline critical paths:

Main C1 paths:

  • P1 send message
  • P2 receive/claim message
  • P3 delete acknowledged message
  • P4 timeout re-visibility
  • P5 route to shard owner
  • P6 reassign shard ownership

Operational/non-core:

  • queue depth/status is not correctness-critical in the baseline

The core truth here is not just message storage. It is:

  • message existence
  • delivery lifecycle
  • exclusive in-flight ownership during visibility timeout
  • safe reappearance after timeout

So the system is driven by:

  • enqueue semantics
  • in-flight claim lifecycle
  • ack/delete semantics
  • ownership/failover

Step 3 — Primary State Extraction #

For an SQS-class queue, the minimal primary state is the message itself, its delivery lifecycle, and shard ownership/routing state.

| Candidate object label | Candidate source | Candidate needed for C1/R1? | Candidate decomposition action | Class | Primary? | Owner | Evolution | Scope kind | Scope value |
|---|---|---|---|---|---|---|---|---|---|
| QueueMessage | direct noun | Yes | keep as candidate | event | Yes | service | append-only | instance | message_id |
| DeliveryState | lifecycle object | Yes | keep as candidate | process | Yes | service | state machine | instance | message_id |
| PartitionOwnership | hidden write target | Yes | keep as candidate | process | Yes | service | state machine | instance | shard_id |
| PartitionMap | hidden write target | Yes | keep as candidate | entity | Yes | service | overwrite | collection | queue shards |
| QueueStatusView | derived read model | No | reject as UI artifact | projection | No | derived | overwrite | collection | queue_id |
| ReceiveAttempt | hidden write target | No | reject as implementation choice | event | No | derived | append-only | collection | message_id |
| VisibilityIndex | hidden write target | No | reject as implementation choice | projection | No | derived | overwrite | collection | shard_id |

Important modeling choices:

QueueMessage #

Primary because:

  • enqueue is an immutable fact
  • payload and metadata live here

DeliveryState #

This is the key queue-specific object. It captures lifecycle like:

  • VISIBLE
  • IN_FLIGHT(owner, expiry, receipt_handle/epoch)
  • DELETED

This is what makes the system queue-like rather than log-like.
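The lifecycle above can be sketched as a small state machine. This is a minimal illustration, not a prescribed schema: the names `Status`, `ALLOWED`, and `transition` are hypothetical, and the `owner` field is omitted for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    VISIBLE = "VISIBLE"
    IN_FLIGHT = "IN_FLIGHT"
    DELETED = "DELETED"


# Allowed lifecycle edges; DELETED is terminal.
ALLOWED = {
    (Status.VISIBLE, Status.IN_FLIGHT),   # receive/claim
    (Status.IN_FLIGHT, Status.DELETED),   # delete/ack
    (Status.IN_FLIGHT, Status.VISIBLE),   # timeout re-visibility
}


@dataclass(frozen=True)
class DeliveryState:
    message_id: str
    status: Status = Status.VISIBLE
    receipt_handle: Optional[str] = None  # set only while IN_FLIGHT
    expiry: Optional[float] = None        # set only while IN_FLIGHT


def transition(state, new_status, receipt_handle=None, expiry=None):
    """Return the next state, or raise if the edge is not in ALLOWED."""
    if (state.status, new_status) not in ALLOWED:
        raise ValueError(
            f"illegal transition {state.status.name} -> {new_status.name}"
        )
    if new_status is Status.IN_FLIGHT:
        return DeliveryState(state.message_id, new_status, receipt_handle, expiry)
    # Leaving IN_FLIGHT clears the claim fields.
    return DeliveryState(state.message_id, new_status)
```

Keeping the edge set explicit makes it easy to see that there is no path out of DELETED and no direct VISIBLE -> DELETED edge.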

PartitionOwnership #

Needed because:

  • one owner should control delivery transitions for a shard at a time

PartitionMap #

Needed because:

  • all send/receive/delete flows must route consistently to the right authority

Minimal strict primary set:

  • QueueMessage
  • DeliveryState
  • PartitionOwnership
  • PartitionMap

That is enough to explain at-least-once pull delivery.

Step 4 — Hard Invariants #

For an SQS-class queue, the hard invariants are about message existence, exclusive in-flight ownership during visibility timeout, valid delete, and safe timeout re-visibility.

| Path | Tier | Type | Invariant template | Invariant statement |
|---|---|---|---|---|
| P1 (write path): Send message | HARD | uniqueness | uniqueness template | Key send request_id maps to at most one logical outcome enqueued message within queue scope. |
| P2 (write path): Receive/claim message | HARD | eligibility | eligibility template | Action claim_message is valid only if DeliveryState(message_id) is VISIBLE at decision time. |
| P2 (write path): Receive/claim message | HARD | uniqueness | uniqueness template | Key message_id maps to at most one logical outcome active in-flight holder within visibility-timeout scope. |
| P3 (write path): Delete acknowledged message | HARD | eligibility | eligibility template | Action delete_message is valid only if DeliveryState(message_id) is IN_FLIGHT and receipt_handle/epoch matches current holder at decision time. |
| P4 (write path): Timeout re-visibility | HARD | eligibility | eligibility template | Action make_visible_again is valid only if DeliveryState(message_id) is IN_FLIGHT and visibility timeout has expired and receipt_handle/epoch is still current at decision time. |
| P5 (write path): Route to shard owner | HARD | uniqueness | uniqueness template | Key shard_id maps to at most one logical outcome current authoritative owner within shard_id. |
| P6 (write path): Reassign shard ownership | HARD | eligibility | eligibility template | Action reassign_shard is valid only if current owner is failed or relinquished and candidate owner is eligible and sufficiently current on shard_id at decision time. |

What matters most:

1. One active in-flight holder per message #

This is the central queue invariant.

2. Delete is guarded by current receipt/epoch #

A stale consumer must not be able to delete a message after timeout and reassignment.

3. Timeout recovery is guarded #

Only the currently active in-flight state may be returned to visible.

4. At-least-once means duplicates are allowed #

Notice what is not an invariant:

  • “message is delivered exactly once”

This system intentionally allows:

  • redelivery after timeout
  • duplicates after consumer failure

Step 5 — Execution Context #

For the strict baseline SQS-class queue:

| Field | Value | Why |
|---|---|---|
| Topology | single service distributed | one logical queue service spread across many nodes |
| Write coordination scope | per object scope | correctness is per message_id and per shard ownership scope |
| Read consistency target | strong only | receive/delete must use authoritative delivery state |
| Holder model | client | consumers temporarily hold in-flight messages |
| Compensation acceptable? | No | wrong delete or split in-flight ownership cannot be repaired safely afterward |

Derived implications:

  • holder_may_crash = true

    • consumers can fail while holding in-flight work
  • cross_service_write = false

    • baseline keeps message state, routing, and ownership within one logical queue service
  • bounded_staleness_allowed = false

    • receive/delete paths must use authoritative delivery state
  • cross_service_atomicity_required = false

    • no multi-service transaction required in baseline
  • exclusive_claim_required = true

    • in-flight ownership must be exclusive for a message at one time
  • guarded_by_current_state = true

    • receive/delete/re-visibility all depend on current lifecycle state

This pushes us toward:

  • one authoritative owner per shard
  • explicit lifecycle state machine for each message
  • visibility timeout modeled as lease-like in-flight ownership
  • strong reads on delivery-state hot path

Step 6 — Deterministic Mechanism Selection #

6A. Write Shape #

| Path | Why | Write shape |
|---|---|---|
| P1 send message | immutable enqueue fact | append-only event |
| P2 receive/claim message | one active in-flight holder per message | exclusive claim |
| P3 delete acknowledged message | valid only for current in-flight holder/receipt | guarded state transition |
| P4 timeout re-visibility | valid only if current in-flight lease expired and still current | guarded state transition |
| P5 route to shard owner | one current authoritative owner per shard | exclusive claim |
| P6 reassign shard ownership | valid only if ownership/failover state allows it | guarded state transition |

6B. Base Mechanism #

| Path | Write shape | Base mechanism | Required companions |
|---|---|---|---|
| P1 send message | append-only event | append log | idempotency key |
| P2 receive/claim message | exclusive claim | lease | receipt handle or epoch, visibility timeout |
| P3 delete acknowledged message | guarded state transition | CAS on (state, version) or leader-applied guarded transition | receipt handle or epoch |
| P4 timeout re-visibility | guarded state transition | leader-applied guarded transition | receipt handle or epoch, timeout scan |
| P5 route to shard owner | exclusive claim | lease | fencing token, heartbeat |
| P6 reassign shard ownership | guarded state transition | CAS on (state, version) | fencing token, shard catch-up check |

Why these fit:

Send #

Enqueue is immutable fact recording:

  • very naturally append-only

Receive/claim #

This is the core SQS-specific mechanism:

  • message becomes invisible for a visibility timeout
  • one consumer owns it temporarily

That is effectively a lease.
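A minimal sketch of receive-as-lease, assuming a single-owner in-memory shard. The `Shard` class and its fields are hypothetical, and the clock is passed in as `now` so the behavior is deterministic.

```python
import uuid
from collections import deque


class Shard:
    """Hypothetical single-owner shard: a visible queue plus in-flight leases."""

    def __init__(self):
        self.visible = deque()   # message_ids eligible for delivery, FIFO
        self.in_flight = {}      # message_id -> (receipt_handle, expiry)

    def send(self, message_id):
        self.visible.append(message_id)

    def receive(self, now, visibility_timeout):
        """Claim the next visible message.

        Returns (message_id, receipt_handle), or None if nothing is visible.
        """
        if not self.visible:
            return None
        message_id = self.visible.popleft()   # hidden from other consumers
        receipt_handle = uuid.uuid4().hex     # fresh handle for every claim
        self.in_flight[message_id] = (receipt_handle, now + visibility_timeout)
        return message_id, receipt_handle
```

Because the message leaves `visible` in the same step that records the lease, a second `receive` before expiry cannot return it, which is the exclusive-claim property.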

Delete #

Delete is not just a blind delete. It must verify:

  • message is still in-flight
  • receipt handle / epoch still matches

So it is a guarded transition.

Timeout re-visibility #

Also guarded:

  • only expired current in-flight state can be moved back to visible

Canonical substrate implied:

  • sharded queue service
  • one owner per shard
  • append-only message storage
  • per-message delivery lifecycle state
  • receipt-handle/epoch-based guarded delete
  • lease-like in-flight visibility timeout

Step 7 — Read Model / Source of Truth #

For an SQS-class queue, truth is mostly direct source state. Operational status is derived.

| Concept | Truth | Read path | Rebuild path |
|---|---|---|---|
| C1 (source concept): Enqueued message payload | QueueMessage | read source directly | authoritative message store |
| C2 (source concept): Current delivery lifecycle | DeliveryState | read source directly | authoritative delivery-state store or replay from committed log |
| C3 (source concept): Shard ownership | PartitionOwnership | read source directly | authoritative ownership store |
| C4 (source concept): Shard routing map | PartitionMap | read source directly | authoritative routing metadata |
| C5 (projection concept): Approximate queue depth/status | derived from message and delivery state | materialized view | recompute from authoritative state |

Important point:

For the core queue semantics:

  • send writes authoritative message state
  • receive/delete/re-visibility all operate on authoritative DeliveryState
  • queue depth and status are often approximate projections in real systems

So the correctness-critical read path is:

  • direct read of source delivery state and shard ownership
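For the derived side, a minimal sketch of rebuilding the approximate status view from authoritative delivery state. The function name and the output keys are illustrative assumptions; a real service would maintain this view incrementally rather than by full recount.

```python
from collections import Counter


def queue_status_view(delivery_states):
    """Recompute an approximate status view from per-message statuses.

    `delivery_states` maps message_id -> "VISIBLE" | "IN_FLIGHT" | "DELETED".
    """
    counts = Counter(delivery_states.values())
    return {
        "approx_visible": counts["VISIBLE"],
        "approx_in_flight": counts["IN_FLIGHT"],
    }
```

The view can always be discarded and recomputed, which is why it does not need to sit on the receive/delete path.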

Step 8 — Failure Handling #

| Path | Retry | Competing writers | Crash after commit | Publish failure | Stale holder |
|---|---|---|---|---|---|
| P1 send message | retry with send request_id to avoid duplicate enqueue | competing sends coexist; dedup only matters for same logical request | committed enqueue survives owner crash if replicated past commit point | producer may retry safely with idempotency key | n/a |
| P2 receive/claim message | retry receive safely; may return another message or later same message | only one active in-flight claim should win for a message at one time | if claim committed and consumer crashes, message stays invisible until timeout | n/a | stale consumer fenced by receipt handle/epoch |
| P3 delete acknowledged message | retry delete with same receipt handle | stale/wrong receipt handle loses guarded transition | committed delete survives crash if replicated past commit point | n/a | stale holder cannot delete after timeout and re-claim |
| P4 timeout re-visibility | timeout scan/retry safe | only one re-visibility transition should win for current expired in-flight state | scanner crash delays recovery; next scan retries | n/a | old receipt handle becomes invalid after re-visibility |
| P5 route to shard owner | retry after refreshing shard map | only one valid owner should exist | if owner changed, refreshed map points to new owner | n/a | stale owner rejected by fencing token |
| P6 reassign shard ownership | retry failover transition safely | only one reassignment wins current ownership state | promoted owner crash triggers later reassignment | n/a | old owner fenced and must not continue serving |

What matters most:

1. Receipt handle / epoch fencing #

This is the key queue safety mechanism.

Bad case:

  • consumer A receives message
  • times out
  • consumer B later receives same message
  • consumer A finally sends delete

Without fencing:

  • A could incorrectly delete B’s in-flight message

So:

  • delete must be tied to current receipt handle/epoch
  • stale receipt handles must fail
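The bad case above can be replayed in a few lines. This is a sketch under stated assumptions: the `FencedQueue` class is hypothetical, the receipt handle is modeled as a `(message_id, epoch)` pair, and timeout re-visibility is applied lazily inside `receive` to keep the example short.

```python
class FencedQueue:
    """Hypothetical in-memory queue; a receipt handle is (message_id, epoch)."""

    def __init__(self, message_ids):
        self.state = {
            m: {"status": "VISIBLE", "epoch": 0, "expiry": None}
            for m in message_ids
        }

    def receive(self, message_id, now, timeout):
        s = self.state[message_id]
        if s["status"] == "IN_FLIGHT" and s["expiry"] <= now:
            s["status"] = "VISIBLE"          # lazy timeout re-visibility
        if s["status"] != "VISIBLE":
            return None
        # Each claim bumps the epoch, which invalidates older handles.
        s.update(status="IN_FLIGHT", epoch=s["epoch"] + 1, expiry=now + timeout)
        return (message_id, s["epoch"])

    def delete(self, receipt_handle):
        message_id, epoch = receipt_handle
        s = self.state[message_id]
        # Guard: must be in flight AND the handle's epoch must be current.
        if s["status"] != "IN_FLIGHT" or s["epoch"] != epoch:
            return False
        s["status"] = "DELETED"
        return True


q = FencedQueue(["m1"])
handle_a = q.receive("m1", now=0, timeout=30)    # consumer A claims
handle_b = q.receive("m1", now=31, timeout=30)   # A timed out; B claims
print(q.delete(handle_a))   # False: A's handle is stale
print(q.delete(handle_b))   # True: B holds the current claim
```

Consumer A's late delete is rejected without touching B's claim, so the message is removed only when its current holder acknowledges it.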

2. At-least-once means duplicates are okay #

A consumer may process the same logical message more than once if:

  • it crashes after processing but before delete
  • timeout expires and message becomes visible again

So application consumers must be idempotent or dedup-aware.
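A minimal sketch of a dedup-aware consumer, assuming the side effect is a running total and the set of processed ids lives in memory. Both are illustrative choices; in a real consumer the processed marker and the side effect would need to be recorded durably and together.

```python
class IdempotentConsumer:
    """Hypothetical consumer that applies each message_id at most once."""

    def __init__(self):
        self.processed = set()   # stand-in for a durable dedup store
        self.total = 0           # example side effect: a running sum

    def handle(self, message_id, amount):
        """Return True if the side effect was applied, False on a duplicate."""
        if message_id in self.processed:
            return False         # redelivery: skip the effect, still safe to delete
        self.total += amount
        self.processed.add(message_id)
        return True
```

Delivering `("m1", 10)` twice leaves `total` at 10, so a redelivery after a crash-before-delete does not double-apply the work.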

3. Send retry dedup #

If the send API promises producer idempotency:

  • add request-id dedup on enqueue

Otherwise:

  • retries can create multiple valid messages
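Both branches can be seen in one sketch. The `DedupingEnqueuer` class and its unbounded `by_request_id` table are assumptions for illustration; a real service would bound the dedup window.

```python
import itertools


class DedupingEnqueuer:
    """Hypothetical enqueue path keyed by an optional request_id."""

    def __init__(self):
        self.log = []              # append-only (message_id, payload) records
        self.by_request_id = {}    # request_id -> message_id already assigned
        self._ids = itertools.count(1)

    def send(self, payload, request_id=None):
        if request_id is not None and request_id in self.by_request_id:
            # Retry of the same logical request: return the original outcome.
            return self.by_request_id[request_id]
        message_id = f"m{next(self._ids)}"
        self.log.append((message_id, payload))
        if request_id is not None:
            self.by_request_id[request_id] = message_id
        return message_id
```

With a `request_id`, a retried send returns the same `message_id` and appends nothing; without one, each call appends a new, equally valid message.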

Failure summary:

This queue stays correct if:

  • enqueue is durable/idempotent as needed
  • in-flight claims are exclusive and time-bounded
  • deletes are guarded by current receipt handle/epoch
  • re-visibility happens only after current timeout expiry
  • stale shard owners and stale consumers are fenced

Step 9 — Scale Adjustments #

| Hotspot | Type | First response |
|---|---|---|
| hot shard with many sends/receives | contention hotspot | increase shard count and rebalance queues across shards |
| in-flight visibility scans | write throughput hotspot | bucket expiries by time and scan incrementally instead of full scans |
| delete/receive pressure on same shard | write throughput hotspot | isolate hot queues, batch receives/deletes where possible |
| ownership churn during failures | contention hotspot | stabilize leases/elections and avoid aggressive reassignment |
| approximate depth/status queries | read hotspot | keep them as derived views only, not source-path reads |
| large message backlog growth | storage growth hotspot | segment storage, compaction/retention for deleted messages, archive/DLQ paths |

What scales well:

This queue scales by:

  • sharding queues/messages
  • giving each shard one authoritative owner
  • keeping send/receive/delete local to that owner
  • using time-bucketed visibility indexes for timeout recovery

Throughput grows mainly with:

  • number of shards
  • balance of queue traffic across shards
  • efficiency of visibility-timeout management
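As one possible shape for a time-bucketed visibility index, the sketch below groups in-flight message ids by fixed-width time bucket so a scan touches only buckets that are entirely in the past. The `ExpiryBuckets` name and the bucket width are assumptions, and it presumes expiries are never earlier than the time they are added.

```python
from collections import defaultdict


class ExpiryBuckets:
    """Hypothetical expiry index: message_ids grouped by fixed-width bucket."""

    def __init__(self, bucket_seconds=1):
        self.bucket_seconds = bucket_seconds
        self.buckets = defaultdict(set)   # bucket number -> message_ids

    def add(self, message_id, expiry):
        self.buckets[int(expiry // self.bucket_seconds)].add(message_id)

    def drain_expired(self, now):
        """Remove and return ids from buckets that ended at or before `now`."""
        current = int(now // self.bucket_seconds)
        expired = []
        # sorted() materializes the keys first, so popping below is safe.
        for bucket in sorted(b for b in self.buckets if b < current):
            expired.extend(sorted(self.buckets.pop(bucket)))
        return expired
```

The trade-off is precision for cost: a message can stay hidden up to one bucket width past its expiry, and every id returned is only a candidate that still has to pass the guarded re-visibility check against its current DeliveryState.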

What fails first:

Usually:

  • hot queues collapsing onto one shard
  • timeout scan inefficiency
  • stale-owner fencing mistakes
  • delete/receive contention on delivery-state hot path

Canonical design conclusion:

The mechanical outcome is:

  • primary state:
    • QueueMessage
    • DeliveryState
    • PartitionOwnership
    • PartitionMap
  • critical invariants:
    • enqueue durability/idempotency as required
    • one active in-flight holder per message
    • delete valid only for current receipt handle/epoch
    • timed-out messages become visible again safely
  • mechanisms:
    • append log
    • lease
    • guarded delete/re-visibility transitions
    • fenced shard ownership
  • reads:
    • direct authoritative reads for message/delivery state
    • projections only for approximate queue depth/status

Polished interview answer:

“I’d design the SQS-class queue as a sharded service with one authoritative owner per shard. Sending a message is an append-only enqueue fact. Receiving a message is an exclusive claim that moves its delivery state from visible to in-flight under a visibility timeout, and delete is a guarded transition that is valid only for the current receipt handle or epoch. If a consumer crashes or never deletes, the timeout path safely returns the message to visible, which gives at-least-once delivery. The main scaling levers are more shards, hot-queue isolation, and efficient time-bucketed visibility scanning.”

Concrete Substrate #

I’ll choose a service-owned sharded queue with shard leaders as the concrete baseline, because that fits the mechanics we derived:

  • append-only enqueue
  • lease-like in-flight claim
  • guarded delete/re-visibility
  • one owner per shard

Concrete substrate:

  • queue service cluster
  • queues split into shards
  • one leader/owner per shard
  • shard leader stores:
    • append-only message records
    • current DeliveryState per message
    • ready queue / visible index
    • in-flight expiry index
  • followers replicate shard mutations
  • metadata layer tracks:
    • shard ownership
    • routing map

Concrete tech family:

  • service in Go or Java
  • local durable storage:
    • RocksDB or log-segment files
  • shard replication:
    • Raft or leader-follower replication with commit index
  • metadata/control:
    • etcd or internal metadata quorum

This makes for a stronger answer than “just use SQS” because it exposes the mechanics, while still being concrete.

Operation Layer #

1. Send message #

API

  • SendMessage(queue_id, payload, request_id?)

Initiator

  • producer/client

Entry point

  • gateway or any queue node

Authoritative decider

  • current leader for target shard

Precondition

  • valid shard routing
  • optional dedup request id if producer idempotency is required

Transition

  • append QueueMessage
  • initialize DeliveryState = VISIBLE

Response

  • {message_id}

Failure cases

  • stale routing -> retry with updated shard map
  • response loss -> retry may duplicate unless request_id dedup is used

Sequence

  1. client sends SendMessage
  2. entry node picks shard for queue
  3. forwards to shard leader
  4. leader appends message record
  5. leader sets DeliveryState = VISIBLE
  6. mutation commits to replicas
  7. leader replies with message_id
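Step 6 of the sequence (the mutation commits to replicas before the leader replies) can be sketched as a majority check over replica progress. This assumes a Raft-style log where each replica reports the highest index it has durably stored. The function names are hypothetical, and the sketch leaves out Raft's additional rule about only counting entries from the leader's current term.

```python
def commit_index(match_index):
    """Highest log index stored on a majority of replicas.

    `match_index` lists, for every replica including the leader, the highest
    log index that replica has durably stored.
    """
    ordered = sorted(match_index, reverse=True)
    majority = len(ordered) // 2 + 1
    return ordered[majority - 1]


def can_reply(entry_index, match_index):
    """The leader may acknowledge SendMessage once its entry is committed."""
    return entry_index <= commit_index(match_index)
```

For example, with three replicas at indexes `[7, 5, 5]` the commit index is 5, so a send appended at index 7 is not yet acknowledged to the producer.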

2. Receive message #

API

  • ReceiveMessage(queue_id, max_messages, visibility_timeout)

Initiator

  • consumer/client

Entry point

  • gateway or any queue node

Authoritative decider

  • current shard leader(s) serving that queue

Precondition

  • message chosen must currently be VISIBLE

Transition

  • selected message:
    • VISIBLE -> IN_FLIGHT(receipt_handle, consumer_id?, expiry)
  • remove from visible index
  • insert into expiry index

Response

  • {messages: [{message_id, payload, receipt_handle, visibility_timeout_expiry}]}

Failure cases

  • no visible message -> empty response / long poll timeout
  • stale owner -> retry after redirect
  • duplicate receive later is allowed if timeout expires

Sequence

  1. consumer sends ReceiveMessage
  2. entry node resolves shard(s)
  3. shard leader selects visible messages
  4. for each selected message, leader creates new receipt handle/epoch
  5. leader commits delivery-state transitions
  6. leader returns messages + receipt handles

3. Delete message #

API

  • DeleteMessage(queue_id, receipt_handle)

Initiator

  • consumer/client

Entry point

  • gateway or any queue node

Authoritative decider

  • shard leader owning the message

Precondition

  • current DeliveryState is IN_FLIGHT
  • receipt_handle matches current in-flight holder/epoch

Transition

  • IN_FLIGHT -> DELETED

Response

  • success, or a rejection that leaves state unchanged

Failure cases

  • stale receipt handle -> reject
  • message already re-visible and re-claimed -> reject stale delete

Sequence

  1. consumer sends DeleteMessage
  2. entry node routes to owning shard
  3. leader validates current receipt handle
  4. leader commits DELETED transition
  5. leader removes message from expiry tracking
  6. response returned

4. Timeout re-visibility #

API

  • internal background process

Initiator

  • system

Entry point

  • shard leader

Authoritative decider

  • shard leader

Precondition

  • message is still IN_FLIGHT
  • expiry <= now
  • receipt handle/epoch still current

Transition

  • IN_FLIGHT -> VISIBLE
  • remove from expiry index
  • add back to visible index

Response

  • internal success

Failure cases

  • message already deleted -> no-op
  • message handle changed -> stale timeout worker no-op

Sequence

  1. leader scans nearest expiry bucket / min-heap
  2. finds expired in-flight messages
  3. validates state still matches current receipt handle/epoch
  4. commits VISIBLE transition
  5. reinserts into visible queue
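The scan steps above can be sketched with a min-heap of expiries. The tuple layouts for `expiry_heap` and `delivery_state` are assumptions made for the example; the point is that the worker re-checks current state before every transition, so stale heap entries become no-ops.

```python
import heapq


def scan_expired(expiry_heap, delivery_state, visible_queue, now):
    """Pop due entries and re-expose only those still current.

    expiry_heap: heap of (expiry, message_id, receipt_handle) tuples.
    delivery_state: message_id -> ("VISIBLE",)
                                | ("IN_FLIGHT", receipt_handle, expiry)
                                | ("DELETED",)
    """
    revived = []
    while expiry_heap and expiry_heap[0][0] <= now:
        expiry, message_id, handle = heapq.heappop(expiry_heap)
        current = delivery_state.get(message_id)
        # Guard: the entry must describe the message's current in-flight claim.
        if current != ("IN_FLIGHT", handle, expiry):
            continue   # deleted or re-claimed since: stale entry, no-op
        delivery_state[message_id] = ("VISIBLE",)
        visible_queue.append(message_id)
        revived.append(message_id)
    return revived
```

Because deletes do not have to remove heap entries eagerly, the heap can hold stale tuples; the guard is what keeps them harmless.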

5. Route to shard owner #

API

  • internal lookup:
    • ResolveShard(queue_id, message_key?)

Initiator

  • entry node

Entry point

  • local router

Authoritative decider

  • authoritative shard map

Precondition

  • routing table reasonably current

Transition

  • none

Response

  • {shard_id, leader_node}

Failure cases

  • stale map -> redirect/retry
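A sketch of lookup plus redirect-on-stale-map. The note does not fix how a queue maps to a shard, so the hash-based `shard_for`, the `NotOwner` exception, and the `call_node` callback are all assumptions; `cached_map` is modeled as a list of leader names indexed by shard id.

```python
import hashlib


def shard_for(queue_id, shard_count):
    """Stable hash routing: the same queue_id always maps to the same shard."""
    digest = hashlib.sha256(queue_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class NotOwner(Exception):
    """Raised by a node that does not own the shard; names the current owner."""

    def __init__(self, current_leader):
        super().__init__(current_leader)
        self.current_leader = current_leader


def route(queue_id, cached_map, call_node, max_redirects=3):
    """Call the cached leader; on NotOwner, refresh the entry and retry."""
    shard_id = shard_for(queue_id, len(cached_map))
    for _ in range(max_redirects):
        leader = cached_map[shard_id]
        try:
            return call_node(leader, shard_id)
        except NotOwner as redirect:
            cached_map[shard_id] = redirect.current_leader
    raise RuntimeError("routing did not converge")
```

The cached map is allowed to be stale because the owner, not the router, is the authority: a wrong guess costs one redirect rather than a wrong write.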

6. Reassign shard ownership #

API

  • internal failover flow:
    • AcquireShardLease(shard_id)
    • or Raft leader election

Initiator

  • system

Entry point

  • follower/candidate node or coordination layer

Authoritative decider

  • shard quorum or metadata lease store

Precondition

  • current owner failed or relinquished
  • candidate replica sufficiently current

Transition

  • new owner/leader epoch established
  • old owner fenced

Response

  • updated ownership metadata

Failure cases

  • split election
  • stale owner resumes and must be rejected

Sequence

  1. leader fails or loses lease
  2. follower/quorum elects new leader
  3. ownership metadata updated
  4. new leader resumes receive/delete/send
  5. old leader fenced by higher epoch/term
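Steps 3 and 5 can be sketched as a compare-and-set on the ownership record plus an epoch check at the storage layer. Both classes are hypothetical, and the sketch does not model the "owner failed" or "candidate sufficiently current" preconditions; it only shows how a higher epoch fences the old leader.

```python
class OwnershipStore:
    """Hypothetical metadata store: one (owner, epoch) record per shard."""

    def __init__(self):
        self.records = {}   # shard_id -> (owner, epoch)

    def acquire(self, shard_id, candidate, expected_epoch):
        """Install `candidate` only if the record is still at `expected_epoch`.

        Returns the new epoch, or None if another candidate already won.
        """
        _, epoch = self.records.get(shard_id, (None, 0))
        if epoch != expected_epoch:
            return None
        self.records[shard_id] = (candidate, epoch + 1)
        return epoch + 1


class ShardStorage:
    """Hypothetical replica-side check that rejects writes from older epochs."""

    def __init__(self):
        self.highest_epoch = 0
        self.writes = []

    def apply(self, epoch, mutation):
        if epoch < self.highest_epoch:
            return False    # stale owner is fenced
        self.highest_epoch = epoch
        self.writes.append(mutation)
        return True
```

If node-a acquires epoch 1 and node-b later acquires epoch 2, any write node-a attempts after node-b's first write is rejected, even if node-a still believes it is the leader.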

Entry Point vs Decider vs Responder #

| Path | Entry point | Authoritative decider | Physical responder | Logical responder |
|---|---|---|---|---|
| SendMessage | gateway / any node | shard leader | leader or front node | queue service |
| ReceiveMessage | gateway / any node | shard leader | leader or front node | queue service |
| DeleteMessage | gateway / any node | shard leader | leader or front node | queue service |
| timeout recovery | shard leader | shard leader | internal | queue service |
| shard failover | follower / coordination layer | shard quorum / lease store | new leader / control plane | queue service |

Concrete HLD #

Main components:

  • gateway/router
    • resolves queue/shard
    • forwards client requests
  • shard leader
    • authoritative owner of message lifecycle for shard
    • manages visible and in-flight indexes
  • shard followers
    • replicate shard mutations
  • metadata/control service
    • tracks shard ownership and routing
  • background timeout worker
    • usually runs on shard leader using expiry index

Concrete Technology Realizations #

Stronger infra-native answer #

  • Go or Java queue service
  • RocksDB or log-segment files for durable message and state storage
  • Raft or leader-follower replication per shard
  • etcd or internal metadata quorum for shard ownership/routing

Simpler pragmatic variant #

  • broker service plus persistent store
  • internal receipt-handle state in DB/LSM
  • still require leader/shard ownership and timeout scanning

Short interview version:

“I’d build the SQS-class queue as a sharded service with one leader per shard. Enqueue appends an immutable message record and marks it visible. Receive is a lease-like claim that makes the message invisible and assigns a receipt handle with an expiry. Delete is a guarded transition that succeeds only for the current receipt handle, and if the timeout expires first, the shard leader safely moves the message back to visible. Replication happens per shard, and failover promotes only a sufficiently current replica.”