Publisher Confirms — Delivery Confirmation from Broker to Publisher
Consumer acknowledgments (Chapter 9) tell the broker that a consumer processed a message. Publisher confirms go in the opposite direction: they tell the producer that the broker has received the message and taken responsibility for it, which for a persistent message on a durable queue includes writing it to disk. Without publisher confirms, a producer cannot know whether a published message was actually accepted by the broker.
The Problem Publisher Confirms Solve #
Without confirms, a producer publishes and immediately moves on:
Producer → basic.publish → Broker
← no acknowledgment →
Producer assumes delivery... but did it succeed?
If the broker is under memory pressure and rejects the publish, the producer does not know. If the TCP connection drops after the producer sends the frame but before the broker writes to disk, the message is lost. The producer has no way to distinguish success from silent failure.
What can go wrong:
- Broker exceeds memory high-watermark → publishing connections are blocked and publishes stall.
- Message published to an exchange with no matching binding → discarded silently unless mandatory=True.
- Broker crashes between receiving message and persisting to disk → message lost.
- Connection fails while frames are still in client or socket buffers → those frames never reach the broker.
Publisher confirms provide the feedback loop: the broker sends an ack (or nack) for each confirmed message, giving the producer actionable confirmation.
Enabling Confirm Mode #
Confirm mode is enabled per channel by sending the AMQP method confirm.select, which client libraries wrap in a method call. (tx.select is a different method: it starts transaction mode and does not enable confirms.)
In pika:
channel.confirm_delivery()
In RabbitMQ Java client:
channel.confirmSelect();
After confirm.select, the channel is in confirm mode. Every basic.publish on this channel will receive an async ack or nack from the broker. Confirm mode and transaction mode (tx.select) are mutually exclusive on a channel.
The confirm.select protocol:
Client → Broker: confirm.select (no-wait=False)
Broker → Client: confirm.select-ok
--- channel now in confirm mode ---
Client → Broker: basic.publish (delivery_tag=1, tracked internally)
Client → Broker: basic.publish (delivery_tag=2)
Broker → Client: basic.ack (delivery_tag=2, multiple=True)
← both messages 1 and 2 acknowledged →
Every basic.publish on a confirm-mode channel has an implicit delivery tag — a monotonically increasing sequence number, independent of consumer delivery tags. The broker acknowledges with basic.ack or basic.nack using this sequence number.
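The sequence numbering in the trace above can be mimicked on the publisher side. The following is a minimal, library-independent sketch: `ConfirmTracker` and its method names are hypothetical (they are not part of pika or any AMQP client), and it assumes the sequence starts at 1 for the first publish after confirm.select.

```python
class ConfirmTracker:
    """Publisher-side bookkeeping for confirm sequence numbers (hypothetical helper)."""

    def __init__(self):
        self.next_tag = 1          # assumed: first publish after confirm.select gets tag 1
        self.unconfirmed = {}      # delivery_tag -> message body

    def record_publish(self, body):
        """Assign the next sequence number to a message and remember it."""
        tag = self.next_tag
        self.next_tag += 1
        self.unconfirmed[tag] = body
        return tag

    def resolve(self, delivery_tag, multiple):
        """Remove and return the messages covered by one basic.ack or basic.nack."""
        if multiple:
            # multiple=True settles every tag up to and including delivery_tag
            tags = sorted(t for t in self.unconfirmed if t <= delivery_tag)
        else:
            tags = [delivery_tag] if delivery_tag in self.unconfirmed else []
        return {t: self.unconfirmed.pop(t) for t in tags}


tracker = ConfirmTracker()
tracker.record_publish(b"first")    # tag 1
tracker.record_publish(b"second")   # tag 2
tracker.record_publish(b"third")    # tag 3
settled = tracker.resolve(delivery_tag=2, multiple=True)
print(sorted(settled))              # tags settled by the single ack
print(sorted(tracker.unconfirmed))  # tags still awaiting a confirm
```

A single ack with multiple=True settles tags 1 and 2 here, while tag 3 stays unconfirmed until a later ack or nack covers it.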
What the Broker Acks and Nacks #
Ack (basic.ack from broker to producer):
- For persistent messages on durable queues: the broker has written the message to disk. For replicated queues, the replicas have also accepted it: all mirrors of a classic mirrored queue, or a majority of the replicas of a quorum queue.
- For non-persistent messages: the broker has accepted the message into memory.
- For messages on transient queues: the broker has accepted the message.
Nack (basic.nack from broker to producer):
- The broker could not accept the message. This is rare and typically indicates an internal broker error.
- A message published with mandatory=True that matched no binding is returned via basic.return, not nacked. The nack is for internal broker failures, not routing failures.
Timing: for persistent messages on quorum queues, the ack is sent after Raft consensus — after a majority of quorum replicas have written the message. This provides strong durability guarantees but higher latency than non-replicated queues.
Synchronous Confirms #
The simplest confirm strategy: publish one message, wait for the ack before publishing the next.
import pika

# Assumes `channel` is a pika BlockingChannel and `messages` is an iterable of bytes.
channel.confirm_delivery()
for message in messages:
    channel.basic_publish(
        exchange='orders',
        routing_key='order.created',
        body=message,
        properties=pika.BasicProperties(delivery_mode=2)
    )
    # basic_publish returns only after the broker has confirmed this message;
    # pika's BlockingChannel waits for the confirm internally.
In pika’s BlockingConnection, confirm_delivery() makes basic_publish synchronous — it blocks until the broker acks or raises an exception on nack.
Throughput: synchronous confirms achieve roughly one message per round-trip latency. For a 1ms round-trip, that is ~1000 messages/second. Slow for high-throughput applications but simple to reason about.
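The estimate above is the reciprocal of the round-trip time. As a rough sketch (the function names are illustrative, and the model ignores broker processing and disk latency), the same arithmetic extends to a window of several unconfirmed messages in flight:

```python
def sync_confirm_rate(round_trip_seconds):
    """Upper bound on messages/second when each publish waits for its confirm."""
    return 1.0 / round_trip_seconds


def windowed_confirm_rate(round_trip_seconds, window):
    """Upper bound when up to `window` publishes may be unconfirmed at once."""
    return window / round_trip_seconds


print(sync_confirm_rate(0.001))            # 1 ms round trip, one message in flight
print(windowed_confirm_rate(0.001, 100))   # same link, 100 messages in flight
```

Both figures are ceilings set by latency alone. Real throughput is lower once the broker's own work per message is counted.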
Asynchronous Confirms #
The high-throughput strategy: publish in batches, collect acks asynchronously, track which messages are outstanding.
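import threading
import pika

class AsyncConfirmProducer:
    """Tracks unconfirmed publishes on a channel from one of pika's asynchronous
    adapters (e.g. SelectConnection), where confirm_delivery() takes a callback."""

    def __init__(self, channel):
        self.channel = channel
        self.outstanding = {}  # delivery_tag → message
        self.next_tag = 1      # confirm sequence numbers start at 1
        self.lock = threading.Lock()
        channel.add_on_return_callback(self._on_return)
        channel.confirm_delivery(self._on_confirm)

    def publish(self, exchange, routing_key, body, properties):
        # Count publishes locally to mirror the broker's sequence numbers
        # (the Java client exposes the same value as getNextPublishSeqNo()).
        with self.lock:
            delivery_tag = self.next_tag
            self.next_tag += 1
            self.outstanding[delivery_tag] = (exchange, routing_key, body, properties)
        self.channel.basic_publish(exchange, routing_key, body, properties)

    def _on_confirm(self, method_frame):
        method = method_frame.method  # pika.spec.Basic.Ack or pika.spec.Basic.Nack
        with self.lock:
            if method.multiple:
                # Ack/nack all tags up to and including method.delivery_tag
                tags = [t for t in self.outstanding if t <= method.delivery_tag]
            else:
                tags = [method.delivery_tag]
            settled = {t: self.outstanding.pop(t) for t in tags if t in self.outstanding}
        if isinstance(method, pika.spec.Basic.Nack):
            # Handle every nacked message, not only the tag named in the frame
            for tag, message in settled.items():
                self._handle_nack(tag, message)

    def _on_return(self, channel, method, properties, body):
        # Message was returned (mandatory=True, no route)
        self._handle_unroutable(method, body)

    def _handle_nack(self, delivery_tag, message):
        pass  # Placeholder: log, retry, or dead-letter

    def _handle_unroutable(self, method, body):
        pass  # Placeholder: log, alert, or re-route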
The outstanding message map: the producer maintains a map from delivery_tag to message. When the broker acks (possibly with multiple=True), acknowledged entries are removed. On nack, the producer retries or dead-letters.
Bulk acks: the broker frequently sends acks with multiple=True, acknowledging all outstanding messages up to and including the given delivery_tag. This reduces ack traffic significantly.
Handling Returns (mandatory=True) #
When publishing with mandatory=True, the broker returns the message via basic.return if no binding matches:
# Define the callback before registering it.
def on_return(channel, method, properties, body):
    # method.reply_code: 312 (NO_ROUTE)
    # method.reply_text: 'NO_ROUTE'
    # method.exchange, method.routing_key: original routing info
    print(f"Unroutable message: {method.routing_key}")
    # Save to DLQ, alert, or retry with correct routing key

channel.add_on_return_callback(on_return)
channel.basic_publish(
    exchange='events',
    routing_key='unknown.type',
    body=b'message',
    mandatory=True  # return if unroutable
)
basic.return is sent before basic.ack. The producer receives both for a returned message:
1. basic.return (the message could not be routed)
2. basic.ack (the broker has processed the publish: it was returned, not lost)
Without mandatory=True, unroutable messages are silently discarded. This is a common source of silent message loss in misconfigured topologies.
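Because the return arrives before the ack, a publisher can decide at ack time whether a message was routed. The sketch below illustrates one way to do that under stated assumptions: `ReturnAwareTracker` is a hypothetical helper, each message carries a unique message_id property chosen by the application, and only single (multiple=False) acks are handled to keep it short.

```python
class ReturnAwareTracker:
    """Classifies each confirmed publish as routed or unroutable (hypothetical helper)."""

    def __init__(self):
        self.pending = {}        # delivery_tag -> message_id
        self.returned = set()    # message_ids seen in basic.return
        self.next_tag = 1

    def on_publish(self, message_id):
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = message_id
        return tag

    def on_return(self, message_id):
        # basic.return carries the message and its properties, not the delivery tag,
        # so the application-level message_id is used for correlation here.
        self.returned.add(message_id)

    def on_ack(self, delivery_tag):
        """Return (message_id, outcome) once the broker confirms the publish."""
        message_id = self.pending.pop(delivery_tag)
        if message_id in self.returned:
            self.returned.discard(message_id)
            return message_id, "unroutable"
        return message_id, "routed"


tracker = ReturnAwareTracker()
tag_a = tracker.on_publish("msg-a")
tag_b = tracker.on_publish("msg-b")
tracker.on_return("msg-b")          # arrives before the ack for msg-b
result_a = tracker.on_ack(tag_a)
result_b = tracker.on_ack(tag_b)
print(result_a)
print(result_b)
```

The ordering guarantee is what makes this safe: by the time the ack for a tag is seen, any return for that message has already been recorded.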
Confirms vs Transactions #
AMQP 0-9-1 has both transactions (tx.select / tx.commit / tx.rollback) and confirms. They are not the same:
| Aspect | Transactions | Publisher confirms |
|---|---|---|
| Protocol | AMQP 0-9-1 standard | RabbitMQ extension (now widely supported) |
| Mode | Synchronous | Asynchronous |
| Batch | Commit groups publishes atomically | Acks are per-message |
| Rollback | Yes — unpublish batched messages | No — no rollback |
| Throughput | ~20x slower than unconfirmed | ~2-5x slower than unconfirmed |
| Durability guarantee | Same as confirms for durable queues | Same as transactions for durable queues |
Why transactions are slow: tx.commit is synchronous — the broker fsync’s all committed messages before responding. This is one fsync per commit, per channel. Confirms can batch multiple messages into one fsync cycle.
RabbitMQ’s recommendation: use publisher confirms for durability. Transactions are legacy; confirms are the modern approach.
Confirm-Mode Best Practices #
1. Track outstanding message count: if the outstanding map grows unboundedly (the broker is slow or disconnected), the producer’s memory usage grows. Implement a backpressure threshold:
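MAX_OUTSTANDING = 1000
if len(outstanding) >= MAX_OUTSTANDING:
    # Pause publishing; wait for acks to drain.
    # wait_for_confirms() is a placeholder for application code that blocks
    # until the confirm handler has removed entries from `outstanding`.
    wait_for_confirms()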
2. Handle nacks by retrying: nacks are rare but must be handled. A simple strategy: re-enqueue nacked messages to a local retry queue and re-publish after a brief delay.
3. Close channel on unknown nack: if the broker nacks messages for reasons the producer doesn’t understand, close the channel and reconnect. This may indicate a broker overload or configuration issue.
4. Set delivery-mode=2 for confirmed messages: confirms without persistence are a half-measure. An ack for a non-persistent message only means the broker has accepted it into memory; if the broker crashes before the consumer reads it, the message is lost. For true durability:
- Durable queue
- Persistent message (delivery-mode=2)
- Publisher confirms
All three are required for crash-safe messaging.
5. Confirms on quorum queues: a quorum queue’s ack is sent after Raft majority write — stronger than a classic durable queue’s ack (which acks after single-node write). For the highest durability guarantees, use quorum queues with confirms.
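The backpressure threshold from the first practice above can be sketched with a bounded semaphore. `PublishWindow` is a hypothetical helper, not a pika API, and the sketch assumes confirms are handled on a different thread from the one that publishes; blocking inside a single-threaded event loop would stall the confirms themselves.

```python
import threading


class PublishWindow:
    """Caps the number of unconfirmed publishes (hypothetical helper)."""

    def __init__(self, max_outstanding):
        self._slots = threading.BoundedSemaphore(max_outstanding)

    def acquire(self, timeout=None):
        """Call before publishing; blocks while the window is full.

        Returns False if no slot became free within `timeout` seconds.
        """
        return self._slots.acquire(timeout=timeout)

    def release(self, count=1):
        """Call from the confirm handler, once per settled message."""
        for _ in range(count):
            self._slots.release()


window = PublishWindow(max_outstanding=2)
first = window.acquire(timeout=0.01)     # slot 1 taken
second = window.acquire(timeout=0.01)    # slot 2 taken
third = window.acquire(timeout=0.01)     # window full: times out
window.release()                         # one confirm arrived
fourth = window.acquire(timeout=0.01)    # a slot is available again
print(first, second, third, fourth)
```

When an ack with multiple=True settles several messages at once, the handler would call `release(count=n)` with the number of entries it removed from the outstanding map.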
Confirms and Channel Recovery #
When a channel closes (connection drop, error), all unconfirmed publishes are unresolved — the producer does not know if they were committed before the close. On reconnect, the producer should assume the unconfirmed messages were not committed and re-publish them. This may result in duplicate messages.
The at-least-once pattern for confirmed publishing:
publish → wait for ack → confirmed: remove from pending
                       → nack: retry
channel closes → all pending assumed lost → re-publish all pending
consumer is idempotent → duplicate handling at consumer side
This is the production-grade pattern for AMQP producers.
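The pattern can be simulated without a broker. In this sketch `PendingStore` and `IdempotentConsumer` are hypothetical names, messages are keyed by an application-chosen message_id, and the deduplication set lives in memory (a real consumer would typically need a persistent or bounded store).

```python
class PendingStore:
    """Keeps every message until the broker acks it (hypothetical helper)."""

    def __init__(self):
        self.pending = {}   # message_id -> body

    def on_publish(self, message_id, body):
        self.pending[message_id] = body

    def on_ack(self, message_id):
        self.pending.pop(message_id, None)

    def on_channel_closed(self):
        """Return everything that must be re-published on the new channel."""
        return list(self.pending.items())


class IdempotentConsumer:
    """Processes each message_id at most once, dropping re-published duplicates."""

    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, message_id, body):
        if message_id in self.seen:
            return False            # duplicate: already processed
        self.seen.add(message_id)
        self.processed.append(body)
        return True


store = PendingStore()
consumer = IdempotentConsumer()

store.on_publish("order-1", b"created")
store.on_publish("order-2", b"created")
store.on_ack("order-1")                     # confirmed before the failure
consumer.handle("order-1", b"created")
consumer.handle("order-2", b"created")      # broker delivered it, but the ack never reached the producer

to_republish = store.on_channel_closed()    # only order-2 is unresolved
for message_id, body in to_republish:
    consumer.handle(message_id, body)       # the duplicate of order-2 is dropped

print(to_republish)
print(len(consumer.processed))
```

The producer re-publishes order-2 because it cannot tell a lost message from a lost ack, and the consumer's deduplication absorbs the resulting duplicate.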
Summary #
Publisher confirms give producers actionable feedback on message acceptance. A broker ack for a persistent message on a durable queue means the message is on disk (for classic queues) or committed to a Raft majority (for quorum queues). Asynchronous confirms with an outstanding-message tracking map maximize throughput while maintaining safety. mandatory=True with a return callback catches unroutable messages. Transactions are a slower, less flexible alternative. Combined with durable queues and persistent messages, confirms provide crash-safe end-to-end guarantees.