Event-Driven Architecture with Azure Service Bus
Tightly coupled services are fragile. If Service A calls Service B synchronously and Service B is unavailable, Service A's operation fails. If an order processing service calls the inventory service, the notification service, and the fulfilment service in sequence, a failure in any one of them fails the whole order. The caller waits for every dependency; every dependency's failure becomes the caller's problem.
Event-driven architecture decouples this. Service A emits an event (an order was placed) and continues. Every other service that cares about that event processes it independently, in its own time, without Service A needing to know they exist. Azure Service Bus is the message broker that makes this work reliably: it stores the event durably until each interested service has processed it, and it holds messages while consumers are temporarily unavailable. The guarantee is at-least-once delivery: a message stays in its queue or subscription until a consumer completes it, it expires, or it is dead-lettered.
Queues vs topics: one consumer vs many
Service Bus supports two messaging models:
Queues deliver each message to exactly one consumer. When multiple instances of the same consumer are running (horizontal scaling), Service Bus distributes messages across them in a competing-consumer pattern. Use queues when one service processes each work item.
Topics and subscriptions deliver each message to multiple consumers. A topic is the publish-subscribe equivalent of a queue: a producer sends a message once to the topic, and each subscription gets its own copy. Each subscription is processed by one consumer (or one scaled-out group of competing consumers). Use topics when multiple services each need to act on the same event.
For the order processing example:
- Producer: Order Service publishes an order-placed event to the orders topic
- Subscription 1: Inventory Service receives a copy and decrements stock
- Subscription 2: Notification Service receives a copy and emails the customer
- Subscription 3: Fulfilment Service receives a copy and creates a pick list
All three run independently. If the Notification Service is temporarily unavailable, messages accumulate in its subscription. When it recovers, it processes them. The order operation completed successfully and the other services were unaffected.
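The fan-out behaviour described above can be modelled in a few lines. This is a minimal in-memory sketch in Python, not the Service Bus SDK: the `Topic` class, its method names, and the order IDs are hypothetical and exist only to show that each subscription holds its own copy and its own backlog.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a topic: every subscription gets its own copy."""

    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name):
        self.subscriptions[name] = deque()

    def publish(self, event):
        for backlog in self.subscriptions.values():
            backlog.append(dict(event))  # independent copy per subscription

    def drain(self, name):
        """Hand the subscription's backlog to its consumer, oldest first."""
        backlog = self.subscriptions[name]
        delivered = []
        while backlog:
            delivered.append(backlog.popleft())
        return delivered


orders = Topic()
for name in ("inventory", "notification", "fulfilment"):
    orders.subscribe(name)

orders.publish({"order_id": 1001})
orders.publish({"order_id": 1002})

# Inventory and fulfilment keep up; notification is offline for now.
inventory_seen = orders.drain("inventory")
fulfilment_seen = orders.drain("fulfilment")
notification_backlog = len(orders.subscriptions["notification"])

# Notification recovers later and works through what accumulated.
notification_seen = orders.drain("notification")
```

The publisher never references a consumer by name, and the notification backlog has no effect on the other two subscriptions.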
Subscription filters
Not every subscription needs every message. Service Bus subscription filters let a subscription receive only messages that match a condition. This avoids the alternative pattern where every service receives every message and discards the ones it does not care about.
Filters operate on message properties (the application-level metadata, not the body):
SQL filter: An expression evaluated against the message's user-defined properties. A fulfilment service subscription might filter for OrderType = 'physical' to receive only orders that require physical fulfilment, while a digital delivery subscription filters for OrderType = 'digital'.
Correlation filter: A highly efficient equality match on specific properties. For high-throughput scenarios, correlation filters are more performant than SQL filters.
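Boolean filter: TrueFilter (all messages) or FalseFilter (no messages); the Azure.Messaging.ServiceBus .NET SDK names these TrueRuleFilter and FalseRuleFilter. A new subscription starts with a default rule that uses the true filter, so it receives everything until that rule is replaced.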
To set a filter via Azure CLI:
az servicebus topic subscription rule create \
  --resource-group myRG \
  --namespace-name myNamespace \
  --topic-name orders \
  --subscription-name fulfilment \
  --name physical-orders-only \
  --filter-sql-expression "OrderType = 'physical'"
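Filters are evaluated on the broker; only matching messages are copied into the subscription. A message that matches none of a subscription's rules is never added to that subscription, and every other subscription evaluates the same message against its own rules independently. A new subscription starts with a default rule named $Default that matches every message, and a subscription receives a message when any of its rules match, so remove $Default when adding a narrower rule like the one above.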
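To make the matching rules concrete, here is a small Python sketch of correlation-style filtering on message properties. It is a model of the behaviour, not broker code: the function names and the `audit` subscription are hypothetical, and only equality matching is shown (SQL expressions are not modelled).

```python
def correlation_match(rule, properties):
    """Correlation-style rule: every listed property must equal the given value."""
    return all(properties.get(key) == value for key, value in rule.items())


def route(properties, subscriptions):
    """Return the subscriptions that would receive a copy of the message.

    A subscription matches when at least one of its rules matches.
    """
    return sorted(
        name
        for name, rules in subscriptions.items()
        if any(correlation_match(rule, properties) for rule in rules)
    )


subscriptions = {
    "fulfilment": [{"OrderType": "physical"}],
    "digital-delivery": [{"OrderType": "digital"}],
    "audit": [{}],  # hypothetical catch-all: an empty rule matches everything
}

physical = route({"OrderType": "physical", "Region": "UK"}, subscriptions)
digital = route({"OrderType": "digital"}, subscriptions)
untyped = route({"Region": "UK"}, subscriptions)
```

A message without an `OrderType` property reaches only the catch-all subscription, which is why producers need to set the properties that filters depend on.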
Message sessions for ordered processing
By default, Service Bus delivers messages to consumers in no guaranteed order. Multiple competing consumers process messages as they arrive, which maximises throughput but loses ordering.
For workflows where order matters (a financial transaction sequence, a state machine that processes steps in a specific order), Service Bus sessions enforce ordering per session. Messages in the same session are delivered to the same consumer, in order.
Set the session ID on the message when publishing:
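using System.Text;
using Azure.Messaging.ServiceBus;

// sender is a ServiceBusSender for the topic; payload is the serialized JSON body.
var message = new ServiceBusMessage(Encoding.UTF8.GetBytes(payload))
{
    SessionId = orderId.ToString(), // all messages for the same order go to the same consumer
    ContentType = "application/json"
};
await sender.SendMessageAsync(message);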
The queue or subscription must be created with sessions enabled, and the consumer must use a session receiver or session processor rather than a standard one:
// client is a ServiceBusClient. HandleMessage takes ProcessSessionMessageEventArgs;
// HandleError takes ProcessErrorEventArgs.
await using var sessionProcessor = client.CreateSessionProcessor(
    topicName, subscriptionName,
    new ServiceBusSessionProcessorOptions());
sessionProcessor.ProcessMessageAsync += HandleMessage;
sessionProcessor.ProcessErrorAsync += HandleError;
await sessionProcessor.StartProcessingAsync();
Messages with the same SessionId are processed by the same consumer instance, in order, for the duration of the session. Unrelated sessions are distributed across available consumers for parallelism.
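The effect of session pinning can be sketched without a broker. The Python model below is illustrative only: `dispatch_by_session` is a hypothetical helper, and assigning sessions round-robin is an assumption made for the sketch (in Service Bus, a session goes to whichever receiver acquires its lock).

```python
from collections import defaultdict


def dispatch_by_session(messages, consumers):
    """Pin each session to one consumer; keep arrival order within a session."""
    owner = {}                   # session_id -> consumer holding the session
    handled = defaultdict(list)  # consumer -> [(session_id, step), ...]
    for session_id, step in messages:
        if session_id not in owner:
            # Assumption for this sketch: new sessions are assigned round-robin.
            owner[session_id] = consumers[len(owner) % len(consumers)]
        handled[owner[session_id]].append((session_id, step))
    return owner, dict(handled)


# Steps for two orders arrive interleaved.
arrivals = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "paid"),
    ("order-1", "shipped"),
]

owner, handled = dispatch_by_session(arrivals, ["consumer-a", "consumer-b"])
order_1_steps = [step for sid, step in handled[owner["order-1"]] if sid == "order-1"]
order_2_steps = [step for sid, step in handled[owner["order-2"]] if sid == "order-2"]
```

Ordering is preserved per order while the two orders are still handled in parallel by different consumers.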
Dead-letter queue: handling failures without losing messages
Processing failures are inevitable. Service Bus's dead-letter queue (DLQ) is the safety net: when a message has been delivered the configured maximum number of times without being completed, or when a message expires and dead-lettering on expiration is enabled for the entity (it is off by default), Service Bus moves it to the dead-letter sub-queue rather than discarding it.
The DLQ is accessible at {queue}/$deadletterqueue or {topic}/Subscriptions/{subscription}/$deadletterqueue. Inspect it like any other queue or subscription.
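Each dead-lettered message carries a DeadLetterReason property explaining why it was dead-lettered, plus an optional DeadLetterErrorDescription. Broker-set reasons include MaxDeliveryCountExceeded, TTLExpiredException, and HeaderSizeExceeded; a consumer that explicitly dead-letters a message it cannot process can set a custom reason. A lost message lock is not itself a dead-letter reason: the message becomes available again, is redelivered with a higher delivery count, and is dead-lettered as MaxDeliveryCountExceeded only once the limit is reached.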
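The delivery-count rule can be illustrated with a small in-memory model. This Python sketch is an assumption-laden simplification, not SDK code: `run_consumer`, the dictionary message shape, and the `handler` are hypothetical, and a failed message is simply re-queued where the real broker would release its lock.

```python
from collections import deque

MAX_DELIVERY_COUNT = 10  # Service Bus default for maximum delivery count


def run_consumer(queue, handler, max_delivery_count=MAX_DELIVERY_COUNT):
    """Deliver until each message is handled or dead-lettered."""
    dead_letter_queue = []
    while queue:
        message = queue.popleft()
        message["delivery_count"] += 1
        try:
            handler(message)
        except Exception as error:
            if message["delivery_count"] >= max_delivery_count:
                message["dead_letter_reason"] = "MaxDeliveryCountExceeded"
                message["dead_letter_error"] = str(error)
                dead_letter_queue.append(message)
            else:
                queue.append(message)  # becomes available for redelivery
    return dead_letter_queue


def handler(message):
    if "order_id" not in message["body"]:
        raise ValueError("missing order_id")


queue = deque([
    {"body": {"order_id": 1}, "delivery_count": 0},
    {"body": {}, "delivery_count": 0},  # poison message the handler always rejects
])
dlq = run_consumer(queue, handler)
```

The healthy message is handled on its first delivery; the poison message is retried until the limit and then parked with its reason attached, so it no longer blocks the queue.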
What to do with dead-lettered messages:
For MaxDeliveryCountExceeded: a message that failed processing N times (default: 10) suggests a bug in the consumer or a message that the consumer cannot handle. Fix the consumer bug, then replay the dead-lettered messages: receive each one from the DLQ, send a copy to the main queue or topic, and complete the dead-lettered original.
For TTLExpiredException: a message that sat in the subscription longer than its time-to-live. Review whether the TTL is appropriate for the message type, and whether consumer throughput is keeping up with message production rate.
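A replay step that treats the two reasons differently might look like the following. This is a hedged Python sketch over plain lists: `replay` and the message dictionaries are hypothetical, and a real replay would receive from the dead-letter sub-queue and send through the SDK instead.

```python
def replay(dead_letter_queue, main_queue, reason="MaxDeliveryCountExceeded"):
    """Resubmit dead-lettered messages with the given reason; leave the rest."""
    remaining = []
    for message in dead_letter_queue:
        if message["dead_letter_reason"] == reason:
            # Resubmit a fresh copy: the body is kept, delivery state starts over.
            main_queue.append({"body": message["body"], "delivery_count": 0})
        else:
            remaining.append(message)
    return remaining


main_queue = []
dlq = [
    {"body": {"order_id": 7}, "dead_letter_reason": "MaxDeliveryCountExceeded"},
    {"body": {"order_id": 8}, "dead_letter_reason": "TTLExpiredException"},
]
dlq = replay(dlq, main_queue)
```

Expired messages stay in the DLQ for review, because replaying them blindly may act on stale events.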
Monitor DLQ depth as an operational metric. A growing DLQ is a signal of a consumer processing problem that warrants investigation. Set an alert when DLQ depth exceeds zero for your critical subscriptions.
Idempotent consumers
Service Bus delivers messages at least once. In failure scenarios (consumer crashes after processing but before completing the message), the same message may be delivered twice. Your consumer must handle this safely.
Design consumers to be idempotent: processing the same message twice produces the same result as processing it once. Common patterns:
- Check whether the operation has already been applied before applying it (check whether the inventory was already decremented for this order ID)
- Use database upserts rather than inserts (insert or update, not insert-or-fail)
- Record processed message IDs in a deduplication store and skip reprocessing if already seen
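The third pattern, a deduplication store keyed on message ID, can be sketched as follows. This is a minimal Python illustration under stated assumptions: `InventoryConsumer` and the message fields are hypothetical, and the in-memory set stands in for a durable store that would normally be updated in the same transaction as the stock change.

```python
class InventoryConsumer:
    def __init__(self, stock):
        self.stock = dict(stock)
        self.processed_ids = set()  # stand-in for a durable deduplication store

    def handle(self, message):
        """Apply the stock decrement once per message ID; return True if applied."""
        if message["message_id"] in self.processed_ids:
            return False  # redelivery: already applied, safe to complete again
        self.stock[message["sku"]] -= message["quantity"]
        self.processed_ids.add(message["message_id"])
        return True


consumer = InventoryConsumer({"widget": 10})
message = {"message_id": "order-1001", "sku": "widget", "quantity": 3}

first = consumer.handle(message)
second = consumer.handle(message)  # same message delivered again after a crash
```

Stock is decremented once even though the message was handled twice, which is the property at-least-once delivery requires of the consumer.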
Service Bus offers built-in duplicate detection, keyed on the message ID, in the Standard and Premium tiers (not Basic); it must be enabled when the queue or topic is created. Set a unique, deterministic message ID on each message; Service Bus discards any message whose ID matches one already accepted within the duplicate detection window (configurable up to 7 days). This covers duplicate sends, such as a producer retrying after a timeout or network error. It does not cover redelivery to a consumer, so consumers still need to be idempotent.
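The window semantics can be modelled in a few lines. The Python sketch below is an approximation for illustration: `DuplicateDetectingQueue` is hypothetical, time is passed in explicitly as seconds, and the 600-second window is an arbitrary example value.

```python
class DuplicateDetectingQueue:
    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.seen = {}       # message_id -> time the ID was first accepted
        self.messages = []

    def send(self, message_id, body, now):
        """Accept the message unless the same ID was accepted within the window."""
        first_seen = self.seen.get(message_id)
        if first_seen is not None and now - first_seen < self.window_seconds:
            return False     # dropped as a duplicate send
        self.seen[message_id] = now
        self.messages.append(body)
        return True


queue = DuplicateDetectingQueue(window_seconds=600)

accepted = queue.send("order-1001", {"order_id": 1001}, now=0)
retry = queue.send("order-1001", {"order_id": 1001}, now=30)       # producer retry
after_window = queue.send("order-1001", {"order_id": 1001}, now=601)
```

A retry inside the window is dropped, but the same ID sent after the window has passed is accepted again, so the window needs to be longer than the longest plausible producer retry.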
Monitoring Service Bus
Operational metrics for Service Bus to monitor continuously:
- Active message count: per queue/subscription. A growing count means consumers are falling behind.
- Dead-letter message count: any non-zero value warrants investigation.
- Server errors and throttling: Service Bus throttles at namespace-level throughput limits. Premium tier increases limits substantially.
- Message size distribution: messages approaching the 256 KB (Standard) or 100 MB (Premium) limit.
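Route Service Bus platform metrics to Log Analytics with a diagnostic setting, then query the AzureMetrics table. Metrics exported this way are aggregated per namespace without the entity dimension, so use Azure Monitor metric alerts split by EntityName when you need per-subscription counts:
AzureMetrics
| where ResourceProvider == "MICROSOFT.SERVICEBUS"
| where MetricName == "DeadletteredMessages"
| summarize MaxDeadLettered = max(Maximum) by Resource, bin(TimeGenerated, 5m)
| where MaxDeadLettered > 0
| order by TimeGenerated desc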
Where Critical Cloud comes in
Service Bus-backed event-driven architectures are reliable when they are built correctly and operated with the right telemetry. They fail silently when DLQ depth grows unmonitored, when consumers do not handle idempotency, or when session ordering assumptions break down under scale. We design and operate Azure messaging architectures for technology-led businesses. As the world's first Powered by Datadog accredited partner, we monitor queue depths, consumer lag, and dead-letter counts as live operational signals. See how Critical Support works.