AWS SQS Retention Period
AWS SQS Retention Period is one of the main parameters to configure when working with SQS and queues. Let's see what it means and how to configure it best.
AWS Simple Queue Service
AWS Simple Queue Service (SQS) is a distributed messaging service by AWS. It allows you to build resilient, scalable, and performant applications on the cloud. With SQS you are able to build proper asynchronous systems.
What is AWS SQS Retention Period
The retention period defines how long a message is kept in the queue before SQS deletes it automatically.
The retention period can be set to any value between 1 minute (60 seconds) and 14 days (1,209,600 seconds). Once a message has been in the queue for that long, SQS deletes it automatically.
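Deletion here really means deletion: an expired message is removed permanently and is not handed over to a Dead Letter Queue (DLQ). A message only reaches a DLQ when it has been received more often than the redrive policy allows while it is still within the retention period.
The retention period is counted from the moment a message is first added to the queue, not from the last time it was processed. This is really important to remember.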
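To make the timing rule concrete, here is a minimal sketch in plain Python with no AWS calls. The function names are made up for illustration. The only facts taken from SQS are the 1 minute and 14 day bounds and that the clock starts when the message is first sent.

```python
from datetime import datetime, timedelta, timezone

MIN_RETENTION = timedelta(minutes=1)  # lower bound accepted by SQS
MAX_RETENTION = timedelta(days=14)    # upper bound accepted by SQS


def validate_retention(retention: timedelta) -> timedelta:
    """Raise ValueError if the period is outside the range SQS accepts."""
    if not MIN_RETENTION <= retention <= MAX_RETENTION:
        raise ValueError(f"retention must be between 1 minute and 14 days, got {retention}")
    return retention


def expires_at(sent_at: datetime, retention: timedelta) -> datetime:
    """Expiry is counted from the first send, not from the last receive."""
    return sent_at + validate_retention(retention)


sent = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(expires_at(sent, timedelta(days=4)))  # 2024-01-05 12:00:00+00:00
```

Note that receiving the message in between does not appear anywhere in the calculation, which is exactly the point.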
Default Retention Period
The default retention period is 4 days (345,600 seconds). That means a message stays in your queue for up to 4 days before it gets deleted automatically.
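If you create or update the queue with boto3, the retention period is passed as the `MessageRetentionPeriod` attribute, in seconds and as a string. The sketch below only builds that attribute. The helper name and the queue name in the comment are hypothetical, and the actual AWS call is left as a comment because it needs boto3 and credentials.

```python
from datetime import timedelta

DEFAULT_RETENTION = timedelta(days=4)  # SQS default


def retention_attributes(retention: timedelta = DEFAULT_RETENTION) -> dict[str, str]:
    """Build the Attributes dict for create_queue or set_queue_attributes."""
    return {"MessageRetentionPeriod": str(int(retention.total_seconds()))}


print(retention_attributes())                    # {'MessageRetentionPeriod': '345600'}
print(retention_attributes(timedelta(days=14)))  # {'MessageRetentionPeriod': '1209600'}

# With boto3 installed and credentials configured, it could be used like this:
# import boto3
# sqs = boto3.client("sqs")
# sqs.create_queue(QueueName="my-queue", Attributes=retention_attributes(timedelta(days=7)))
```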
Message Lifecycle in SQS
To understand the retention period better let's have a look at the basic message lifecycle for SQS messages.
Here are some common scenarios for SQS messages.
First, a producer sends a message to the queue. The message is now in the state Message Available.
The retention period starts here. If the message has not been processed and deleted by the end of that period, SQS deletes it.
Second, a consumer picks up the message. The message is now in the state In-Flight. The visibility timeout starts now. No other consumers can pick up the message for the time of the visibility timeout.
In the success case, Lambda starts working on the message, returns successfully, and the message is removed from the queue.
In the failure case, the beginning of the flow is mostly the same:
Producer sends a message (Message available)
Retention period starts
Lambda picks up the message (Message In-Flight)
Visibility timeout starts
If the Lambda throws an error, what happens next depends on the configuration of the queue.
The message will either be:
Moved to a Dead Letter Queue
If a redrive policy is set, maxReceiveCount defines how many times a message can be received from the original queue. Once that number is exceeded, the message is automatically moved to the DLQ.
Retried until it expires
If no redrive policy is set, the message becomes visible again after the visibility timeout and is retried until the retention period expires. Then it is deleted (not recommended).
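The failure path can be sketched as a small decision function. This is a simplified model and not how SQS is implemented: the function name is hypothetical, it ignores the retention period (which can delete the message first), and it treats a message as moved to the DLQ as soon as all allowed receives have failed.

```python
from typing import Optional


def outcome_after_failures(failed_receives: int, max_receive_count: Optional[int]) -> str:
    """Where a message ends up after it was received and failed N times."""
    if max_receive_count is None:
        # No redrive policy: the message becomes visible again and is retried.
        return "available"
    if failed_receives >= max_receive_count:
        # All allowed receives are used up, so the message goes to the DLQ.
        return "dead-letter-queue"
    return "available"


print(outcome_after_failures(2, 3))      # available
print(outcome_after_failures(3, 3))      # dead-letter-queue
print(outcome_after_failures(10, None))  # available
```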
Retention Period Expires
Another common scenario is that the retention period is set too low. That means the worker (e.g. the Lambda function) needs longer to pick up and process the message than the configured retention period.
If that happens, the message is deleted when the retention period expires, regardless of the redrive policy. It is not retried again and it is not moved to a DLQ.
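One way to notice a retention period that is too low is to compare the age of the oldest message with the retention period. SQS publishes that age as the CloudWatch metric `ApproximateAgeOfOldestMessage`. The sketch below only does the comparison. The function name and the 80% threshold are assumptions chosen for illustration, and fetching the metric is left out.

```python
def is_close_to_expiry(
    oldest_age_seconds: float,
    retention_seconds: int,
    threshold: float = 0.8,
) -> bool:
    """True if the oldest message has used up at least `threshold` of the retention period."""
    if retention_seconds <= 0:
        raise ValueError("retention_seconds must be positive")
    return oldest_age_seconds / retention_seconds >= threshold


FOUR_DAYS = 4 * 24 * 60 * 60  # 345600 seconds, the SQS default

print(is_close_to_expiry(3600, FOUR_DAYS))                   # False
print(is_close_to_expiry(3.5 * 24 * 60 * 60, FOUR_DAYS))     # True
```

If a check like this fires regularly, either raise the retention period or add more consumers.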
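Move Messages to a DLQ before the retention period expires
SQS does not move expired messages to a Dead Letter Queue (DLQ) on its own. A message has to exceed the maximum receive count while it is still within the retention period. Once failed messages are in the DLQ, you can handle them there. For example, you can back them up to S3 or send a notification when a message arrives. On standard queues a message keeps its original enqueue timestamp when it moves to the DLQ, so give the DLQ a longer retention period than the source queue.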
You need to configure a redrive policy to move messages to a DLQ. The redrive policy has two main settings:
Maximum receive count (maxReceiveCount): How many times a message can be received from the original queue before it is moved
DLQ target ARN (deadLetterTargetArn): The ARN of the DLQ
If both are configured, the message will be moved to the DLQ once its receive count exceeds the maximum receive count.
The redrive policy can be configured in either the AWS Management Console or a framework such as CDK.
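Outside the console and CDK, the redrive policy is just a queue attribute named `RedrivePolicy` whose value is a JSON string. Here is a minimal sketch that builds it in Python. The helper name, the example ARN, and the `queue_url` variable in the comment are placeholders, not values from a real account.

```python
import json


def redrive_policy(dlq_arn: str, max_receive_count: int) -> dict[str, str]:
    """Build the RedrivePolicy attribute; SQS expects it as a JSON string."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    return {"RedrivePolicy": json.dumps(policy)}


attrs = redrive_policy("arn:aws:sqs:eu-central-1:123456789012:my-dlq", 3)
print(attrs["RedrivePolicy"])
# {"deadLetterTargetArn": "arn:aws:sqs:eu-central-1:123456789012:my-dlq", "maxReceiveCount": 3}

# With boto3 and credentials, the attribute could be applied to the source queue:
# sqs.set_queue_attributes(QueueUrl=queue_url, Attributes=attrs)
```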
What happens when the retention period expires?
The message is deleted from the queue. The redrive policy does not change this: it only moves messages that exceed the maximum receive count before they expire.
That's it 🎉
I hope that clarifies everything about the retention period. If you have any questions let me know!