Handling Data Loss in AWS Lambda

AWS Lambda, the fancy serverless environment, has become very handy of late. From reducing costs to scaling, serverless has grabbed everyone’s attention. But if it is not configured correctly, data loss can become a big problem in these environments.

But, why do we even care about data loss here?

Consider this scenario: you are using Lambda to send data to, say, Splunk. If the network between Lambda and Splunk has a hiccup, the data being sent to Splunk at that moment is lost.

Losing data? Oh no, there’s your nightmare 😜
Let’s just thank AWS for having a way to back up this data, which the rest of this article walks through.
We have a saviour, YES!! 😇

So, how is this done?

Lambda’s asynchronous invocation configuration controls what happens to events when a function is invoked asynchronously, that is, without the caller waiting for a response. There are three inputs you need to know to use this feature:
Maximum age of event – The maximum amount of time an unprocessed event is kept in Lambda’s internal queue (between 60 seconds and 6 hours).
Retry attempts – The number of times Lambda retries when the function returns an error. Must be 0-2.
Dead letter queue service – Events that are still unprocessed after that can be sent to an SNS topic or an SQS queue through this config.
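To make the three inputs concrete, here is a minimal sketch of setting them from Python with boto3. The helper names (build_async_settings, apply_async_settings), the function name and the queue ARN are all made up for illustration; the two client calls are the standard boto3 Lambda operations for the async invoke config and the dead letter config. Treat it as one possible way to do it, not the only one.

```python
def build_async_settings(max_event_age_seconds, retry_attempts, dlq_arn):
    """Validate the three inputs and return the keyword arguments for boto3.

    Hypothetical helper for this article; the limits below are the ones
    Lambda enforces for asynchronous invocation.
    """
    if not 60 <= max_event_age_seconds <= 21600:  # 60 seconds to 6 hours
        raise ValueError("maximum event age must be between 60 and 21600 seconds")
    if retry_attempts not in (0, 1, 2):
        raise ValueError("retry attempts must be 0, 1 or 2")
    # ARN layout: arn:partition:service:region:account:resource
    parts = dlq_arn.split(":")
    if len(parts) < 6 or parts[0] != "arn" or parts[2] not in ("sqs", "sns"):
        raise ValueError("dead letter target must be an SQS queue or SNS topic ARN")
    invoke_config = {
        "MaximumEventAgeInSeconds": max_event_age_seconds,
        "MaximumRetryAttempts": retry_attempts,
    }
    dead_letter_config = {"TargetArn": dlq_arn}
    return invoke_config, dead_letter_config


def apply_async_settings(lambda_client, function_name, invoke_config, dead_letter_config):
    """Push the settings to Lambda using an existing boto3 Lambda client."""
    # Maximum event age and retry attempts live in the event invoke config.
    lambda_client.put_function_event_invoke_config(
        FunctionName=function_name, **invoke_config
    )
    # The dead letter queue is part of the function configuration itself.
    lambda_client.update_function_configuration(
        FunctionName=function_name, DeadLetterConfig=dead_letter_config
    )


# Example usage (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   invoke_config, dlq = build_async_settings(
#       300, 1, "arn:aws:sqs:us-east-1:123456789012:lambda-dlq")
#   apply_async_settings(boto3.client("lambda"), "splunk-forwarder", invoke_config, dlq)
```

One thing to remember: the function’s execution role needs permission to write to the target (sqs:SendMessage for an SQS queue), otherwise Lambda cannot deliver the failed events.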

Demonstration with an example

Let’s consider the following inputs –
Maximum age of event – 5 mins
Retry attempts – 1
Dead letter queue service – SQS
This means that whenever the function fails with an error, Lambda retries the event once. If the retry also fails, or the event has been waiting in Lambda’s queue for more than 5 minutes, the event is sent to SQS.
All the data you thought you would lose when the connection between Lambda and Splunk was interrupted is now in SQS.
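For this to work, the function has to actually fail when Splunk is unreachable; if the handler catches the error and returns normally, Lambda sees a success and nothing is retried or sent to SQS. Below is a rough sketch of such a handler. It assumes the data goes to Splunk over the HTTP Event Collector, which the setup above does not specify, and the environment variable names SPLUNK_HEC_URL and SPLUNK_HEC_TOKEN are made up for the example.

```python
import json
import os
import urllib.request


def send_to_splunk(event, url, token, timeout=5):
    """POST one event to a Splunk HTTP Event Collector URL (assumed transport)."""
    body = json.dumps({"event": event}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
    )
    # urlopen raises URLError on network failures and HTTPError on 4xx/5xx.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def make_handler(send):
    """Build a Lambda handler around any send(event) callable."""

    def handler(event, context):
        # No try/except here on purpose: an exception marks the invocation
        # as failed, which is what triggers the retry and, after that,
        # the dead letter queue.
        send(event)
        return {"status": "delivered"}

    return handler


# Entry point for Lambda. The environment variables are read only when an
# event arrives, so importing this module does not require them to be set.
handler = make_handler(
    lambda event: send_to_splunk(
        event, os.environ["SPLUNK_HEC_URL"], os.environ["SPLUNK_HEC_TOKEN"]
    )
)
```

Passing the sender in as a function is only there to make the failure path easy to try out locally without a real Splunk endpoint.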

Bonus

Now that you have all the lost data in SQS, create a Lambda that runs every 5 or 10 minutes (it’s your choice) and moves the data to S3. Then add one more Lambda, run manually, that transfers the data from S3 to Splunk once the connection from Lambda to Splunk is re-established.
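Here is a minimal sketch of what those two Lambdas could look like in Python. The function names, the failed-events/ prefix and the one-object-per-message layout are my own choices for the example, and the sqs and s3 arguments are expected to be boto3 clients. It also assumes each SQS message body is the original event as JSON.

```python
import json


def drain_queue_to_s3(sqs, s3, queue_url, bucket, prefix="failed-events/", max_batches=50):
    """Scheduled Lambda body: copy messages from the SQS queue into S3."""
    moved = 0
    for _ in range(max_batches):
        response = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=1
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # queue is empty for now
        for message in messages:
            key = prefix + message["MessageId"] + ".json"
            s3.put_object(Bucket=bucket, Key=key, Body=message["Body"].encode("utf-8"))
            # Delete from SQS only after the S3 write succeeded, so a crash
            # in between leaves the message in the queue instead of losing it.
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
            moved += 1
    return moved


def replay_from_s3(s3, bucket, send, prefix="failed-events/"):
    """Manually run Lambda body: push stored events to Splunk via send(event)."""
    replayed = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for item in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=item["Key"])["Body"].read()
            send(json.loads(body))
            # Remove the object only after the event was delivered.
            s3.delete_object(Bucket=bucket, Key=item["Key"])
            replayed += 1
    return replayed
```

If send raises partway through a replay, the remaining objects stay in S3 and the run can simply be repeated later. An event that was delivered but not yet deleted may be sent twice in that case, so this sketch gives at-least-once delivery.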

Figure 1. The Whole Flow

There are numerous ways to do this. For example, you could use Kinesis for the flow above, and then you wouldn’t need all these custom Lambdas. But think about the cost there. The process you adopt depends on the scenario you are trying to solve.

Published by Ritesh Kumar Reddy

I (Ritesh) work as a Sr. Cloud Engineer for a living. Learning new technologies has always been my hobby. Why not share it? Here is the brainchild: blogging to share the knowledge. This blog is for those who wish to start in, or are already in, the Cloud field. Each article briefly talks about a tool or technology that is used in the Cloud model. Once you read the article, I hope you get a kick start with that specific tool or technology.
