Description
Key information
- RFC PR: feat: Idempotency helper utility powertools-lambda-python#245
- Related issue(s), if known:
- Area: Utilities
- Meet tenets: Yes
Summary
Helper to facilitate writing Idempotent Lambda functions.
The developer would specify, via a JMESPath expression, which value from the event is used as a unique execution identifier. The helper would then search a persistence layer (e.g. DynamoDB) for that ID. If it is present, the helper gets the stored return value and skips the function execution; otherwise it runs the function normally and persists the return value together with the execution ID.
Motivation
Idempotency is a very useful design characteristic of any system. It enables the seamless separation of successful and failed executions, and is particularly useful in Lambda functions invoked by AWS Step Functions. It is also a design principle in the Serverless Lens of the AWS Well-Architected Framework.
A broader description of this idea can be found here
Proposal
Define a Python decorator, @idempotent, which would receive as arguments a) the JMESPath expression for the event key to use as execution ID, and b) an optional storage backend configuration, e.g. a DynamoDB table name, or an ElasticSearch URL + index.
This decorator would wrap the function execution in the following way (pseudo-python):
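import functools
import jmespath
# Proposed module and class; neither exists yet
from aws_lambda_powertools.idempotent import PersistenceLayer

def idempotent(event_key, persistence_config=None):
    # Decorator factory, so that @idempotent(...) can take arguments
    def decorator(func):
        @functools.wraps(func)
        def wrapper(event, context):
            persistence = PersistenceLayer(persistence_config)
            # Extract the unique execution ID from the Lambda event
            key = jmespath.search(event_key, event)
            persistence.find(key)
            if persistence.executed_successfully():
                # Skip the handler and return the stored result
                return persistence.result()
            try:
                result = func(event, context)
                persistence.save_success(key, result)
                return result
            except Exception as e:
                persistence.save_error(key, e)
                raise  # re-raise so the failure is still reported
        return wrapper
    return decorator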
Usage then would be similar to:
from aws_lambda_powertools.idempotent import idempotent

@idempotent(event_key='Event.UniqueId', persistence_config='dynamodb://lambda-idemp-table')
def handler(event, context):
    # Normal function code here
    return {'result': 'OK', 'message': 'working'}
The decorator would first extract the unique execution ID from the Lambda event using the JMESPath expression provided, then check the persistence layer for a previous successful execution of the function. If one is found, it would get the previously returned value, de-serialize it (using base64 or something else) and return it instead. Otherwise, it would execute the function handler normally, capture the returned object, serialize and persist it, and finally return it.
The persistence layer could be implemented initially with DynamoDB, and either require the DynamoDB table to exist before running the function, or create it during the first execution. It should be designed in a way that allows different backends in the future (e.g. Redis for VPC-enabled Lambda functions).
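One way to keep the backend swappable is a small abstract interface that each storage implementation fulfils. The sketch below is only an illustration under assumptions: the names `PersistenceBackend`, `InMemoryBackend` and `backend_from_config`, and the `memory://` scheme, are hypothetical; only the URL-style configuration string mirrors the usage example above. A DynamoDB or Redis backend would be another subclass registered under its own scheme.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class PersistenceBackend(ABC):
    """Hypothetical interface every storage backend would implement."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]:
        """Return the stored result for key, or None if not present."""

    @abstractmethod
    def put(self, key: str, result: Any) -> None:
        """Store the result of a successful execution under key."""


class InMemoryBackend(PersistenceBackend):
    """Dict-backed backend for local tests; not shared across invocations."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._items.get(key)

    def put(self, key: str, result: Any) -> None:
        self._items[key] = result


def backend_from_config(config: str) -> PersistenceBackend:
    """Pick a backend from a URL-style string such as 'memory://local'."""
    scheme, _, _location = config.partition("://")
    # Further backends (e.g. 'dynamodb', 'redis') would be registered here.
    backends = {"memory": InMemoryBackend}
    if scheme not in backends:
        raise ValueError(f"unsupported persistence backend: {scheme!r}")
    return backends[scheme]()


backend = backend_from_config("memory://local")
backend.put("abc-123", {"result": "OK"})
```

The decorator would then depend only on `get` and `put`, so adding a backend would not require touching the wrapping logic.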
Drawbacks
This solution could have a noticeable performance impact on the execution of Lambda functions. Every execution would require at least one and at most two accesses to the persistence layer: one lookup, plus one write when the function actually runs.
No additional dependencies are required: DynamoDB access is provided by boto3, and object serialisation can use Python's built-in base64 module (b64encode/b64decode).
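Note that base64 only encodes bytes, so the return value first has to be turned into bytes by some other means. A minimal standard-library sketch, assuming return values are JSON-serializable; the function names `serialize` and `deserialize` are hypothetical:

```python
import base64
import json


def serialize(result):
    """Encode a JSON-serializable return value as an ASCII string."""
    raw = json.dumps(result).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def deserialize(stored):
    """Reverse serialize(): base64-decode, then parse the JSON."""
    return json.loads(base64.b64decode(stored))


stored = serialize({"result": "OK", "message": "working"})
restored = deserialize(stored)
```

Return values that are not JSON-serializable (e.g. arbitrary objects) would need a different first step, which this sketch does not cover.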
Rationale and alternatives
- What other designs have been considered? Why not them?
No other designs considered at the moment. Open to suggestions.
- What is the impact of not doing this?
Implementation of idempotent Lambda functions will have to be done 'manually' in every function.
Unresolved questions
- How to make the persistence layer access as fast as possible?
- Which other persistence layers to consider (DynamoDB, ElasticSearch, Redis, MySQL)?