RFC: Function Idempotency Helper #28

Closed
@igorlg

Key information

Summary

A helper to facilitate writing idempotent Lambda functions.
The developer would specify (via JMESPath) which value from the event is used as a unique execution identifier. The helper would then search a persistence layer (e.g. DynamoDB) for that ID. If the ID is present, it returns the stored return value and skips the function execution; otherwise it runs the function normally and persists the return value together with the execution ID.

Motivation

Idempotency is a very useful design characteristic of any system. It enables the seamless separation of successful and failed executions, and is particularly useful in Lambda functions invoked by AWS Step Functions. It is also a design principle in the AWS Well-Architected Framework Serverless Lens.

A broader description of this idea can be found here.

Proposal

Define a Python decorator, @idempotent, which would receive as arguments (a) the JMESPath expression for the event key to use as the execution ID, and (b) an optional storage backend configuration (e.g. a DynamoDB table name, or an ElasticSearch URL + index).

This decorator would wrap the function execution in the following way (pseudo-python):

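import functools

import jmespath

# Proposed module; PersistenceLayer does not exist yet
from aws_lambda_powertools.idempotent import PersistenceLayer

def idempotent(event_key, persistence_config=None):
  # Decorator factory, so that @idempotent(...) can take arguments
  def decorator(func):
    @functools.wraps(func)
    def wrapper(event, context):
      persistence = PersistenceLayer(persistence_config)
      # Extract the unique execution ID from the Lambda event
      key = jmespath.search(event_key, event)
      persistence.find(key)

      # A previous successful execution exists: return its result
      if persistence.executed_successfully():
        return persistence.result()

      try:
        result = func(event, context)
        persistence.save_success(key, result)
        return result
      except Exception as e:
        persistence.save_error(key, e)
        raise  # re-raise so the failure is still reported to the caller

    return wrapper
  return decorator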

Usage then would be similar to:

from aws_lambda_powertools.idempotent import idempotent

@idempotent(event_key='Event.UniqueId', persistence_config='dynamodb://lambda-idemp-table')
def handler(event, context):
  # Normal function code here
  return {'result': 'OK', 'message': 'working'}

The decorator would first extract the unique execution ID from the Lambda event using the JMESPath provided, then check the persistence layer for a previous successful execution of the function. If one is found, it gets the previously returned value, de-serializes it (using base64 or something else) and returns it instead. Otherwise, it executes the function handler normally, captures the returned object, serializes and persists it, and finally returns it.
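The flow described above can be sketched with the standard library only. This is a minimal illustration under stated assumptions, not the proposed implementation: the module-level `_STORE` dict stands in for the persistence layer, `extract_key` is a hypothetical helper that handles only dotted paths (a real JMESPath search supports much more), and JSON is assumed as the serialization format.

```python
import functools
import json

# Hypothetical in-memory stand-in for the persistence layer (the RFC proposes DynamoDB).
_STORE = {}


def extract_key(path, event):
    """Resolve a dotted path such as 'Event.UniqueId'; a stand-in for a JMESPath search."""
    value = event
    for part in path.split("."):
        value = value[part]
    return value


def idempotent(event_key):
    """Decorator factory: skip the handler if its execution ID was already seen."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(event, context):
            key = extract_key(event_key, event)
            if key in _STORE:
                # Previous successful execution found: de-serialize and return it.
                return json.loads(_STORE[key])
            result = func(event, context)
            # Serialize and persist the result before returning it.
            _STORE[key] = json.dumps(result)
            return result
        return wrapper
    return decorator


calls = []  # records real executions of the handler body


@idempotent(event_key="Event.UniqueId")
def handler(event, context):
    calls.append(event)
    return {"result": "OK", "message": "working"}


event = {"Event": {"UniqueId": "abc-123"}}
first = handler(event, None)
second = handler(event, None)  # served from the store; the handler body is not re-run
```

Both calls return the same dictionary, while `calls` shows that the handler body ran only for the first one.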

The persistence layer could be implemented initially with DynamoDB, and either require the DDB table to exist before running the function, or create it during the first execution. It should be designed in a way that allows different backends to be added in the future (e.g. Redis for VPC-enabled Lambda functions).
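One way to keep the backend pluggable is a small abstract interface plus a registry keyed by the URL scheme of the configuration string (as in 'dynamodb://lambda-idemp-table'). The sketch below is an assumption about how that could look: `PersistenceBackend`, `InMemoryBackend`, `BACKENDS` and `backend_from_config` are hypothetical names, and only an in-memory backend is registered so the example runs without AWS access.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class PersistenceBackend(ABC):
    """Hypothetical interface each storage backend would implement."""

    @abstractmethod
    def get(self, key):
        """Return the stored record for key, or None if absent."""

    @abstractmethod
    def put(self, key, record):
        """Store record under key."""


class InMemoryBackend(PersistenceBackend):
    """Dict-backed backend, useful for local tests only."""

    def __init__(self, name):
        self.name = name
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def put(self, key, record):
        self._items[key] = record


# Registry mapping a URL scheme to a backend class. A DynamoDB or Redis
# backend would be registered here under "dynamodb" or "redis".
BACKENDS = {"memory": InMemoryBackend}


def backend_from_config(config):
    """Pick a backend from a config string such as 'memory://my-table'."""
    parsed = urlparse(config)
    try:
        backend_cls = BACKENDS[parsed.scheme]
    except KeyError:
        raise ValueError(f"unsupported persistence backend: {parsed.scheme!r}") from None
    return backend_cls(parsed.netloc)


backend = backend_from_config("memory://lambda-idemp-table")
backend.put("abc-123", {"result": "OK"})
```

With this shape, adding a backend means implementing `get` and `put` and adding one registry entry; the decorator itself would not need to change.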

Drawbacks

This solution could have a noticeable performance impact on the execution of Lambda functions. Every execution would require at least one and at most two accesses to the persistence layer: a lookup on every invocation, plus a write whenever the function actually runs.
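The access count can be checked with a toy store that counts its own reads and writes. `CountingStore` and `run_once` are hypothetical names used only for this illustration; real latency would depend on the chosen backend.

```python
class CountingStore:
    """Hypothetical store that counts reads and writes."""

    def __init__(self):
        self.items = {}
        self.reads = 0
        self.writes = 0

    def get(self, key):
        self.reads += 1
        return self.items.get(key)

    def put(self, key, value):
        self.writes += 1
        self.items[key] = value


def run_once(store, key, func):
    """Return the stored result for key, executing func only if none exists."""
    cached = store.get(key)  # access 1: happens on every invocation
    if cached is not None:
        return cached
    result = func()
    store.put(key, result)  # access 2: only when the function actually runs
    return result


store = CountingStore()

run_once(store, "abc-123", lambda: "OK")
first_accesses = store.reads + store.writes  # lookup + write

run_once(store, "abc-123", lambda: "OK")
repeat_accesses = store.reads + store.writes - first_accesses  # lookup only
```

A first execution costs one read and one write, while a repeated execution costs a single read.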

No additional dependencies are required: DynamoDB access is provided by boto3, and object serialisation can use Python's native base64 module (b64encode/b64decode).
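As a sketch of standard-library-only serialisation: base64 operates on bytes, so this example assumes the return value is JSON-compatible and pairs `json` with `base64`. The `serialize` and `deserialize` helpers are hypothetical names, not part of any existing API.

```python
import base64
import json


def serialize(result):
    """Encode a JSON-compatible return value as a base64 ASCII string."""
    raw = json.dumps(result).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def deserialize(stored):
    """Reverse serialize(): base64-decode, then parse the JSON payload."""
    return json.loads(base64.b64decode(stored))


original = {"result": "OK", "message": "working"}
stored = serialize(original)    # plain string, safe to store as a text attribute
restored = deserialize(stored)  # equal to the original dictionary
```

Return values that are not JSON-compatible (for example, custom classes) would need a different serializer.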

Rationale and alternatives

  • What other designs have been considered? Why not them?
    No other designs considered at the moment. Open to suggestions.

  • What is the impact of not doing this?
    Implementation of idempotent Lambda functions will have to be done 'manually' in every function.

Unresolved questions

  • How to make the persistence layer access as fast as possible?
  • Which other persistence layers to consider (DynamoDB, ElasticSearch, Redis, MySQL)?
