Description
Use case
As a user of the Batch Processing feature, I often want to process the events my Lambda function receives as a batch (for example, by mapping over the records) rather than a single record at a time.
Currently, the Lambda handler returns the result of calling processPartialResponse().
The first parameter of processPartialResponse() is the event object containing the entire batch of records passed to the Lambda function. The second parameter, recordHandler, is a function used to process each record in the batch.
Because recordHandler() is only ever passed a single record at a time, the user cannot process the entire batch together and forward requests concurrently.
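To make the limitation concrete, here is a minimal, self-contained sketch of per-record dispatch. `processOneByOne`, `StreamRecord`, and `WriteCounter` are local stand-ins invented for illustration; they are not the library's implementation or types, and they only mimic the "one record per handler call" contract described above.

```typescript
// Illustrative local shape standing in for a DynamoDB stream record.
type StreamRecord = { eventID: string; newImage: Record<string, unknown> };

// Hypothetical stand-in for a table client; it only counts write requests.
class WriteCounter {
  public requests = 0;

  putItem(_item: Record<string, unknown>): void {
    this.requests += 1;
  }
}

// Hypothetical stand-in for per-record dispatch: the handler sees one record per call.
function processOneByOne(
  records: StreamRecord[],
  recordHandler: (record: StreamRecord) => void
): void {
  for (const record of records) {
    recordHandler(record);
  }
}

const table = new WriteCounter();
const records: StreamRecord[] = Array.from({ length: 100 }, (_, i) => ({
  eventID: String(i),
  newImage: { id: i },
}));

// Each invocation can only act on its own record, so it issues its own put request.
processOneByOne(records, (record) => table.putItem(record.newImage));

console.log(table.requests); // 100
```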
For example, let's say I have a Lambda function that receives batches of DynamoDB stream records. For each record, I want the function to do some processing and then make a put request to write an item to a table:
- With the current behavior: if the Lambda function receives 100 DynamoDB stream records, recordHandler() has to process each record and make a separate put request for each one (100 put item requests). Because recordHandler() only processes a single record at a time, I can't batch those 100 put item requests into 4 BatchWriteItem requests.
- With the desired behavior: the user should be able to pass in either a recordHandler() that processes one record at a time or a recordHandler() that processes all records at once, so they have the flexibility to process the records concurrently and make batch requests to downstream services.
For an example of the desired behavior, see pic below:
The recordHandler() receives an array of all the records at once (DynamoDBRecord[]). It maps over them to produce an item for each record, turns each item into a put request, groups the put requests into arrays of 25, and writes each group of 25 to DynamoDB using the BatchWriteItem API.
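The steps described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the original picture: `StreamRecord` and `PutRequest` are simplified local types, `toItem` is a hypothetical transformation, and `writeBatch` is an injected function standing in for a real BatchWriteItem call.

```typescript
// Illustrative local types; a real handler would work with DynamoDBRecord[].
type StreamRecord = { eventID: string; newImage: Record<string, unknown> };
type PutRequest = { PutRequest: { Item: Record<string, unknown> } };

// BatchWriteItem accepts at most 25 put or delete requests per call.
const MAX_BATCH_SIZE = 25;

// Hypothetical per-record transformation: derive the item to write from the record.
function toItem(record: StreamRecord): Record<string, unknown> {
  return { ...record.newImage, sourceEventId: record.eventID };
}

// Split an array into consecutive chunks of at most `size` elements.
function chunk<T>(values: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < values.length; i += size) {
    chunks.push(values.slice(i, i + size));
  }
  return chunks;
}

// Map every record to a put request, then group the requests into batches of 25.
function buildBatches(records: StreamRecord[]): PutRequest[][] {
  const putRequests = records.map((record) => ({
    PutRequest: { Item: toItem(record) },
  }));
  return chunk(putRequests, MAX_BATCH_SIZE);
}

// Batch-aware handler: `writeBatch` is an assumption standing in for BatchWriteItem.
async function recordHandler(
  records: StreamRecord[],
  writeBatch: (batch: PutRequest[]) => Promise<void>
): Promise<number> {
  const batches = buildBatches(records);
  await Promise.all(batches.map((batch) => writeBatch(batch)));
  return batches.length; // number of batch write calls issued
}
```

With 100 records, `buildBatches` yields 4 batches of 25. A real BatchWriteItem response can also contain unprocessed items that need to be retried, which this sketch does not handle.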
Solution/User Experience
Extend/modify the processPartialResponse() function to accept:
- A recordHandler() function that receives and processes one record at a time (current behavior)
- OR a recordHandler() function that receives all the records passed in the Lambda event, so the records can be processed concurrently and batched into write requests. For example, a Lambda function that receives an event containing 100 DynamoDB stream records should be able to turn those records into 4 BatchWriteItem requests of 25 put requests each.
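One possible shape for this, sketched with invented names: `dispatch`, the `mode` field, and the handler types below are hypothetical and are not part of the current library. The sketch only shows how a single entry point could accept either kind of handler.

```typescript
// Illustrative local shape standing in for a stream record.
type StreamRecord = { eventID: string };

// Hypothetical handler shapes, distinguished by a `mode` discriminant.
type PerRecordHandler = {
  mode: 'record';
  handle: (record: StreamRecord) => Promise<void>;
};
type WholeBatchHandler = {
  mode: 'batch';
  handle: (records: StreamRecord[]) => Promise<void>;
};
type Handler = PerRecordHandler | WholeBatchHandler;

// Hypothetical dispatcher: returns how many times the handler was invoked.
async function dispatch(records: StreamRecord[], handler: Handler): Promise<number> {
  if (handler.mode === 'batch') {
    // Proposed behavior: hand over every record in a single call.
    await handler.handle(records);
    return 1;
  }
  // Current behavior: one call per record.
  for (const record of records) {
    await handler.handle(record);
  }
  return records.length;
}
```

An explicit discriminant keeps the two handler signatures distinguishable at the type level. A real design would also need a way for a whole-batch handler to report which individual records failed, which this sketch leaves open.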
Alternative solutions
No response
Acknowledgment
- This feature request meets Powertools for AWS Lambda (TypeScript) Tenets
- Should this be considered in other Powertools for AWS Lambda languages? i.e. Python, Java, and .NET
Future readers
Please react with 👍 and your use case to help us understand customer demand.