docs(idempotency): cleanup serialization, fields subset, move batch to new common use cases section

heitorlessa · heitorlessa · commit 427d36537969 · 2024-05-27T13:42:02.000+12:00
Signed-off-by: heitorlessa <lessa@amazon.co.uk>
diff --git a/docs/utilities/idempotency.md b/docs/utilities/idempotency.md
@@ -191,11 +191,11 @@ By default, `idempotent_function` serializes, stores, and returns your annotated
 
 The output serializer supports any JSON serializable data, **Python Dataclasses** and **Pydantic Models**.
 
-!!! info "When using the `output_serializer` parameter, the data will continue to be stored in DynamoDB as a JSON object."
+!!! info "When using the `output_serializer` parameter, the data will continue to be stored in DynamoDB as a JSON string."
 
 === "Pydantic"
 
-    You can use `PydanticSerializer` to automatically serialize what's retrieved from the persistent storage based on the return type annotated.
+    Use `PydanticSerializer` to automatically serialize what's retrieved from the persistent storage based on the return type annotated.
 
     === "Inferring via the return type"
 
@@ -215,7 +215,7 @@ The output serializer supports any JSON serializable data, **Python Dataclasses*
 
 === "Dataclasses"
 
-     You can use `DataclassSerializer` to automatically serialize what's retrieved from the persistent storage based on the return type annotated.
+     Use `DataclassSerializer` to automatically serialize what's retrieved from the persistent storage based on the return type annotated.
 
     === "Inferring via the return type"
 
@@ -235,7 +235,7 @@ The output serializer supports any JSON serializable data, **Python Dataclasses*
 
 === "Any type"
 
-    You can use `CustomDictSerializer` to have full control over the serialization process for any type. It expects two functions:
+    Use `CustomDictSerializer` to have full control over the serialization process for any type. It expects two functions:
 
     * **to_dict**. Function to convert any type to a JSON serializable dictionary before it saves into the persistent storage.
     * **from_dict**. Function to convert from a dictionary retrieved from persistent storage and serialize in its original form.
@@ -248,42 +248,20 @@ The output serializer supports any JSON serializable data, **Python Dataclasses*
     2. This function does the following <br><br>**1**. Receives the dictionary saved into the persistent storage <br>**1** Serializes to `OrderOutput` before `@idempotent` returns back to the caller.
     3. This serializer receives both functions so it knows who to call when to serialize to and from dictionary.
 
-#### Batch integration
-
-You can can easily integrate with [Batch utility](batch.md){target="_blank"} via context manager. This ensures that you process each record in an idempotent manner, and guard against a [Lambda timeout](#lambda-timeouts) idempotent situation.
-
-???+ "Choosing an unique batch record attribute"
-    In this example, we choose `messageId` as our idempotency key since we know it'll be unique.
-
-    Depending on your use case, it might be more accurate [to choose another field](#choosing-a-payload-subset-for-idempotency) your producer intentionally set to define uniqueness.
-
-=== "Integration with Batch Processor"
-
-    ```python hl_lines="2 12 16 20 31 35 37"
-    --8<-- "examples/idempotency/src/integrate_idempotency_with_batch_processor.py"
-    ```
-
-=== "Sample event"
-
-    ```json hl_lines="4"
-    --8<-- "examples/idempotency/src/integrate_idempotency_with_batch_processor_payload.json"
-    ```
-
 ### Choosing a payload subset for idempotency
 
 ???+ tip "Tip: Dealing with always changing payloads"
     When dealing with a more elaborate payload, where parts of the payload always change, you should use **`event_key_jmespath`** parameter.
 
-Use [`IdempotencyConfig`](#customizing-the-default-behavior) to instruct the idempotent decorator to only use a portion of your payload to verify whether a request is idempotent, and therefore it should not be retried.
+Use [`IdempotencyConfig`](#customizing-the-default-behavior)'s **`event_key_jmespath`** parameter to select one or more payload parts as your idempotency key.
 
 > **Payment scenario**
 
 In this example, we have a Lambda handler that creates a payment for a user subscribing to a product. We want to ensure that we don't accidentally charge our customer by subscribing them more than once.
 
-Imagine the function executes successfully, but the client never receives the response due to a connection issue. It is safe to retry in this instance, as the idempotent decorator will return a previously saved response.
+Imagine the function runs successfully, but the client never receives the response due to a connection issue. It is safe to immediately retry in this instance, as the idempotent decorator will return a previously saved response.
 
-**What we want here** is to instruct Idempotency to use `user_id` and `product_id` fields from our incoming payload as our idempotency key.
-If we were to treat the entire request as our idempotency key, a simple HTTP header change would cause our customer to be charged twice.
+**We want** to use `user_id` and `product_id` fields as our idempotency key. If we were to treat the entire request as our idempotency key, a simple HTTP header change would cause our function to run again.
 
 ???+ tip "Deserializing JSON strings in payloads for increased accuracy."
     The payload extracted by the `event_key_jmespath` is treated as a string by default.
@@ -472,6 +450,29 @@ You can customize attribute names when instantiating `RedisCachePersistenceLayer
     --8<-- "examples/idempotency/src/customize_persistence_layer_redis.py"
     ```
 
+### Common use cases
+
+#### Batch integration
+
+You can easily integrate with [Batch](batch.md){target="_blank"} using the [idempotent_function decorator](#idempotent_function-decorator) to handle idempotency per message/record in a given batch.
+
+???+ "Choosing a unique batch record attribute"
+    In this example, we choose `messageId` as our idempotency key since we know it'll be unique.
+
+    Depending on your use case, it might be more accurate [to choose another field](#choosing-a-payload-subset-for-idempotency) your producer intentionally set to define uniqueness.
+
+=== "Integration with Batch Processor"
+
+    ```python title="integrate_idempotency_with_batch_processor.py" hl_lines="3 16 19 25 27"
+    --8<-- "examples/idempotency/src/integrate_idempotency_with_batch_processor.py"
+    ```
+
+=== "Sample event"
+
+    ```json title="integrate_idempotency_with_batch_processor_payload.json" hl_lines="4"
+    --8<-- "examples/idempotency/src/integrate_idempotency_with_batch_processor_payload.json"
+    ```
+
 ### Idempotency request flow
 
 The following sequence diagrams explain how the Idempotency feature behaves under different scenarios.
diff --git a/examples/idempotency/src/integrate_idempotency_with_batch_processor.py b/examples/idempotency/src/integrate_idempotency_with_batch_processor.py
@@ -1,5 +1,6 @@
-from aws_lambda_powertools import Logger
-from aws_lambda_powertools.utilities.batch import BatchProcessor, EventType
+import os
+
+from aws_lambda_powertools.utilities.batch import BatchProcessor, EventType, process_partial_response
 from aws_lambda_powertools.utilities.data_classes.sqs_event import SQSRecord
 from aws_lambda_powertools.utilities.idempotency import (
     DynamoDBPersistenceLayer,
@@ -8,13 +9,11 @@
 )
 from aws_lambda_powertools.utilities.typing import LambdaContext
 
-logger = Logger()
 processor = BatchProcessor(event_type=EventType.SQS)
 
-dynamodb = DynamoDBPersistenceLayer(table_name="IdempotencyTable")
-config = IdempotencyConfig(
-    event_key_jmespath="messageId",  # see Choosing a payload subset section
-)
+table = os.getenv("IDEMPOTENCY_TABLE")
+dynamodb = DynamoDBPersistenceLayer(table_name=table)
+config = IdempotencyConfig(event_key_jmespath="messageId")
 
 
 @idempotent_function(data_keyword_argument="record", config=config, persistence_store=dynamodb)
@@ -25,13 +24,9 @@ def record_handler(record: SQSRecord):
 def lambda_handler(event: SQSRecord, context: LambdaContext):
     config.register_lambda_context(context)  # see Lambda timeouts section
 
-    # with Lambda context registered for Idempotency
-    # we can now kick in the Bach processing logic
-    batch = event["Records"]
-    with processor(records=batch, handler=record_handler):
-        # in case you want to access each record processed by your record_handler
-        # otherwise ignore the result variable assignment
-        processed_messages = processor.process()
-        logger.info(processed_messages)
-
-    return processor.response()
+    return process_partial_response(
+        event=event,
+        context=context,
+        processor=processor,
+        record_handler=record_handler,
+    )
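
Reviewer note: the reworded "Choosing a payload subset for idempotency" paragraph says that keying on the entire request would make a simple HTTP header change run the function again. The sketch below illustrates that reasoning in plain Python. It is not how Powertools computes its keys: `idempotency_key` is a hypothetical helper, the event shape is an assumption made for illustration, and the only point shown is that hashing a chosen subset of fields stays stable when unrelated parts of the payload change.

```python
import hashlib
import json


def idempotency_key(event: dict, fields: list[str]) -> str:
    """Hypothetical helper: hash only the selected top-level fields of an event."""
    subset = {name: event[name] for name in fields}
    # sort_keys keeps the serialization stable regardless of dict ordering
    serialized = json.dumps(subset, sort_keys=True)
    return hashlib.sha256(serialized.encode()).hexdigest()


# Assumed event shape: two business fields plus headers that change on every retry
first_attempt = {"user_id": "u-1", "product_id": "p-9", "headers": {"X-Request-Id": "a"}}
client_retry = {"user_id": "u-1", "product_id": "p-9", "headers": {"X-Request-Id": "b"}}

key_fields = ["user_id", "product_id"]

# Keying on the subset: both attempts map to the same key
subset_keys_match = idempotency_key(first_attempt, key_fields) == idempotency_key(client_retry, key_fields)

# Keying on the whole payload: the header change produces a different key
all_fields = list(first_attempt)
whole_keys_match = idempotency_key(first_attempt, all_fields) == idempotency_key(client_retry, all_fields)
```

With the subset, the retry resolves to the same key as the first attempt, so a previously saved response could be returned; with the whole payload, the retry looks like a new request.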