
Bug: etag is a required field in S3EventNotificationObjectModel but is not present in DeleteObject events #4156

Closed
@benjamingorman

Description


Expected Behaviour

etag should be an Optional field in this pydantic schema

Current Behaviour

Good afternoon chaps.
I'm noticing what I think is an issue with one of the pydantic models in Powertools.

class S3EventNotificationObjectModel(BaseModel):
    key: str
    size: Optional[NonNegativeFloat] = None
    etag: str
    version_id: str = Field(None, alias="version-id")
    sequencer: Optional[str] = None

etag here is required, but it seems that for S3 DeleteObject events etag is sometimes not set.

For example, here's an excerpt of an event I received from S3 via EventBridge:

{
    "version": "0",
    "id": "5afe384b-7341-a76f-25ab-29bac47a969a",
    "detail-type": "Object Deleted",
    "source": "aws.s3",
    "account": "REDACTED",
    "time": "REDACTED",
    "region": "eu-west-1",
    "resources": [
        "arn:aws:s3:::my_bucket"
    ],
    "detail": {
        "version": "0",
        "bucket": {
            "name": "my_bucket"
        },
        "object": {
            "key": "REDACTED",
            "sequencer": "REDACTED"
        },
        "request-id": "REDACTED",
        "requester": "REDACTED",
        "source-ip-address": "REDACTED",
        "reason": "DeleteObject",
        "deletion-type": "Permanently Deleted"
    }
}

The pydantic schema expects detail.object.etag to be present here; however, since this is a deletion event the field doesn't exist, so pydantic rejects the event as invalid.
I think the resolution needed here is just to make etag an Optional field in the pydantic schema.
In my code I've tested this by making a custom version of S3EventNotificationObjectModel where etag is Optional, and this does resolve my issue (a rough sketch of that override follows below).
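For illustration, this is roughly the override I'm using. It's only a sketch, not upstream code: the subclass name is my own, it assumes the base model is importable from aws_lambda_powertools.utilities.parser.models, and I've omitted the wiring that points the EventBridge detail model's object field at this subclass.

from typing import Optional

from aws_lambda_powertools.utilities.parser.models import S3EventNotificationObjectModel


class MyS3EventNotificationObjectModel(S3EventNotificationObjectModel):
    """Same fields as the upstream model, but etag may be absent (e.g. on DeleteObject)."""

    etag: Optional[str] = None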

Code snippet

from typing import Sequence

from pydantic import BaseModel, Json, ValidationError

from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.parser import parse
from aws_lambda_powertools.utilities.parser.models import (
    S3EventNotificationEventBridgeModel,
    SqsRecordModel,
)
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger()


class MySqsRecordModel(SqsRecordModel):
    """Model for an individual record from the SQS event we received.

    We expect this to be an EventBridge event, which contains an S3 event.
    """

    body: Json[S3EventNotificationEventBridgeModel]


class MySqsEventModel(BaseModel):
    """Model for the SQS event we expect to receive."""

    Records: Sequence[MySqsRecordModel]


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    """Handles the main entry point for the Lambda function."""
    logger.info("Received event", extra={"event": event, "context": context})

    # Attempt to parse the event according to the model we expect
    try:
        parsed_event: MySqsEventModel = parse(event=event, model=MySqsEventModel)
    except ValidationError as exc:
        logger.error("Failed to parse event", extra={"error": exc})
        raise

Possible Solution

Make etag an optional field in the pydantic schema so validation succeeds; a sketch of the change is below.
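Something along these lines, keeping the other fields exactly as they are in the current model (this is only a sketch of the proposed change, not an actual patch):

class S3EventNotificationObjectModel(BaseModel):
    key: str
    size: Optional[NonNegativeFloat] = None
    etag: Optional[str] = None  # not present on DeleteObject / "Object Deleted" events
    version_id: str = Field(None, alias="version-id")
    sequencer: Optional[str] = None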

Steps to Reproduce

In my own code I have an S3 bucket with events going to EventBridge. EventBridge forwards to an SQS queue, and a Lambda function pulls from that queue.

All that's needed to observe the issue is to delete an object from the bucket, which creates a DeleteObject event. This event goes from the S3 bucket -> EventBridge -> SQS queue -> Lambda, and the issue is visible in the Lambda function when it tries to parse the event against the schema in the code snippet. A minimal local reproduction, without the AWS plumbing, is sketched below.
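Something like this should reproduce the validation failure locally: it feeds a hand-built copy of the event above (redacted values replaced with obvious placeholders) straight into S3EventNotificationEventBridgeModel, assuming that model is importable from aws_lambda_powertools.utilities.parser.models.

from pydantic import ValidationError

from aws_lambda_powertools.utilities.parser import parse
from aws_lambda_powertools.utilities.parser.models import S3EventNotificationEventBridgeModel

# "Object Deleted" event as delivered by EventBridge; redacted fields replaced with placeholders
event = {
    "version": "0",
    "id": "5afe384b-7341-a76f-25ab-29bac47a969a",
    "detail-type": "Object Deleted",
    "source": "aws.s3",
    "account": "111122223333",
    "time": "2024-05-01T12:00:00Z",
    "region": "eu-west-1",
    "resources": ["arn:aws:s3:::my_bucket"],
    "detail": {
        "version": "0",
        "bucket": {"name": "my_bucket"},
        "object": {"key": "some/object/key", "sequencer": "0062E99A88DC407460"},
        "request-id": "PLACEHOLDER",
        "requester": "111122223333",
        "source-ip-address": "203.0.113.10",
        "reason": "DeleteObject",
        "deletion-type": "Permanently Deleted",
    },
}

try:
    parse(event=event, model=S3EventNotificationEventBridgeModel)
except ValidationError as exc:
    # On 2.37 this fails because detail.object.etag is declared as a required str
    print(exc)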

Powertools for AWS Lambda (Python) version

2.37

AWS Lambda function runtime

Python 3.10

Packaging format used

PyPI

Debugging logs

No response

Metadata

Labels

bug (Something isn't working), event_sources (Event Source Data Class utility), parser (Parser (Pydantic) utility)

Projects

Status

Shipped
