
RFC: Mechanism for end2end testing #1226

@mploski

Description


Is this related to an existing feature request or issue?

#1009

Which AWS Lambda Powertools utility does this relate to?

Other

Summary

Build a mechanism to run end-to-end tests on the Lambda Powertools library using real AWS services (Lambda, DynamoDB).
Initially, tests can be run manually by maintainers on a specific branch/commit_id to ensure an expected feature works.
Tests should be triggered in GitHub, but maintainers/contributors should also be able to run them in their local environment using their own AWS account.

Use case

Providing a mechanism to run end-to-end tests in a real, live environment allows us to discover a different class of problems that we cannot find by running unit or integration tests. For example, how the code base behaves in Lambda during cold and warm starts, event source misconfiguration, IAM permissions, etc. It also allows us to validate integration with external services (CloudWatch Logs, X-Ray) and ensure the final real user experience is what we expect.

When it should be used

  • Test a feature from the end user perspective
  • Test external integration with AWS services and the applied policies
  • Test event source configurations and/or combinations
  • Test whether our documented IAM permissions work as expected

Examples

  • Test if structured logs generated by the library are visible in AWS CloudWatch and have all necessary attributes
  • Test if a generated trace is visible in AWS X-Ray and has all provided metadata and annotations included
  • Test if a business metric generated by the library is visible in CloudWatch under the expected namespace and with the expected value

When integration test may be more appropriate instead

Integration testing would be a better fit when we can increase confidence by covering code base -> AWS service(s). These tests give us a faster feedback loop while reducing the permutations of E2E tests we might need to cover the end user perspective, permissions, etc.

Examples

  • Test if a pure Python function is idempotent and subsequent calls return the same value
  • Test whether Feature Flags can fetch schema
  • Test whether Parameters utility can fetch values from SSM, Secrets Manager, AppConfig, DynamoDB

Proposal

Overview

  • Use the CDK SDK (library) to describe infrastructure: currently a Lambda function + Powertools layer (a sketch follows this list)
  • Run tests in parallel and separate them by feature directory, e.g. metrics/, tracer/
  • Every feature group has separate infrastructure deployed, e.g. a metrics stack and a tracer stack
  • Enable running them from GitHub Actions and from a local machine against a specified AWS account
  • Clean up all resources at the end of the tests
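
As a rough illustration of the first bullet, here is a minimal sketch of how the infrastructure could be described with the CDK library in Python (assuming CDK v2). The stack name, handler paths, runtime, and how the layer asset is built are illustrative assumptions, not part of the final design.

from aws_cdk import App, Stack, aws_lambda as lambda_

class LoggerStack(Stack):
    """Hypothetical stack for the logger feature group: one function + Powertools layer."""

    def __init__(self, scope, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Layer built from the branch/commit under test (build step not shown)
        powertools_layer = lambda_.LayerVersion(
            self,
            "PowertoolsLayer",
            code=lambda_.Code.from_asset("layer_build/"),
            compatible_runtimes=[lambda_.Runtime.PYTHON_3_9],
        )

        # One function per handler file under tests/e2e/logger/handlers/
        lambda_.Function(
            self,
            "BasicHandler",
            runtime=lambda_.Runtime.PYTHON_3_9,
            handler="basic_handler.lambda_handler",
            code=lambda_.Code.from_asset("tests/e2e/logger/handlers/"),
            layers=[powertools_layer],
        )

app = App()
LoggerStack(app, "e2e-logger")
app.synth()  # writes the CloudFormation template to cdk.out/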

What an E2E test would look like

More details in the What's in a test section

[Screenshot: sample E2E test using pytest - see the What's in a test section]

Details

GitHub configuration
  1. Integrate GitHub with an AWS account using OIDC: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
  2. Specify the end-to-end test commands in GitHub Actions
Test setup

Tests will follow a common directory structure to allow us to parallelize infrastructure creation and test execution for all feature groups. Tests within a feature group, say tracer/, are run sequentially and independently from other feature groups.

Test fixtures will provide the necessary infrastructure and any relevant information tests need to succeed, e.g. Lambda Function ARN. Helper methods will also be provided to hide integration details and ease test creation.

When there are no more tests to run, infrastructure resources are automatically cleaned up, and the results are synchronized and returned to the user.
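
To make the fixture idea concrete, here is a minimal sketch of what such a module-scoped fixture could look like in tests/e2e/conftest.py. The Infrastructure class and its deploy/invoke_lambdas/destroy methods are hypothetical placeholders for utils/infrastructure.py, not an agreed API.

import pytest

from tests.e2e.utils.infrastructure import Infrastructure  # hypothetical module/class


@pytest.fixture(scope="module")
def execute_lambda(request):
    """Deploy the feature group's stack, invoke its functions, then clean up."""
    infra = Infrastructure(feature_group=request.module.__name__)
    try:
        outputs = infra.deploy()                    # e.g. returns Lambda ARNs
        execution = infra.invoke_lambdas(outputs)   # e.g. returns ARNs + execution times
        yield execution                             # consumed by tests and helpers
    finally:
        infra.destroy()                             # always remove the stack resources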

Directory Structure

tests/e2e
├── conftest.py
├── logger
│   ├── handlers
│   └── test_logger.py
├── metrics
│   ├── handlers
│   └── test_metrics.py
├── tracer
│   ├── handlers
│   └── test_tracer.py
└── utils
    ├── helpers.py
    └── infrastructure.py

Explanation

  1. We keep our end-to-end tests under the tests/e2e directory.
  2. We split tests into groups matching different Powertools features - in this example we have 3 groups (logger, metrics and tracer).
  3. Our test mechanism parallelizes test execution based on those groups.
  4. The utils directory has utilities to simplify writing tests, and an infrastructure module used for deploying infrastructure.

Note: In the first phase we may reuse the same infrastructure helper class in all test groups. If we decide we need more infrastructure configuration granularity per group, we will create subclasses of the core infra class and override the method responsible for describing infrastructure in CDK (sketched below).
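
A sketch of what that subclassing could look like, using purely illustrative class and method names; the base class only records what it would create, instead of calling CDK constructs, to keep the example self-contained:

class BaseInfrastructure:
    """Core infrastructure shared by all feature groups (Lambda functions + Powertools layer)."""

    def __init__(self, feature_group: str) -> None:
        self.feature_group = feature_group
        self.resources: list = []

    def create_resources(self) -> None:
        # In the real module this would define CDK constructs; here we only
        # record what would be created.
        self.resources += ["powertools_layer", f"{self.feature_group}_functions"]


class MetricsInfrastructure(BaseInfrastructure):
    def create_resources(self) -> None:
        super().create_resources()
        # Example of per-group customization: an extra IAM policy or event source
        self.resources.append("cloudwatch_put_metric_data_policy")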

Reasoning

Keeping the infrastructure creation module separate from the test groups helps in reusing infrastructure across multiple tests within a feature group. It also allows us to benchmark tests and infra separately in the future, and it helps contributors write tests without being expected to dive deep into the infra creation mechanism.

General Flow Diagram
graph TD
    A([Start]) -->|Run e2e tests| B[Find all test groups AND parallelize execution]
    B -->|group 1|C1[Deploy infrastructure]
    B --> |group 2|C2[Deploy infrastructure]
    C1 --> F1{Is another test available?}
    F1 --> |no|G1[Destroy infrastructure]
    F1 --> |yes|I1[Run test]
    I1 -->|Find next test|F1
    C2 --> F2{Is another test available?}
    F2 --> |no|G2[Destroy infrastructure]
    F2 --> |yes|I2[Run test]
    I2 -->|Find next test|F2
    G1 --> K[Return results]
    G2 --> K
    K -->L([Stop])
What's in a test

Sample test using Pytest as our test runner

[Screenshot: sample E2E test using pytest; a sketch is reproduced below]

  1. The execute_lambda fixture is responsible for deploying infrastructure and running our Lambda functions. It yields back their ARNs, execution time, etc., which can be used by helper functions, the tests themselves, and possibly other fixtures
  2. The helpers.get_logs function fetches logs from CloudWatch Logs
  3. Tests follow the GIVEN/WHEN/THEN structure used in other parts of the project
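
A sketch of what the screenshot shows, reconstructed from the points above. The attribute names asserted on, the shape of the fixture's output, and the get_logs signature are illustrative assumptions rather than the final API:

from tests.e2e.utils import helpers  # hypothetical helpers module


def test_basic_lambda_logs_visible(execute_lambda):
    # GIVEN a Lambda function instrumented with Powertools Logger was invoked
    lambda_arn = execute_lambda.arns["basic_handler"]      # fixture output shape is assumed
    start_time = execute_lambda.execution_time

    # WHEN we fetch its structured logs from CloudWatch Logs
    logs = helpers.get_logs(lambda_arn=lambda_arn, start_time=start_time)

    # THEN log entries are present and contain the expected attributes
    assert logs, "expected at least one structured log entry"
    assert all("level" in entry and "message" in entry and "service" in entry for entry in logs)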

Out of scope

  1. Automated tests run on PR opened/comments/labels
  2. Extensive set of tests

Potential challenges

Multiple Lambda Layers creation

By using the pytest-xdist plugin we can easily parallelize tests per group, creating infrastructure and running tests in parallel. However, this leads to the Powertools Lambda layer being created 3 times, which puts unnecessary pressure on CPU/RAM/IOPS. We should optimize the solution to create the layer only once and then run the parallelized tests with a reference to that layer; one possible mitigation is sketched below.
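
A sketch of that mitigation, assuming we keep pytest-xdist: guard the layer build with a file lock so only the first worker builds it and the others reuse the result (the pattern the pytest-xdist docs suggest for one-time session work). build_powertools_layer is a placeholder for the actual build step.

import json

import pytest
from filelock import FileLock


def build_powertools_layer() -> str:
    """Placeholder for the actual layer build (e.g. pip install into a zip)."""
    return "layer_build/powertools-layer.zip"


@pytest.fixture(scope="session")
def powertools_layer_asset(tmp_path_factory, worker_id) -> str:
    if worker_id == "master":
        # Not running under xdist: build directly
        return build_powertools_layer()

    # Under xdist: share the result between workers via a lock-protected file
    root_tmp_dir = tmp_path_factory.getbasetemp().parent
    marker = root_tmp_dir / "powertools_layer.json"
    with FileLock(str(marker) + ".lock"):
        if marker.is_file():
            return json.loads(marker.read_text())["layer_path"]
        layer_path = build_powertools_layer()
        marker.write_text(json.dumps({"layer_path": layer_path}))
        return layer_path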

CDK owned S3 bucket

As a CDK prerequisite, we bootstrap the account for CDK usage by issuing cdk bootstrap. Since the S3 bucket created by CDK doesn't have a lifecycle policy to remove old artifacts, we need to customize the default template used by the CDK bootstrap command and attach it to the feature README file with a good description of how to use it.

Dependencies and Integrations

CDK

AWS CDK is responsible for synthesizing the provided code into a CloudFormation stack, not for deployment. We will use the AWS SDK to deploy the generated CloudFormation stack instead.

During evaluation (see: Alternative solutions section), this approach offered the best compromise between deployment speed, infrastructure code readability, and maintainability.
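
A sketch of the synthesize-then-deploy flow, assuming the stack object from the Overview sketch above. Asset publishing, error handling, and template size limits are intentionally left out:

import json

import boto3
from aws_cdk import App


def deploy_synthesized_stack(app: App, stack_name: str) -> dict:
    # CDK only synthesizes here: it turns the Python constructs into a template
    template = app.synth().get_stack_by_name(stack_name).template

    # The AWS SDK performs the actual deployment of that template
    cfn = boto3.client("cloudformation")
    cfn.create_stack(
        StackName=stack_name,
        TemplateBody=json.dumps(template),
        Capabilities=["CAPABILITY_IAM"],
    )
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)

    # Stack outputs (e.g. Lambda ARNs) are handed back to fixtures/tests
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}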

Helper functions

Helper functions will be testing utilities that integrate with the AWS services tests need, hiding unnecessary complexity. A sketch of the first example follows the list below.

Examples

  1. Fetch structured logs from Amazon CloudWatch Logs
  2. Fetch newly created metrics in Amazon CloudWatch Metrics
  3. Fetch newly emitted traces in AWS X-Ray
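
A sketch of the first helper, fetching structured logs from CloudWatch Logs with boto3. The function name, parameters, and the decision to skip non-JSON lines are assumptions; retries and pagination are omitted for brevity:

import json
from datetime import datetime

import boto3


def get_logs(lambda_arn: str, start_time: datetime) -> list:
    """Return parsed JSON log lines emitted by the function since start_time."""
    function_name = lambda_arn.split(":")[-1]  # arn:aws:lambda:<region>:<account>:function:<name>
    client = boto3.client("logs")
    response = client.filter_log_events(
        logGroupName=f"/aws/lambda/{function_name}",
        startTime=int(start_time.timestamp() * 1000),  # CloudWatch expects epoch millis
    )
    parsed = []
    for event in response["events"]:
        try:
            parsed.append(json.loads(event["message"]))
        except json.JSONDecodeError:
            continue  # skip non-structured lines such as START/END/REPORT
    return parsed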

Alternative solutions

  1. Use the CDK CLI to deploy infrastructure directly, instead of custom code that synthesizes the CDK code, deploys assets, and runs the AWS CloudFormation deployment - dropped to avoid running the CLI from a subprocess (added latency) and to avoid additional Node.js dependencies

  2. Write an AWS CodeBuild pipeline on the AWS account side that would run tests stored somewhere outside of the project, with no configuration or tests exposed in the Powertools repo - dropped due to the initial assumption that we want end-to-end tests to be part of the project, to increase visibility, and to allow contributors to run those tests on their own during the development phase

  3. Instead of using CloudFormation with multiple Lambdas deployed, I also considered using the hot swap mechanism - either via CDK or a direct call. Based on the latency measured, CloudFormation seems the fastest option. Attaching my findings.

Additional material

Acknowledgment
