Description
Is this related to an existing feature request or issue?
Which AWS Lambda Powertools utility does this relate to?
Other
Summary
Build a mechanism to run end-to-end tests on the Lambda Powertools library using real AWS services (Lambda, DynamoDB).
Initially, tests can be run manually by maintainers on a specific branch/commit_id to ensure the expected feature works.
Tests should be triggered in GitHub, and maintainers/contributors should also be able to run them in their local environment using their own AWS account.
Use case
Providing a mechanism to run end-to-end tests in a real, live environment allows us to discover a different class of problems we cannot find by running unit or integration tests, for example how the code base behaves in Lambda during cold and warm starts, event source misconfiguration, IAM permissions, etc. It also allows us to validate integration with external services (CloudWatch Logs, X-Ray) and ensure the final, real user experience is what we expect.
When it should be used
- Test feature from end user perspective
- Test external integration with AWS services and the applied policies
- Test event source configurations and/or combinations
- Test whether our documented IAM permissions work as expected
Examples
- Test if structured logs generated by the library are visible in Amazon CloudWatch Logs and have all necessary attributes
- Test if a generated trace is visible in AWS X-Ray and includes all provided metadata and annotations
- Test if a business metric generated by the library is visible in CloudWatch under the expected namespace and with the expected value
When integration test may be more appropriate instead
Integration testing is a better fit when we can increase confidence by covering code base -> AWS service(s). Such tests give us a faster feedback loop while reducing the permutations of E2E tests we might need to cover the end-user perspective, permissions, etc.
Examples
- Test if pure Python function is idempotent and subsequent calls return the same value
- Test whether Feature Flags can fetch schema
- Test whether Parameters utility can fetch values from SSM, Secrets Manager, AppConfig, DynamoDB
Proposal
Overview
- Use the CDK library (not the CLI) to describe infrastructure: currently Lambda + Powertools layer
- Run tests in parallel and separate them by feature directory e.g. metrics/, tracer/
- Every feature group has separate infrastructure deployed, e.g. a metrics stack, a tracer stack
- Enable running them from GitHub Actions and from local machine on specified AWS Account
- Clean up all resources at the end of the test
What an E2E test would look like
More details in the "What's in a test" section below.
Details
GitHub configuration
- Integrate GitHub with AWS Account using OIDC: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
- Specify end-to-end test commands in GitHub Actions
Test setup
Tests will follow a common directory structure to allow us to parallelize infrastructure creation and test execution across all feature groups. Tests within a feature group, say tracer/, run sequentially and independently from other feature groups.
Test fixtures will provide the necessary infrastructure and any relevant information tests need to succeed, e.g. Lambda Function ARN. Helper methods will also be provided to hide integration details and ease test creation.
Once there are no more tests to run, infrastructure resources are automatically cleaned up and results are collected and returned to the user.
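A minimal sketch of the fixture-driven setup described above, assuming a hypothetical Infrastructure class in utils/infrastructure.py with deploy/delete methods (names are illustrative, not the final API):

```python
# tests/e2e/conftest.py - illustrative sketch; class and method names are assumptions
import pytest

from tests.e2e.utils.infrastructure import Infrastructure  # hypothetical core infra class


@pytest.fixture(scope="module")
def execute_lambda(request):
    """Deploy the feature group's stack once, yield its outputs, then tear it down."""
    # Derive the feature group (e.g. "metrics", "tracer") from the test module location
    feature_group = request.fspath.dirname.split("/")[-1]

    infra = Infrastructure(feature=feature_group)
    outputs = infra.deploy()  # e.g. {"FunctionName": "...", "FunctionArn": "...", "execution_time": ...}
    try:
        yield outputs
    finally:
        # Clean up all resources once the last test in the group has finished
        infra.delete()
```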
Directory Structure
tests/e2e
├── conftest.py
├── logger
│ ├── handlers
│ └── test_logger.py
├── metrics
│ ├── handlers
│ └── test_metrics.py
├── tracer
│ ├── handlers
│ └── test_tracer.py
└── utils
├── helpers.py
└── infrastructure.py
Explanation
- We keep our end-to-end tests under the tests/e2e directory.
- We split tests into groups matching different Powertools features - in this example we have three groups (logger, metrics and tracer).
- Our test mechanism parallelizes test execution based on these groups.
- The utils directory has utilities to simplify writing tests, and an infrastructure module used for deploying infrastructure.
Note: In the first phase we may reuse the same infrastructure helper class in all test groups. If we decide we need more infrastructure configuration granularity per group, we will create sub-classes of the core infra class and override the method responsible for describing the infrastructure in CDK, as sketched below.
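If that granularity is needed, the subclassing could look roughly like this; BaseInfrastructure, its stack attribute, and create_resources are assumed names for the core infra class and the CDK-describing method:

```python
# Hypothetical sketch: a per-group stack overriding the CDK description method
from aws_cdk import aws_dynamodb as dynamodb

from tests.e2e.utils.infrastructure import BaseInfrastructure  # assumed core infra class


class MetricsInfrastructure(BaseInfrastructure):
    def create_resources(self) -> None:
        """Extend the base CDK definition with metrics-specific resources."""
        super().create_resources()  # Lambda function + Powertools layer from the base class
        # Extra resources only this feature group needs, e.g. a DynamoDB table
        dynamodb.Table(
            self.stack,
            "MetricsTestTable",
            partition_key=dynamodb.Attribute(name="id", type=dynamodb.AttributeType.STRING),
        )
```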
Reasoning
Keeping the infrastructure creation module separate from the test groups helps reuse infrastructure across multiple tests within a feature group. It also allows us to benchmark tests and infrastructure separately in the future, and it helps contributors write tests without having to dive deep into the infrastructure creation mechanism.
General Flow Diagram
graph TD
A([Start]) -->|Run e2e tests| B[Find all test groups AND parallelize execution]
B -->|group 1|C1[Deploy infrastructure]
B --> |group 2|C2[Deploy infrastructure]
C1 --> F1{Is another test available?}
F1 --> |no|G1[Destroy infrastructure]
F1 --> |yes|I1[Run test]
I1 -->|Find next test|F1
C2 --> F2{Is another test available?}
F2 --> |no|G2[Destroy infrastructure]
F2 --> |yes|I2[Run test]
I2 -->|Find next test|F2
G1 --> K[Return results]
G2 --> K
K -->L([Stop])
What's in a test
Sample test using pytest as our test runner:
- The execute_lambda fixture is responsible for deploying infrastructure and running our Lambda functions. It yields back their ARNs, execution time, etc., which can be used by helper functions, the tests themselves, and possibly other fixtures
- helpers.get_logs fetches logs from CloudWatch Logs
- Tests follow the GIVEN/WHEN/THEN structure used in other parts of the project
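A hedged sketch of such a test; the execute_lambda fixture and helpers.get_logs follow the proposal above, but their exact signatures and returned keys are assumptions:

```python
# tests/e2e/logger/test_logger.py - illustrative sketch; signatures are assumptions
from tests.e2e.utils import helpers


def test_structured_logs_visible_in_cloudwatch(execute_lambda):
    # GIVEN a Lambda function instrumented with Powertools Logger was invoked
    function_name = execute_lambda["FunctionName"]
    execution_time = execute_lambda["execution_time"]

    # WHEN we fetch its structured logs from CloudWatch Logs
    logs = helpers.get_logs(function_name=function_name, start_time=execution_time)

    # THEN the logs contain the expected structured attributes
    assert logs, "expected at least one structured log record"
    assert all("level" in log and "timestamp" in log and "service" in log for log in logs)
```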
Out of scope
- Automated tests run on PR opened/comments/labels
- Extensive set of tests
Potential challenges
Multiple Lambda Layers creation
By using the pytest-xdist plugin we can easily parallelise tests per group, creating infrastructure and running tests in parallel. However, this leads to the Powertools Lambda layer being created three times, which puts unnecessary pressure on CPU/RAM/IOPS. We should optimise the solution to create the layer only once and then run the parallelised tests with a reference to this layer.
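One possible optimisation, sketched here under the assumption that we adopt the FileLock pattern from the pytest-xdist documentation for session-wide, build-once resources; build_powertools_layer is a hypothetical helper:

```python
# conftest.py sketch: build the Powertools layer only once across xdist workers
import json

import pytest
from filelock import FileLock  # used by the pytest-xdist "run once per session" pattern


def build_powertools_layer() -> str:
    """Hypothetical helper: package and publish the Powertools layer, returning its ARN."""
    ...


@pytest.fixture(scope="session")
def powertools_layer_arn(tmp_path_factory, worker_id):
    if worker_id == "master":
        # Not running under xdist: build the layer directly
        return build_powertools_layer()

    # Under xdist: serialize layer creation via a lock file shared by all workers
    root_tmp_dir = tmp_path_factory.getbasetemp().parent
    cache_file = root_tmp_dir / "layer.json"
    with FileLock(str(cache_file) + ".lock"):
        if cache_file.is_file():
            return json.loads(cache_file.read_text())["arn"]
        arn = build_powertools_layer()
        cache_file.write_text(json.dumps({"arn": arn}))
        return arn
```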
CDK owned S3 bucket
As a CDK prerequisite, we bootstrap the account for CDK usage by issuing cdk bootstrap. Since the S3 bucket created by CDK doesn't have a lifecycle policy to remove old artefacts, we need to customize the default template used by the cdk bootstrap command and attach it to the feature README file with a good description of how to use it.
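For illustration only: the intended lifecycle rule is equivalent to applying a configuration like the one below to the CDK assets bucket with boto3 (the proposal itself bakes it into the customized bootstrap template; bucket name and retention period are placeholders):

```python
# Illustrative only - the proposal customizes the bootstrap template instead
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="cdk-hnb659fds-assets-123456789012-eu-west-1",  # placeholder bootstrap bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-e2e-artefacts",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": 30},  # placeholder retention period
            }
        ]
    },
)
```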
Dependencies and Integrations
CDK
AWS CDK is responsible for synthesizing the provided code into a CloudFormation template, not for deployment. We will use the AWS SDK to deploy the generated CloudFormation stack instead.
During evaluation (see the Alternative solutions section), this approach offered the best compromise between deployment speed, infrastructure code readability, and maintainability.
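A rough sketch of this synth-then-deploy split, with placeholder stack contents and names (not the final infrastructure module):

```python
# Sketch: synthesize a stack with the CDK library, deploy the template with boto3
import json

import boto3
from aws_cdk import App, Stack, aws_lambda as lambda_

STACK_NAME = "powertools-e2e-metrics"  # placeholder


def synthesize_template() -> str:
    app = App()
    stack = Stack(app, STACK_NAME)
    lambda_.Function(
        stack,
        "Handler",
        runtime=lambda_.Runtime.PYTHON_3_9,
        handler="index.lambda_handler",
        # Inline code keeps the sketch free of CDK asset-publishing concerns
        code=lambda_.Code.from_inline("def lambda_handler(event, context):\n    return 'ok'"),
    )
    return json.dumps(app.synth().get_stack_by_name(STACK_NAME).template)


def deploy_template(template_body: str) -> None:
    cfn = boto3.client("cloudformation")
    cfn.create_stack(
        StackName=STACK_NAME,
        TemplateBody=template_body,
        Capabilities=["CAPABILITY_IAM"],  # the Function construct creates an IAM role
    )
    cfn.get_waiter("stack_create_complete").wait(StackName=STACK_NAME)
```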
Helper functions
Helper functions will be testing utilities that integrate with the AWS services tests need, hiding unnecessary complexity.
Examples
- Fetch structured logs from Amazon CloudWatch Logs
- Fetch newly created metrics in Amazon CloudWatch Metrics
- Fetch newly emitted traces in AWS X-Ray
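A hedged sketch of the first helper, assuming structured logs are emitted as JSON lines to the function's default log group (the exact signature is not final):

```python
# tests/e2e/utils/helpers.py - illustrative sketch
import json
from typing import List

import boto3


def get_logs(function_name: str, start_time: int) -> List[dict]:
    """Return parsed structured log records emitted since start_time (epoch milliseconds)."""
    client = boto3.client("logs")
    response = client.filter_log_events(
        logGroupName=f"/aws/lambda/{function_name}",
        startTime=start_time,
    )
    logs = []
    for event in response["events"]:
        try:
            logs.append(json.loads(event["message"]))
        except json.JSONDecodeError:
            # Skip non-JSON platform lines such as START/END/REPORT
            continue
    return logs
```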
Alternative solutions
- Use the CDK CLI to deploy infrastructure directly, instead of custom code to synthesise the CDK code, deploy assets and run the AWS CloudFormation deployment - dropped to avoid running the CLI from a subprocess (added latency) and to avoid additional Node.js dependencies
- Write an AWS CodeBuild pipeline on the AWS account side that would run tests stored somewhere outside of the project, with no configuration and tests exposed in the Powertools repo - dropped due to the initial assumption that we want end-to-end tests to be part of the project, increase visibility, and allow contributors to run those tests during the development phase on their own
- Instead of using CloudFormation with multiple Lambdas deployed, I also considered using the hot swap mechanism - either via CDK or a direct call. Based on the measured latency, CloudFormation seems the fastest option. Attaching my findings.
Additional material
Acknowledgment
- This feature request meets Lambda Powertools Tenets
- Should this be considered in other Lambda Powertools languages? i.e. Java, TypeScript