feat(data_masking): add new sensitive data masking utility #2197
Merged
Changes from 28 commits
72 commits
2d5bfcc Added logic for sensitive data masking and unit tests (seshubaws)
b2e4d10 Merge branch 'develop' into develop (leandrodamascena)
7d65c7d Merge branch 'develop' into develop (leandrodamascena)
4b0d0c0 Restructured into smaller files, fixed linting errors (seshubaws)
b34a1ca Fix linting errors (seshubaws)
092c165 Merge branch 'awslabs:develop' into develop (seshubaws)
4ec6603 Merge branch 'develop' of https://github.com/seshubaws/aws-lambda-pow… (seshubaws)
7b13c6f Merge branch 'awslabs:develop' into develop (seshubaws)
c6ec149 Lint tests (seshubaws)
8bc8c02 Merge branch 'develop' of https://github.com/seshubaws/aws-lambda-pow… (seshubaws)
21759b5 Fix mypy errors (seshubaws)
6a2e98a Fixing tests (seshubaws)
d1b6690 Merge branch 'develop' into develop (leandrodamascena)
d39d956 mypy fixes (seshubaws)
2157815 Merge branch 'develop' of https://github.com/seshubaws/aws-lambda-pow… (seshubaws)
97c5b85 Fixed passing in context for aws encryption sdk provider (seshubaws)
f722e70 Use d pytest library for unit testing (seshubaws)
d5f014b Raise error for unimplemented dm provider (seshubaws)
bef87e0 Fix context for encryption sdk provider (seshubaws)
65eb7e3 Add type annotation to context (seshubaws)
fb3fbc6 Fix context (seshubaws)
ec9f49f Fixing tests (seshubaws)
98ba4d9 Added markdown-lint to pre-commit yaml (seshubaws)
f48c2f5 Merging from develop + creating extra dependencies (leandrodamascena)
3ad5046 Merging from develop + creating extra dependencies (leandrodamascena)
5b7e256 Merging from develop + creating extra dependencies (leandrodamascena)
b9053d9 Revisions per comments (seshubaws)
0193ee6 Added performance benchmarking tests (seshubaws)
22f0b46 Update aws_lambda_powertools/utilities/data_masking/providers/aws_enc… (seshubaws)
ece4643 Update aws_lambda_powertools/utilities/data_masking/providers/aws_enc… (seshubaws)
8299039 Removed args and ItsDangerous and commented on tests (seshubaws)
c36deb5 Merge branch 'develop' of https://github.com/seshubaws/aws-lambda-pow… (seshubaws)
5423f7f Merge branch 'develop' of https://github.com/aws-powertools/powertool… (seshubaws)
27eca17 Added functional tests and put input data in separate file (seshubaws)
876f4f7 Merge branch 'develop' into develop (heitorlessa)
fe37c50 Applied patch to update lock to latest range deps (seshubaws)
2eab50b Made unit tests more legible, removed parameterization (seshubaws)
57a5a3a Adding E2E tests (wip) (seshubaws)
8aabc7f Added data_masking constants, made into BaseProvider and added types (seshubaws)
bbeaa4e Add check for encryption_context in Encryption SDK (seshubaws)
5b794f7 Fixing enc_context e2e tests (seshubaws)
2955c9c Added test to encrypt&decrypt from logs in e2e tests (seshubaws)
b15b866 Added custom exception for enc_context mismatch, used pytest fixtures… (seshubaws)
ee3dddc Added some docstrings and typing (seshubaws)
a79f3df Added test for using DataMasking in a lambda handler, wip due to inco… (seshubaws)
7483d46 Merge remote-tracking branch 'upstream/develop' into develop (seshubaws)
7883a48 Revised singleton class to allow for one instance per different confi… (seshubaws)
7127c9c Removed itsdangerous dependencies (seshubaws)
01885a5 Added serializer for aws enc sdk (seshubaws)
5b83b66 chore: fix merge conflict, remove itsdangerous leftovers (#2) (heitorlessa)
371ea05 Building data within func tests instead of using setup.py (seshubaws)
b3d123d Updated json serializer for aws encrypt sdk to return original data type (seshubaws)
c0c3f2f Added ability for user input custom json de/serializer in base class (seshubaws)
c5233af Apply patch for use latest manylinux (seshubaws)
bcc735a Added KMS permissions to lambda handler for e2e tests (seshubaws)
eee4c86 Clarified variable names and documented logic (wip still need to disc… (seshubaws)
ab15acd Polished var names, error strings, documentation, etc (seshubaws)
73ae382 Added a stack for load testing data masking and added artillery confi… (seshubaws)
39a835e Added 1024MB funcs and load tested with them (seshubaws)
da24bcf Removed orchestrator function and test since same test in E2E (seshubaws)
970df5c Removed singleton class from code and load and e2e tests (seshubaws)
487dc0e Merge from upstream (seshubaws)
069aa94 Fix linting errors (seshubaws)
ee325f4 Fix mypy errors (seshubaws)
49afeed Modified data masking test names (seshubaws)
73df808 Fix dummy KMS key for correct parsing (seshubaws)
1ea59f0 Bumping cryptography library (leandrodamascena)
ba534ed Setting default region to avoid HTTP connection (leandrodamascena)
ceb6131 Removing user agent tracking (leandrodamascena)
bf0e4ed Reverting (leandrodamascena)
6a064b1 Creating a specific provider instead a client to avoid any http call … (leandrodamascena)
c01ea35 Merge branch 'develop' into develop (leandrodamascena)
Empty file.
@@ -0,0 +1,58 @@
import json
from typing import Union

from aws_lambda_powertools.utilities.data_masking.provider import Provider


class DataMasking:
    def __init__(self, provider=None):
        if provider is None:
            self.provider = Provider()
        else:
            self.provider = provider

    def encrypt(self, data, fields=None, **kwargs):
        return self._apply_action(data, fields, self.provider.encrypt, **kwargs)

    def decrypt(self, data, fields=None, **kwargs):
        return self._apply_action(data, fields, self.provider.decrypt, **kwargs)

    def mask(self, data, fields=None, **kwargs):
        return self._apply_action(data, fields, self.provider.mask, **kwargs)

    def _apply_action(self, data, fields, action, *args, **kwargs):
        if fields is not None:
            return self._apply_action_to_fields(data, fields, action, *args, **kwargs)
        else:
            return action(data, *args, **kwargs)

    def _apply_action_to_fields(self, data: Union[dict, str], fields, action, *args, **kwargs) -> str:
        if fields is None:
            raise ValueError("No fields specified.")

        if isinstance(data, str):
            # Parse JSON string as dictionary
            my_dict_parsed = json.loads(data)
        elif isinstance(data, dict):
            # Turn into json string so everything has quotes around it
            my_dict_parsed = json.dumps(data)
            # Turn back into dict so can parse it
            my_dict_parsed = json.loads(my_dict_parsed)
        else:
            raise TypeError(
                "Unsupported data type. The 'data' parameter must be a dictionary or a JSON string "
                "representation of a dictionary."
            )

        for field in fields:
            if not isinstance(field, str):
                field = json.dumps(field)
            keys = field.split(".")

            curr_dict = my_dict_parsed
            for key in keys[:-1]:
                curr_dict = curr_dict[key]
            valtochange = curr_dict[(keys[-1])]
            curr_dict[keys[-1]] = action(valtochange, *args, **kwargs)

        return my_dict_parsed
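The dotted-path handling in `_apply_action_to_fields` can be sketched in isolation. This is a minimal illustration rather than the PR's code: `apply_to_fields` and the sample payload are hypothetical names, and `str.upper` stands in for a provider action such as `mask` or `encrypt`.

```python
import json
from typing import Any, Callable, Iterable, Union


def apply_to_fields(
    data: Union[dict, str],
    fields: Iterable[str],
    action: Callable[[Any], Any],
) -> dict:
    """Apply `action` to each dotted field path and return the updated dict."""
    if isinstance(data, str):
        parsed = json.loads(data)
    elif isinstance(data, dict):
        # Round-trip through JSON to get a deep copy with JSON-compatible types.
        parsed = json.loads(json.dumps(data))
    else:
        raise TypeError("data must be a dict or a JSON string encoding a dict")

    for field in fields:
        *parents, leaf = field.split(".")
        node = parsed
        for key in parents:
            node = node[key]  # raises KeyError if the path does not exist
        node[leaf] = action(node[leaf])
    return parsed


# Hypothetical payload; str.upper is a stand-in for a provider action.
payload = {"name": "jane", "address": {"city": "lisbon", "zip": "1000"}}
result = apply_to_fields(payload, ["name", "address.city"], str.upper)
print(result)
```

Because the input dict is copied through a JSON round trip, the caller's `payload` is left untouched and only the returned dict carries the transformed values.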
@@ -0,0 +1,26 @@
from abc import abstractmethod
from collections.abc import Iterable

from aws_lambda_powertools.shared.constants import DATA_MASKING_STRING


class Provider:
    """
    When you try to create an instance of a subclass that does not implement the encrypt method,
    you will get a NotImplementedError with a message that says the method is not implemented:
    """

    @abstractmethod
    def encrypt(self, data):
        raise NotImplementedError("Subclasses must implement encrypt()")

    @abstractmethod
    def decrypt(self, data):
        raise NotImplementedError("Subclasses must implement decrypt()")

    def mask(self, data):
        if isinstance(data, (str, dict, bytes)):
            return DATA_MASKING_STRING
        elif isinstance(data, Iterable):
            return type(data)([DATA_MASKING_STRING] * len(data))
        return DATA_MASKING_STRING
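To see what the `mask` branches return for different input types, the logic can be reproduced standalone. This is a sketch under assumptions: `MASK` is a placeholder value chosen for illustration (the real constant is imported from `aws_lambda_powertools.shared.constants` and its value is not shown in this diff), and the module-level `mask` function is a hypothetical copy of the method.

```python
from collections.abc import Iterable
from typing import Any

# Assumed placeholder; the real DATA_MASKING_STRING is defined elsewhere.
MASK = "*****"


def mask(data: Any) -> Any:
    """Replace a value, or each item of a sized iterable, with MASK."""
    if isinstance(data, (str, dict, bytes)):
        return MASK
    if isinstance(data, Iterable):
        # Rebuild the same container type with one placeholder per item.
        return type(data)([MASK] * len(data))
    return MASK


masked_scalar = mask(1234)
masked_text = mask("secret")
masked_list = mask([1, 2, 3])
masked_tuple = mask(("a", "b"))
masked_set = mask({1, 2, 3})  # identical placeholders collapse to one element
print(masked_scalar, masked_text, masked_list, masked_tuple, masked_set)
```

Lists and tuples keep their length, while a set shrinks to a single element because every placeholder is equal. Iterables without a length, such as generators, would raise `TypeError` at `len(data)` in this sketch.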
Empty file.
56 changes: 56 additions & 0 deletions
aws_lambda_powertools/utilities/data_masking/providers/aws_encryption_sdk.py
@@ -0,0 +1,56 @@
import base64
from typing import Any, Dict, List, Optional, Union

import botocore
from aws_encryption_sdk import (
    CachingCryptoMaterialsManager,
    EncryptionSDKClient,
    LocalCryptoMaterialsCache,
    StrictAwsKmsMasterKeyProvider,
)

from aws_lambda_powertools.utilities.data_masking.provider import Provider


class SingletonMeta(type):
    """Metaclass to cache class instances to optimize encryption"""

    _instances: Dict["AwsEncryptionSdkProvider", Any] = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            instance = super().__call__(*args, **kwargs)
            cls._instances[cls] = instance
        return cls._instances[cls]


CACHE_CAPACITY: int = 100
MAX_ENTRY_AGE_SECONDS: float = 300.0
MAX_MESSAGES: int = 200
# NOTE: You can also set max messages/bytes per data key


class AwsEncryptionSdkProvider(Provider, metaclass=SingletonMeta):
    cache = LocalCryptoMaterialsCache(CACHE_CAPACITY)
    session = botocore.session.Session()

    def __init__(self, keys: List[str], client: Optional[EncryptionSDKClient] = None) -> None:
        self.client = client or EncryptionSDKClient()
        self.keys = keys
        self.key_provider = StrictAwsKmsMasterKeyProvider(key_ids=self.keys, botocore_session=self.session)
        self.cache_cmm = CachingCryptoMaterialsManager(
            master_key_provider=self.key_provider,
            cache=self.cache,
            max_age=MAX_ENTRY_AGE_SECONDS,
            max_messages_encrypted=MAX_MESSAGES,
        )

    def encrypt(self, data: Union[bytes, str], *args, **kwargs) -> str:
        ciphertext, _ = self.client.encrypt(source=data, materials_manager=self.cache_cmm, *args, **kwargs)
        ciphertext = base64.b64encode(ciphertext).decode()
        return ciphertext

    def decrypt(self, data: str, *args, **kwargs) -> bytes:
        ciphertext_decoded = base64.b64decode(data)
        ciphertext, _ = self.client.decrypt(source=ciphertext_decoded, key_provider=self.key_provider, *args, **kwargs)
        return ciphertext
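The behavior of a class-keyed singleton metaclass like `SingletonMeta` can be shown without any AWS dependency. In this sketch, `FakeProvider` and its key names are hypothetical stand-ins for a provider that is expensive to construct; only the metaclass mirrors the one in the diff.

```python
from typing import Any, Dict, List


class SingletonMeta(type):
    """Cache one instance per class, ignoring constructor arguments."""

    _instances: Dict[type, Any] = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class FakeProvider(metaclass=SingletonMeta):
    """Hypothetical stand-in for a provider that is expensive to build."""

    def __init__(self, keys: List[str]) -> None:
        self.keys = keys


first = FakeProvider(keys=["key-a"])
second = FakeProvider(keys=["key-b"])  # __init__ is not run a second time

same_instance = first is second
print(same_instance, second.keys)
```

Since the cache is keyed only by class, the second call silently returns the instance built with the first set of keys. That limitation may be what the later commits in this PR addressed, first by allowing one instance per configuration and then by removing the singleton altogether.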
53 changes: 53 additions & 0 deletions
aws_lambda_powertools/utilities/data_masking/providers/itsdangerous.py
@@ -0,0 +1,53 @@
from itsdangerous.url_safe import URLSafeSerializer

from aws_lambda_powertools.utilities.data_masking.provider import Provider


class ItsDangerousProvider(Provider):
    def __init__(
        self,
        keys,
        salt=None,
        serializer=None,
        serializer_kwargs=None,
        signer=None,
        signer_kwargs=None,
        fallback_signers=None,
    ):
        self.keys = keys
        self.salt = salt
        self.serializer = serializer
        self.serializer_kwargs = serializer_kwargs
        self.signer = signer
        self.signer_kwargs = signer_kwargs
        self.fallback_signers = fallback_signers

    def encrypt(self, data):
        if data is None:
            return data

        serialized = URLSafeSerializer(
            self.keys,
            salt=self.salt,
            serializer=None,
            serializer_kwargs=None,
            signer=None,
            signer_kwargs=None,
            fallback_signers=None,
        )
        return serialized.dumps(data)

    def decrypt(self, data):
        if data is None:
            return data

        serialized = URLSafeSerializer(
            self.keys,
            salt=self.salt,
            serializer=None,
            serializer_kwargs=None,
            signer=None,
            signer_kwargs=None,
            fallback_signers=None,
        )
        return serialized.loads(data)
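One point worth keeping in mind when reading this provider: itsdangerous signs data so tampering can be detected, but it does not encrypt it, so `encrypt` here produces a token whose payload remains readable. The standard-library sketch below illustrates that difference. It is not the itsdangerous token format; `sign`, `verify`, and the sample secret are hypothetical names used only for illustration.

```python
import base64
import hashlib
import hmac
import json


def sign(payload: dict, secret: bytes) -> str:
    """Return '<base64 payload>.<hex signature>'; the payload is NOT encrypted."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode())
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{signature}"


def verify(token: str, secret: bytes) -> dict:
    """Check the signature and return the payload, or raise ValueError."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch")
    return json.loads(base64.urlsafe_b64decode(body))


token = sign({"card": "4111"}, b"secret")

# Anyone holding the token can read the payload without knowing the secret.
readable = json.loads(base64.urlsafe_b64decode(token.split(".")[0]))
print(readable)
```

The secret only protects integrity: `verify` rejects a token signed with a different key, yet the payload itself is recoverable by plain base64 decoding.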