diff --git a/samples/oci-monitoring-metrics-to-datadog-python/.gitignore b/samples/oci-monitoring-metrics-to-datadog-python/.gitignore
new file mode 100644
index 0000000..49ed0d0
--- /dev/null
+++ b/samples/oci-monitoring-metrics-to-datadog-python/.gitignore
@@ -0,0 +1,5 @@
+.DS_Store
+.project
+.classpath
+.settings
+target
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/LICENSE.txt b/samples/oci-monitoring-metrics-to-datadog-python/LICENSE.txt
new file mode 100644
index 0000000..b1a715f
--- /dev/null
+++ b/samples/oci-monitoring-metrics-to-datadog-python/LICENSE.txt
@@ -0,0 +1,28 @@
+Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
+
+The Universal Permissive License (UPL), Version 1.0
+
+Subject to the condition set forth below, permission is hereby granted to any person obtaining a copy of this
+software, associated documentation and/or data (collectively the "Software"), free of charge and under any
+and all copyright rights in the Software, and any and all patent rights owned or freely licensable by each
+licensor hereunder covering either (i) the unmodified Software as contributed to or provided by such licensor,
+or (ii) the Larger Works (as defined below), to deal in both
+
+(a) the Software, and
+
+(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if one is included with the
+Software (each a “Larger Work” to which the Software is contributed by such licensors), without restriction,
+including without limitation the rights to copy, create derivative works of, display, perform, and
+distribute the Software and make, use, sell, offer for sale, import, export, have made, and have sold
+the Software and the Larger Work(s), and to sublicense the foregoing rights on either these or other terms.
+
+This license is subject to the following condition:
+
+The above copyright notice and either this complete permission notice or at a minimum a reference to the
+UPL must be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
+LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
+OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/README.md b/samples/oci-monitoring-metrics-to-datadog-python/README.md
new file mode 100644
index 0000000..f87d08f
--- /dev/null
+++ b/samples/oci-monitoring-metrics-to-datadog-python/README.md
@@ -0,0 +1,232 @@
+# Exporting OCI Monitoring Service Metrics to Datadog
+
+---
+
+## Overview
+
+Let's take a look at bringing Oracle Cloud Infrastructure (OCI)’s rich Metrics resources over to
+Datadog to accomplish common goals such as DevOps monitoring, application performance monitoring, and so on.
+Datadog’s API exposes some unique features: it allows you to characterize each metric using tags,
+which is essential for aggregating and correlating data for monitoring, reporting, dashboards, and so on.
+
+Please see the
+[companion blog](https://blogs.oracle.com/cloud-infrastructure/post/exporting-oci-monitoring-service-metrics-to-datadog) for more details.
+
+### Prerequisites
+
+If you’re new to Functions, get familiar by running through
+the [Quick Start guide on OCI Functions](http://docs.oracle.com/en-us/iaas/Content/Functions/Tasks/functionsquickstartguidestop.htm) before proceeding.
+
+---
+## Solution Architecture
+
+![Architecture](images/architecture.png)
+
+Here is the basic architecture and flow of data from beginning to end:
+
+* OCI services emit metric data, which is captured by the Monitoring service.
+* The Monitoring service feeds metric data events to a Service Connector.
+* The Service Connector invokes a Function, which transforms the metric data payload to Datadog format and posts the transformed payload to the Datadog REST API.
+* Datadog ingests the metrics, building its own aggregations using the provided tagging.
+
+Let's drill down into the OCI services involved.
+
+---
+## Monitoring Service
+
+The [Monitoring Service](https://docs.oracle.com/en-us/iaas/Content/Monitoring/Concepts/monitoringoverview.htm)
+receives timestamp-value pairs (aka metric data points) which also carry contextual
+dimensions and metadata about the services or applications that emitted them.
+
+---
+## Service Connector Hub
+
+The stream of metric data is event-driven and must be handled on demand and at scale. The
+[Service Connector Hub](https://docs.oracle.com/en-us/iaas/Content/service-connector-hub/overview.htm) does
+exactly that. See the [Service Connector Hub documentation](https://docs.oracle.com/en-us/iaas/Content/service-connector-hub/overview.htm) for details.
+
+---
+## Functions Service
+
+I need a way to transform the raw metric payloads and make the Datadog API calls. The
+[OCI Functions Service](http://docs.oracle.com/en-us/iaas/Content/Functions/Concepts/functionsoverview.htm) is a
+natural fit for the task. Functions integrate nicely with Service Connector Hub as a target and can scale up
+depending on the demand. That lets me focus on writing the needed logic without having to address how to
+deploy and scale it.
+
+---
+## Mapping From OCI to Datadog Formats
+
+A key requirement, of course, is mapping the OCI payload to the Datadog format. Let's compare the OCI and Datadog
+message payload formats, look at what the mapping needs to accomplish, and see what the resulting transformed message
+looks like.
+
+Example OCI Metrics Payload:
+
+    {
+        "namespace": "oci_vcn",
+        "resourceGroup": null,
+        "compartmentId": "ocid1.compartment.oc1...",
+        "name": "VnicFromNetworkBytes",
+        "dimensions": {
+            "resourceId": "ocid1.vnic.oc1.phx..."
+        },
+        "metadata": {
+            "displayName": "Bytes from Network",
+            "unit": "bytes"
+        },
+        "datapoints": [
+            {
+                "timestamp": 1652196912000,
+                "value": 5780.0,
+                "count": 1
+            }
+        ]
+    }
+
+Example Datadog Metrics Payload:
+
+    {
+        "series": [
+            {
+                "metric": "system.load.1",
+                "type": 0,
+                "points": [
+                    {
+                        "timestamp": 1636629071,
+                        "value": 1.1
+                    }
+                ],
+                "tags": [
+                    "test:ExampleSubmitmetricsreturnsPayloadacceptedresponse"
+                ]
+            }
+        ]
+    }
+
+Mapping Behavior:
+
+    {
+        "series": [
+            {
+                "metric": "{re-characterized OCI namespace and metric name values}",
+                "type": {mapped_type_enum},
+                "points": [
+                    {
+                        "timestamp": {datapoint.timestamp},
+                        "value": {datapoint.value}
+                    }
+                ],
+                "tags": [
+                    "{metrics tag key1:oci payload value}",
+                    "{metrics tag key2:oci payload_value}"
+                ]
+            }
+        ]
+    }
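+
+For reference, here is a condensed sketch of how the re-characterized metric name is produced. It mirrors the
+`get_metric_name()` and `camel_case_split()` logic in `func.py` (shown later in this sample); the helper name
+`to_datadog_metric_name` is only for illustration:
+
+    import re
+
+    def to_datadog_metric_name(namespace: str, metric_name: str) -> str:
+        # split the OCI namespace on '_' and the metric name on camel-case boundaries,
+        # then lower-case everything and join with '.'
+        elements = namespace.split('_')
+        elements += re.findall(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))', metric_name)
+        return '.'.join(element.lower() for element in elements)
+
+    # 'oci_vcn' + 'VnicFromNetworkBytes' -> 'oci.vcn.vnic.from.network.bytes'
+    print(to_datadog_metric_name('oci_vcn', 'VnicFromNetworkBytes'))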
+ }, + "metadata": { + "displayName": "Bytes from Network", + "unit": "bytes" + }, + "datapoints": [ + { + "timestamp": 1652196912000, + "value": 5780.0, + "count": 1 + } + ] + } + +Example DataDog Metrics Payload: + + { + "series": [ + { + "metric": "system.load.1", + "type": 0, + "points": [ + { + "timestamp": 1636629071, + 'value": 1.1 + } + ], + "tags": [ + "test:ExampleSubmitmetricsreturnsPayloadacceptedresponse" + ] + } + ] + } + +Mapping Behavior: + + { + "series": [ + { + "metric": "{re-characterized OCI namespace and metric name values}", + "type": {mapped_type_enum}, + "points": [ + { + "timestamp": {datapoint.timestamp}, + "value": {datapoint.value} + } + ], + "tags": [ + "{metrics tag key1:oci payload value}", + "{metrics tag key2:oci payload_value}" + ] + } + ] + } + +Resulting Output: + + { + "series": [ + { + "metric": "oci.vcn.vnic.from.network.bytes", + "type": 0, + "points": [ + { + "timestamp": 1652196912, + "value": 5780.0 + } + ], + "tags": [ + "name:VnicFromNetworkBytes", + "unit:bytes", + "namespace:oci_vcn", + "displayName:Bytes from Network" + ] + } + ] + } + +--- +## Policy Setup + +You will need +this [IAM policy](https://docs.oracle.com/en-us/iaas/Content/Functions/Tasks/functionscreatingpolicies.htm#Create_Policies_to_Control_Access_to_Network_and_FunctionRelated_Resources) +to authorize the Service Connector to invoke your Function. + + allow any-user to use fn-function in compartment id ocid1.compartment.oc1... where all {request.principal.type=’serviceconnector’, request.principal.compartment.id=’ocid1.compartment.oc1...’} + +--- +## Service Connector Setup + +Now let’s set up a simple service connector instance that takes Monitoring sources and passes them to our Function. + +Because your Function requires a VCN, you can use that VCN as the metric source to test against. Let's test +with the `oci_vcn` Monitoring namespace because it will quickly generate a lot of useful events. + +Select Monitoring as the source and the Function as the target. Configure your source as the +compartment where the VCN resides and select the Monitoring namespace (`oci_vcn`) that you want to +pick up. Select your Application and the Function within it as the target. + +
+
+---
+## Policy Setup
+
+You will need
+this [IAM policy](https://docs.oracle.com/en-us/iaas/Content/Functions/Tasks/functionscreatingpolicies.htm#Create_Policies_to_Control_Access_to_Network_and_FunctionRelated_Resources)
+to authorize the Service Connector to invoke your Function.
+
+    allow any-user to use fn-function in compartment id ocid1.compartment.oc1... where all {request.principal.type='serviceconnector', request.principal.compartment.id='ocid1.compartment.oc1...'}
+
+---
+## Service Connector Setup
+
+Now let’s set up a simple Service Connector instance that takes Monitoring sources and passes them to our Function.
+
+Because your Function requires a VCN, you can use that VCN as the metric source to test against. Let's test
+with the `oci_vcn` Monitoring namespace because it will quickly generate a lot of useful events.
+
+Select Monitoring as the source and the Function as the target. Configure your source as the
+compartment where the VCN resides and select the Monitoring namespace (`oci_vcn`) that you want to
+pick up. Select your Application and the Function within it as the target.
+
+![Service Connector setup](images/sch-setup.png)
+
+---
+## View Metrics In Datadog
+
+When you have the Service Connector configured, metrics appear in Datadog's Metrics Explorer and notebooks
+after a few minutes. The following images show the Metrics Explorer and Notebook user interfaces in
+Datadog with the VCN metrics displayed.
+
+![Datadog Metrics Explorer](images/datadog1.png)
+
+![Datadog Notebook](images/datadog2.png)
+
+---
+## Function Environment
+
+Here are the supported Function parameters:
+
+| Environment Variable | Default | Purpose |
+| ------------- |:-------------:| :----- |
+| DATADOG_METRICS_API_ENDPOINT | not-configured | REST API endpoint for reaching Datadog ([see docs](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)) |
+| DATADOG_API_KEY | not-configured | API key obtained from Datadog |
+| METRICS_TAG_KEYS | name, namespace, displayName, resourceDisplayName, unit | OCI metric dimensions and metadata to convert to Datadog metric tags |
+| LOGGING_LEVEL | INFO | Controls function logging outputs. Choices: INFO, WARN, CRITICAL, ERROR, DEBUG |
+| ENABLE_TRACING | False | Enables complete exception stack trace logging |
+| FORWARD_TO_DATADOG | True | Determines whether messages are forwarded to Datadog |
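+
+You can also exercise the mapping logic locally. `func.py` includes a `local_test_mode()` helper that reads one
+OCI metric JSON record per line from a file. A minimal smoke test might look like the following sketch; it assumes
+the packages in `requirements.txt` are installed, that the script sits next to `func.py`, and it disables
+forwarding so nothing is actually sent to Datadog:
+
+    import json
+    import os
+
+    os.environ['FORWARD_TO_DATADOG'] = 'False'   # read at import time by func.py
+    os.environ['LOGGING_LEVEL'] = 'DEBUG'        # log the transformed payloads
+
+    import func
+
+    record = {
+        "namespace": "oci_vcn",
+        "name": "VnicFromNetworkBytes",
+        "dimensions": {"resourceId": "ocid1.vnic.oc1.phx..."},
+        "metadata": {"displayName": "Bytes from Network", "unit": "bytes"},
+        "datapoints": [{"timestamp": 1652196912000, "value": 5780.0, "count": 1}]
+    }
+
+    with open('oci-metrics-test-file.json', 'w') as f:
+        f.write(json.dumps(record) + '\n')       # one JSON record per line
+
+    func.local_test_mode('oci-metrics-test-file.json')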
+
+---
+## Conclusion
+
+You now have a low-maintenance, serverless function that can send raw metrics over to Datadog in
+near real time. I encourage you to experiment with the dimensions and metadata tag mappings
+to see which combination works best for your use case.
+
+For more information, see the following resources:
+
+- [Datadog Metrics API Reference](https://docs.datadoghq.com/api/latest/metrics/)
+- [Datadog Metrics API / Submit Metrics API contract](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics)
+
+---
+## **OCI** Related Workshops
+
+LiveLabs is the place to explore Oracle's products and services using workshops designed to
+enhance your experience building and deploying applications on the Cloud and On-Premises.
+Our library of workshops covers everything from how to provision the world's first autonomous
+database to setting up a web server on our world-class OCI Generation 2 infrastructure,
+machine learning, and much more. Use your existing Oracle Cloud account,
+a [Free Tier](https://www.oracle.com/cloud/free/) account, or a LiveLabs Cloud Account to build, test,
+and deploy applications on Oracle's Cloud.
+
+Visit [LiveLabs](http://bit.ly/golivelabs) now to get started. Workshops are added weekly; please visit frequently for new content.
+
+---
+## License
+Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
+Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/func.py b/samples/oci-monitoring-metrics-to-datadog-python/func.py
new file mode 100644
index 0000000..b0055a6
--- /dev/null
+++ b/samples/oci-monitoring-metrics-to-datadog-python/func.py
@@ -0,0 +1,319 @@
+#
+# oci-monitoring-metrics-to-datadog version 1.0.
+#
+# Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved.
+# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.
+
+import io
+import json
+import logging
+import os
+import re
+import requests
+from fdk import response
+from datetime import datetime
+
+"""
+This sample OCI Function maps OCI Monitoring Service Metrics to the DataDog
+REST API 'submit-metrics' contract found here:
+
+https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
+
+"""
+
+# Use OCI Application or Function configurations to override these environment variable defaults.
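+# Note: send_to_datadog() below requires DATADOG_METRICS_API_ENDPOINT to reference the v2
+# 'submit-metrics' endpoint and DATADOG_API_KEY to carry a valid Datadog API key.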
+
+api_endpoint = os.getenv('DATADOG_METRICS_API_ENDPOINT', 'not-configured')
+api_key = os.getenv('DATADOG_API_KEY', 'not-configured')
+is_forwarding = eval(os.getenv('FORWARD_TO_DATADOG', "True"))
+metric_tag_keys = os.getenv('METRICS_TAG_KEYS', 'name, namespace, displayName, resourceDisplayName, unit')
+metric_tag_set = set()
+
+# Set all registered loggers to the configured log_level
+
+logging_level = os.getenv('LOGGING_LEVEL', 'INFO')
+loggers = [logging.getLogger()] + [logging.getLogger(name) for name in logging.root.manager.loggerDict]
+[logger.setLevel(logging.getLevelName(logging_level)) for logger in loggers]
+
+# Exception stack trace logging
+
+is_tracing = eval(os.getenv('ENABLE_TRACING', "False"))
+
+# Constants
+
+TEN_MINUTES_SEC = 10 * 60
+ONE_HOUR_SEC = 60 * 60
+
+# Functions
+
+def handler(ctx, data: io.BytesIO = None):
+    """
+    OCI Function Entry Point
+    :param ctx: InvokeContext
+    :param data: data payload
+    :return: None
+    """
+
+    preamble = " {} / event count = {} / logging level = {} / forwarding to DataDog = {}"
+
+    try:
+        metrics_list = json.loads(data.getvalue())
+        logging.getLogger().info(preamble.format(ctx.FnName(), len(metrics_list), logging_level, is_forwarding))
+        logging.getLogger().debug(metrics_list)
+        converted_event_list = handle_metric_events(event_list=metrics_list)
+        send_to_datadog(event_list=converted_event_list)
+
+    except Exception as ex:
+        logging.getLogger().error('error handling metrics payload: {}'.format(str(ex)))
+        if is_tracing:
+            logging.getLogger().error(ex)
+
+
+def handle_metric_events(event_list):
+    """
+    :param event_list: the list of metric formatted log records.
+    :return: the list of DataDog formatted log records
+    """
+
+    result_list = []
+    for event in event_list:
+        single_result = transform_metric_to_datadog_format(log_record=event)
+        result_list.append(single_result)
+        logging.getLogger().debug(single_result)
+
+    return result_list
+
+
+def transform_metric_to_datadog_format(log_record: dict):
+    """
+    Transform metrics to DataDog format.
+    See: https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
+    :param log_record: metric log record
+    :return: DataDog formatted log record
+    """
+
+    series = [{
+        'metric': get_metric_name(log_record),
+        'type' : get_metric_type(log_record),
+        'points' : get_metric_points(log_record),
+        'tags' : get_metric_tags(log_record),
+    }]
+
+    result = {
+        'series' : series
+    }
+    return result
+
+
+def get_metric_name(log_record: dict):
+    """
+    Assembles a metric name that appears to follow DataDog conventions.
+    :param log_record: metric log record
+    :return: the dotted, lower-case DataDog metric name
+    """
+
+    elements = get_dictionary_value(log_record, 'namespace').split('_')
+    elements += camel_case_split(get_dictionary_value(log_record, 'name'))
+    elements = [element.lower() for element in elements]
+    return '.'.join(elements)
+
+
+def camel_case_split(string):
+    """
+    :param string: the camel-cased string to split
+    :return: the list of individual strings that make up the camel-cased input
+    """
+
+    return re.findall(r'[A-Z](?:[a-z]+|[A-Z]*(?=[A-Z]|$))', string)
+
+
+def get_metric_type(log_record: dict):
+    """
+    :param log_record: metric log record
+    :return: The type of metric. The available types are 0 (unspecified), 1 (count), 2 (rate), and 3 (gauge).
+    Allowed enum values: 0,1,2,3
+    """
+
+    return 0
+
+
+def get_now_timestamp():
+    return datetime.now().timestamp()
+
+
+def adjust_metric_timestamp(timestamp_ms):
+    """
+    DataDog Timestamps should be in POSIX time in seconds, and cannot be more than ten
+    minutes in the future or more than one hour in the past. OCI Timestamps are POSIX
+    in milliseconds, therefore a conversion is required.
+
+    See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
+    :param timestamp_ms: OCI metric timestamp in milliseconds
+    :return: the timestamp in seconds
+    """
+
+    # positive skew is expected
+    timestamp_sec = int(timestamp_ms / 1000)
+    delta_sec = get_now_timestamp() - timestamp_sec
+
+    if (delta_sec > 0 and delta_sec > ONE_HOUR_SEC):
+        logging.getLogger().warning('timestamp {} too far in the past per DataDog'.format(timestamp_ms))
+
+    if (delta_sec < 0 and abs(delta_sec) > TEN_MINUTES_SEC):
+        logging.getLogger().warning('timestamp {} too far in the future per DataDog'.format(timestamp_ms))
+
+    return timestamp_sec
+
+
+def get_metric_points(log_record: dict):
+    """
+    :param log_record: metric log record
+    :return: a list of timestamp / value dictionaries, one per datapoint
+    """
+
+    result = []
+
+    datapoints = get_dictionary_value(dictionary=log_record, target_key='datapoints')
+    for point in datapoints:
+        dd_point = {'timestamp': adjust_metric_timestamp(point.get('timestamp')),
+                    'value': point.get('value')}
+
+        result.append(dd_point)
+
+    return result
+
+
+def get_metric_tags(log_record: dict):
+    """
+    Assembles tags from selected metric attributes.
+    See https://docs.datadoghq.com/getting_started/tagging/
+    :param log_record: the log record to scan
+    :return: list of key:value pairs matching DataDog tag format
+    """
+
+    result = []
+
+    for tag in get_metric_tag_set():
+        value = get_dictionary_value(dictionary=log_record, target_key=tag)
+        if value is None:
+            continue
+
+        if isinstance(value, str) and ':' in value:
+            logging.getLogger().warning('tag contains a \':\' / ignoring {} ({})'.format(tag, value))
+            continue
+
+        tag = '{}:{}'.format(tag, value)
+        result.append(tag)
+
+    return result
+
+
+def get_metric_tag_set():
+    """
+    :return: the set of metric payload keys that we would like to have converted to tags.
+    """
+
+    global metric_tag_set
+
+    if len(metric_tag_set) == 0 and metric_tag_keys:
+        split_and_stripped_tags = [x.strip() for x in metric_tag_keys.split(',')]
+        metric_tag_set.update(split_and_stripped_tags)
+        logging.getLogger().debug("tag key set / {} ".format(metric_tag_set))
+
+    return metric_tag_set
+
+
+def send_to_datadog(event_list):
+    """
+    Sends each transformed event to DataDog Endpoint.
+    :param event_list: list of events in DataDog format
+    :return: None
+    """
+
+    if is_forwarding is False:
+        logging.getLogger().debug("DataDog forwarding is disabled - nothing sent")
+        return
+
+    if 'v2' not in api_endpoint:
+        raise RuntimeError('Requires API endpoint version "v2": "{}"'.format(api_endpoint))
+
+    # creating a session and adapter to avoid recreating
+    # a new connection pool between each POST call
+
+    try:
+        session = requests.Session()
+        adapter = requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=10)
+        session.mount('https://', adapter)
+
+        for event in event_list:
+            api_headers = {'Content-type': 'application/json', 'DD-API-KEY': api_key}
+            logging.getLogger().debug("json to datadog: {}".format(json.dumps(event)))
+            response = session.post(api_endpoint, data=json.dumps(event), headers=api_headers)
+
+            if response.status_code != 202:
+                raise Exception('error {} sending to DataDog: {}'.format(response.status_code, response.reason))
+
+    finally:
+        session.close()
+
+
+def get_dictionary_value(dictionary: dict, target_key: str):
+    """
+    Recursive method to find a value within a dictionary which may also have nested lists / dictionaries.
+    :param dictionary: the dictionary to scan
+    :param target_key: the key we are looking for
+    :return: the first value found for target_key, or None if the key is not present.
+    """
+
+    if dictionary is None:
+        raise Exception('dictionary None for key: {}'.format(target_key))
+
+    target_value = dictionary.get(target_key)
+    if target_value:
+        return target_value
+
+    for key, value in dictionary.items():
+        if isinstance(value, dict):
+            target_value = get_dictionary_value(dictionary=value, target_key=target_key)
+            if target_value:
+                return target_value
+
+        elif isinstance(value, list):
+            for entry in value:
+                if isinstance(entry, dict):
+                    target_value = get_dictionary_value(dictionary=entry, target_key=target_key)
+                    if target_value:
+                        return target_value
+
+
+def local_test_mode(filename):
+    """
+    This routine reads a local json metrics file, converting the contents to DataDog format.
+    :param filename: file containing one OCI metric JSON record per line.
+    :return: None
+    """
+
+    logging.getLogger().info("local testing started")
+
+    with open(filename, 'r') as f:
+        transformed_results = list()
+
+        for line in f:
+            event = json.loads(line)
+            logging.getLogger().debug(json.dumps(event, indent=4))
+            transformed_result = transform_metric_to_datadog_format(event)
+            transformed_results.append(transformed_result)
+
+        logging.getLogger().debug(json.dumps(transformed_results, indent=4))
+        send_to_datadog(event_list=transformed_results)
+
+    logging.getLogger().info("local testing completed")
+
+
+"""
+Local Debugging
+"""
+
+if __name__ == "__main__":
+    local_test_mode('oci-metrics-test-file.json')
+
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/func.yaml b/samples/oci-monitoring-metrics-to-datadog-python/func.yaml
new file mode 100644
index 0000000..bfccf60
--- /dev/null
+++ b/samples/oci-monitoring-metrics-to-datadog-python/func.yaml
@@ -0,0 +1,8 @@
+schema_version: 20180708
+name: oci-monitoring-metrics-to-datadog-python
+version: 0.0.1
+runtime: python
+build_image: fnproject/python:3.9-dev
+run_image: fnproject/python:3.9
+entrypoint: /python/bin/fdk /function/func.py handler
+memory: 256
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/images/architecture.png b/samples/oci-monitoring-metrics-to-datadog-python/images/architecture.png
new file mode 100644
index 0000000..9ab97a1
Binary files /dev/null and b/samples/oci-monitoring-metrics-to-datadog-python/images/architecture.png differ
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/images/datadog1.png b/samples/oci-monitoring-metrics-to-datadog-python/images/datadog1.png
new file mode 100644
index 0000000..b78ec9f
Binary files /dev/null and b/samples/oci-monitoring-metrics-to-datadog-python/images/datadog1.png differ
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/images/datadog2.png b/samples/oci-monitoring-metrics-to-datadog-python/images/datadog2.png
new file mode 100644
index 0000000..b28b836
Binary files /dev/null and b/samples/oci-monitoring-metrics-to-datadog-python/images/datadog2.png differ
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/images/sch-setup.png b/samples/oci-monitoring-metrics-to-datadog-python/images/sch-setup.png
new file mode 100644
index 0000000..d102969
Binary files /dev/null and b/samples/oci-monitoring-metrics-to-datadog-python/images/sch-setup.png differ
diff --git a/samples/oci-monitoring-metrics-to-datadog-python/requirements.txt b/samples/oci-monitoring-metrics-to-datadog-python/requirements.txt
new file mode 100644
index 0000000..bd7dccf
--- /dev/null
+++ b/samples/oci-monitoring-metrics-to-datadog-python/requirements.txt
@@ -0,0 +1,3 @@
+oci
+requests
+fdk