Add Grafana dashboard and installation steps (#1620)
Problem: As a user, I want to know how to easily install Prometheus and Grafana to visualize my NGF metrics.
Solution: Add basic installation steps for both Prometheus and Grafana, and provide a sample dashboard (based on the nginx-prometheus-exporter dashboard).
site/content/how-to/monitoring/prometheus.md: 84 additions & 34 deletions
@@ -1,6 +1,6 @@
 ---
 title: "Prometheus Metrics"
-description: "Learn how to monitor your NGINX Gateway Fabric effectively. This guide provides easy steps for configuring and understanding key performance metrics using Prometheus."
+description: "This document describes how to monitor NGINX Gateway Fabric using Prometheus and Grafana. It explains installation and configuration, as well as what metrics are available."
 weight: 100
 toc: true
 docs: "DOCS-1418"
@@ -11,16 +11,96 @@ docs: "DOCS-1418"
 ## Overview


-NGINX Gateway Fabric metrics are displayed in [Prometheus](https://prometheus.io/) format, simplifying monitoring. You can track NGINX and controller-runtime metrics through a metrics server orchestrated by the controller-runtime package. These metrics are enabled by default and can be accessed on HTTP port `9113`.
-
+NGINX Gateway Fabric metrics are displayed in [Prometheus](https://prometheus.io/) format. These metrics are served through a metrics server orchestrated by the controller-runtime package on HTTP port `9113`. When installed, Prometheus automatically scrapes this port and collects metrics. [Grafana](https://grafana.com/) can be used for rich visualization of these metrics.

 {{<call-out "important" "Security note for metrics">}}
 Metrics are served over HTTP by default. Enabling HTTPS will secure the metrics endpoint with a self-signed certificate. When using HTTPS, adjust the Prometheus Pod scrape settings by adding the `insecure_skip_verify` flag to handle the self-signed certificate. For further details, refer to the [Prometheus documentation](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config).
 {{</call-out>}}
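As a rough illustration of the security note above, a Prometheus scrape job for an HTTPS metrics endpoint with a self-signed certificate might look like the fragment below. The job name and the Pod-based service discovery are assumptions made for this sketch, not values taken from NGINX Gateway Fabric; `scheme` and `tls_config.insecure_skip_verify` are the settings the note refers to.

```yaml
# Sketch of a Prometheus scrape job for a self-signed HTTPS endpoint.
scrape_configs:
  - job_name: ngf-pods-https        # hypothetical job name
    scheme: https                   # scrape over TLS instead of plain HTTP
    tls_config:
      insecure_skip_verify: true    # accept the self-signed certificate
    kubernetes_sd_configs:
      - role: pod                   # assumption: targets are discovered as Pods
```

Relabeling rules that narrow the job to the annotated Pods are omitted here. Because this setting disables certificate verification, it is best limited to the jobs that actually need it.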

+## Installing Prometheus and Grafana
+
+{{< note >}}These installations are for demonstration purposes and have not been tuned for a production environment.{{< /note >}}
[...]
+In the Grafana UI menu, go to `Connections` then `Data sources`. Add your Prometheus service (`http://prometheus-server.monitoring.svc`) as a data source.
+
+Download the following sample dashboard and import it as a new dashboard in the Grafana UI.
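If you would rather not add the data source by hand, Grafana can also load data sources from a provisioning file. The following is a minimal sketch that assumes Grafana's file-based provisioning is in use and reuses the Prometheus service URL from the step above; the data source name is a placeholder.

```yaml
# Hypothetical Grafana data source provisioning file.
apiVersion: 1
datasources:
  - name: Prometheus              # placeholder display name
    type: prometheus
    access: proxy                 # Grafana's backend proxies the queries
    url: http://prometheus-server.monitoring.svc
    isDefault: true
```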
[...]
+NGINX Gateway Fabric provides a variety of metrics for monitoring and analyzing performance. These metrics are categorized as follows:
+
+### NGINX/NGINX Plus metrics
+
+NGINX metrics cover specific NGINX operations such as the total number of accepted client connections. For a complete list of available NGINX/NGINX Plus metrics, refer to the [NGINX Prometheus Exporter developer docs](https://github.com/nginxinc/nginx-prometheus-exporter#exported-metrics).
+
+These metrics use the `nginx_gateway_fabric` namespace and include the `class` label, indicating the NGINX Gateway class. For example, `nginx_gateway_fabric_connections_accepted{class="nginx"}`.
[...]
+- `nginx_stale_config`: Indicates if NGINX Gateway Fabric couldn't update NGINX with the latest configuration, resulting in a stale version.
+- `nginx_last_reload_milliseconds`: Time in milliseconds for NGINX reloads.
+- `event_batch_processing_milliseconds`: Time in milliseconds to process batches of Kubernetes events.
+
+All these metrics are under the `nginx_gateway_fabric` namespace and include a `class` label set to the Gateway class of NGINX Gateway Fabric. For example, `nginx_gateway_fabric_nginx_reloads_total{class="nginx"}`.
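To show how the namespace prefix and the `class` label combine in a query, here is a sketch of a Prometheus alerting rule built on `nginx_stale_config`. It assumes the metric is a gauge that reports `1` while the configuration is stale, which the description above suggests but does not state. The group name, alert name, duration, and severity are illustrative.

```yaml
# Sketch of a Prometheus rule file; all names and thresholds are hypothetical.
groups:
  - name: nginx-gateway-fabric
    rules:
      - alert: NGFStaleNginxConfig
        # Full metric name = namespace prefix + metric name, filtered by Gateway class.
        # Assumption: a value of 1 means the running NGINX configuration is stale.
        expr: nginx_gateway_fabric_nginx_stale_config{class="nginx"} == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "NGINX Gateway Fabric could not apply the latest configuration"
```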
+
+### Controller-runtime metrics
+
+Provided by the [controller-runtime](https://github.com/kubernetes-sigs/controller-runtime) library, these metrics include:
+
+- General resource usage like CPU and memory.
+- Go runtime metrics such as the number of Go routines, garbage collection duration, and Go version.
+- Controller-specific metrics, including reconciliation errors per controller, length of the reconcile queue, and reconciliation latency.
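For the controller-specific metrics, a similar sketch can watch reconciliation errors. It assumes the standard controller-runtime counter `controller_runtime_reconcile_errors_total`, with its `controller` label, is exposed on the same endpoint. The rule names, window, and duration are illustrative.

```yaml
# Sketch of a Prometheus rule file; all names and thresholds are hypothetical.
groups:
  - name: ngf-controller-runtime
    rules:
      - alert: NGFReconcileErrors
        # Per-controller error rate over the last 5 minutes; fires if it stays above zero.
        expr: sum by (controller) (rate(controller_runtime_reconcile_errors_total[5m])) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "A controller is reporting sustained reconciliation errors"
```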
+
 ## How to change the default metrics configuration

-Configuring NGINX Gateway Fabric for monitoring is straightforward. You can change metric settings using Helm or Kubernetes manifests, depending on your setup.
+You can configure monitoring metrics for NGINX Gateway Fabric using Helm or Manifests.

 ### Using Helm

@@ -85,33 +165,3 @@ For enhanced security with HTTPS:
 prometheus.io/scheme: "https"
 <...>
```
-
-## Available metrics in NGINX Gateway Fabric
-
-NGINX Gateway Fabric provides a variety of metrics to assist in monitoring and analyzing performance. These metrics are categorized as follows:
-
-### NGINX/NGINX Plus metrics
-
-NGINX metrics, essential for monitoring specific NGINX operations, include details like the total number of accepted client connections. For a complete list of available NGINX/NGINX Plus metrics, refer to the [NGINX Prometheus Exporter developer docs](https://github.com/nginxinc/nginx-prometheus-exporter#exported-metrics).
-
-These metrics use the `nginx_gateway_fabric` namespace and include the `class` label, indicating the NGINX Gateway class. For example, `nginx_gateway_fabric_connections_accepted{class="nginx"}`.
-
-### NGINX Gateway Fabric metrics
-
-Metrics specific to the NGINX Gateway Fabric include:
[...]
-- `nginx_stale_config`: Indicates if NGINX Gateway Fabric couldn't update NGINX with the latest configuration, resulting in a stale version.
-- `nginx_last_reload_milliseconds`: Time in milliseconds for NGINX reloads.
-- `event_batch_processing_milliseconds`: Time in milliseconds to process batches of Kubernetes events.
-
-All these metrics are under the `nginx_gateway_fabric` namespace and include a `class` label set to the Gateway class of NGINX Gateway Fabric. For example, `nginx_gateway_fabric_nginx_reloads_total{class="nginx"}`.
-
-### Controller-runtime metrics
-
-Provided by the [controller-runtime](https://github.com/kubernetes-sigs/controller-runtime) library, these metrics cover a range of aspects:
-
-- General resource usage like CPU and memory.
-- Go runtime metrics such as the number of Go routines, garbage collection duration, and Go version.
-- Controller-specific metrics, including reconciliation errors per controller, length of the reconcile queue, and reconciliation latency.