diff --git a/tests/results/longevity/1.3.0/oss-cpu.png b/tests/results/longevity/1.3.0/oss-cpu.png
new file mode 100644
index 0000000000..eabef67af2
Binary files /dev/null and b/tests/results/longevity/1.3.0/oss-cpu.png differ
diff --git a/tests/results/longevity/1.3.0/oss-final-reloads.png b/tests/results/longevity/1.3.0/oss-final-reloads.png
new file mode 100644
index 0000000000..b14ac0a200
Binary files /dev/null and b/tests/results/longevity/1.3.0/oss-final-reloads.png differ
diff --git a/tests/results/longevity/1.3.0/oss-memory.png b/tests/results/longevity/1.3.0/oss-memory.png
new file mode 100644
index 0000000000..8f9722d439
Binary files /dev/null and b/tests/results/longevity/1.3.0/oss-memory.png differ
diff --git a/tests/results/longevity/1.3.0/oss-ngf-memory.png b/tests/results/longevity/1.3.0/oss-ngf-memory.png
new file mode 100644
index 0000000000..cac9ee745a
Binary files /dev/null and b/tests/results/longevity/1.3.0/oss-ngf-memory.png differ
diff --git a/tests/results/longevity/1.3.0/oss-reload-time.png b/tests/results/longevity/1.3.0/oss-reload-time.png
new file mode 100644
index 0000000000..234aa9527e
Binary files /dev/null and b/tests/results/longevity/1.3.0/oss-reload-time.png differ
diff --git a/tests/results/longevity/1.3.0/oss-reloads.png b/tests/results/longevity/1.3.0/oss-reloads.png
new file mode 100644
index 0000000000..bf42b49ddc
Binary files /dev/null and b/tests/results/longevity/1.3.0/oss-reloads.png differ
diff --git a/tests/results/longevity/1.3.0/oss-stub-status.png b/tests/results/longevity/1.3.0/oss-stub-status.png
new file mode 100644
index 0000000000..b945fc0881
Binary files /dev/null and b/tests/results/longevity/1.3.0/oss-stub-status.png differ
diff --git a/tests/results/longevity/1.3.0/oss.md b/tests/results/longevity/1.3.0/oss.md
new file mode 100644
index 0000000000..269db29331
--- /dev/null
+++ b/tests/results/longevity/1.3.0/oss.md
@@ -0,0 +1,106 @@
+# Results
+
+## Test environment
+
+NGINX Plus: false
+
+GKE Cluster:
+
+- Node count: 3
+- k8s version: v1.28.9-gke.1000000
+- vCPUs per node: 2
+- RAM per node: 4019180Ki
+- Max pods per node: 110
+- Zone: us-central1-c
+- Instance Type: e2-medium
+- NGF pod name -- ngf-longevity-nginx-gateway-fabric-59576c5749-dgrwg
+
+## Traffic
+
+HTTP:
+
+```text
+Running 5760m test @ http://cafe.example.com/coffee
+  2 threads and 100 connections
+  Thread Stats   Avg      Stdev     Max   +/- Stdev
+    Latency   186.26ms  145.81ms   2.00s    78.32%
+    Req/Sec   299.23    199.98     1.80k    66.54%
+  202451765 requests in 5760.00m, 70.37GB read
+  Socket errors: connect 0, read 338005, write 0, timeout 4600
+Requests/sec:    585.80
+Transfer/sec:    213.51KB
+```
+
+HTTPS:
+
+```text
+Running 5760m test @ https://cafe.example.com/tea
+  2 threads and 100 connections
+  Thread Stats   Avg      Stdev     Max   +/- Stdev
+    Latency   177.05ms  122.73ms   1.98s    67.91%
+    Req/Sec   298.12    199.66     1.83k    66.52%
+  201665338 requests in 5760.00m, 69.02GB read
+  Socket errors: connect 0, read 332742, write 0, timeout 40
+Requests/sec:    583.52
+Transfer/sec:    209.42KB
+```
+
+### Logs
+
+No error logs in nginx-gateway
+
+No error logs in nginx
+
+### Key Metrics
+
+#### Containers memory
+
+![oss-memory.png](oss-memory.png)
+
+Drop in NGINX memory usage corresponds to the end of traffic generation.
+
+#### NGF Container Memory
+
+![oss-ngf-memory.png](oss-ngf-memory.png)
+
+### Containers CPU
+
+![oss-cpu.png](oss-cpu.png)
+
+Drop in NGINX CPU usage corresponds to the end of traffic generation.
+
+### NGINX metrics
+
+![oss-stub-status.png](oss-stub-status.png)
+
+Drop in requests corresponds to the end of traffic generation.
+
+
+### Reloads
+
+Rate of reloads - successful and errors:
+
+![oss-reloads.png](oss-reloads.png)
+
+
+Reload spikes correspond to the 1-hour periods of backend re-rollouts.
+
+No reloads finished with an error.
+
+Reload time distribution - counts:
+
+![oss-reload-time.png](oss-reload-time.png)
+
+
+Reload-related metrics at the end:
+
+![oss-final-reloads.png](oss-final-reloads.png)
+
+All successful reloads took less than 5 seconds, with most under 1 second.
+
+## Comparison with previous runs
+
+Graphs look similar to 1.2.0 results.
+As https://github.com/nginxinc/nginx-gateway-fabric/issues/1112 was fixed, we no longer see the corresponding
+reload spikes.
+Memory usage is flat, but ~1 MB higher than in 1.2.0.
diff --git a/tests/results/longevity/1.3.0/plus-cpu.png b/tests/results/longevity/1.3.0/plus-cpu.png
new file mode 100644
index 0000000000..e8b8d0cea6
Binary files /dev/null and b/tests/results/longevity/1.3.0/plus-cpu.png differ
diff --git a/tests/results/longevity/1.3.0/plus-final-reloads.png b/tests/results/longevity/1.3.0/plus-final-reloads.png
new file mode 100644
index 0000000000..c71c0443c5
Binary files /dev/null and b/tests/results/longevity/1.3.0/plus-final-reloads.png differ
diff --git a/tests/results/longevity/1.3.0/plus-memory.png b/tests/results/longevity/1.3.0/plus-memory.png
new file mode 100644
index 0000000000..dd2a42eaea
Binary files /dev/null and b/tests/results/longevity/1.3.0/plus-memory.png differ
diff --git a/tests/results/longevity/1.3.0/plus-ngf-memory.png b/tests/results/longevity/1.3.0/plus-ngf-memory.png
new file mode 100644
index 0000000000..b16bb39353
Binary files /dev/null and b/tests/results/longevity/1.3.0/plus-ngf-memory.png differ
diff --git a/tests/results/longevity/1.3.0/plus-reload-time.png b/tests/results/longevity/1.3.0/plus-reload-time.png
new file mode 100644
index 0000000000..426d129ac2
Binary files /dev/null and b/tests/results/longevity/1.3.0/plus-reload-time.png differ
diff --git a/tests/results/longevity/1.3.0/plus-reloads.png b/tests/results/longevity/1.3.0/plus-reloads.png
new file mode 100644
index 0000000000..19c1ab3524
Binary files /dev/null and b/tests/results/longevity/1.3.0/plus-reloads.png differ
diff --git a/tests/results/longevity/1.3.0/plus-status.png b/tests/results/longevity/1.3.0/plus-status.png
new file mode 100644
index 0000000000..9221b798d4
Binary files /dev/null and b/tests/results/longevity/1.3.0/plus-status.png differ
diff --git a/tests/results/longevity/1.3.0/plus.md b/tests/results/longevity/1.3.0/plus.md
new file mode 100644
index 0000000000..ee67310cd6
--- /dev/null
+++ b/tests/results/longevity/1.3.0/plus.md
@@ -0,0 +1,133 @@
+# Results
+
+## Test environment
+
+NGINX Plus: true
+
+GKE Cluster:
+
+- Node count: 3
+- k8s version: v1.28.9-gke.1000000
+- vCPUs per node: 2
+- RAM per node: 4019172Ki
+- Max pods per node: 110
+- Zone: us-central1-c
+- Instance Type: e2-medium
+- NGF pod name -- ngf-longevity-nginx-gateway-fabric-c4db4c565-8rhdn
+
+## Traffic
+
+HTTP:
+
+```text
+Running 5760m test @ http://cafe.example.com/coffee
+  2 threads and 100 connections
+  Thread Stats   Avg      Stdev     Max   +/- Stdev
+    Latency   192.36ms  124.47ms   1.99s    66.49%
+    Req/Sec   279.51    187.21     1.88k    65.53%
+  188805938 requests in 5760.00m, 65.81GB read
+  Socket errors: connect 0, read 22, write 4, timeout 42
+  Non-2xx or 3xx responses: 3
+Requests/sec:    546.31
+Transfer/sec:    199.68KB
+```
+
+HTTPS:
+
+```text
+Running 5760m test @ https://cafe.example.com/tea
+  2 threads and 100 connections
+  Thread Stats   Avg      Stdev     Max   +/- Stdev
+    Latency   192.58ms  124.49ms   2.00s    66.48%
+    Req/Sec   278.96    186.56     1.76k    65.59%
+  188455380 requests in 5760.00m, 64.69GB read
+  Socket errors: connect 10, read 24, write 0, timeout 60
+  Non-2xx or 3xx responses: 5
+Requests/sec:    545.30
+Transfer/sec:    196.26KB
+```
+
+Note: the non-2xx or 3xx responses correspond to the errors in the NGINX log; see below.
+
+### Logs
+
+nginx-gateway:
+
+Many expected "usage reporting not enabled" errors.
+
+And:
+```text
+failed to start control loop: leader election lost
+
+4 x
+failed to start control loop: error setting initial control plane configuration: NginxGateway nginx-gateway/ngf-longevity-config not found: failed to get API group resources: unable to retrieve the complete list of server APIs: gateway.nginx.org/v1alpha1: Get "https://10.61.192.1:443/apis/gateway.nginx.org/v1alpha1?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
+```
+
+These errors correspond to losing connectivity with the Kubernetes API and to the NGF container restarts (5 in total). Such logs
+are expected in that case.
+
+nginx:
+
+```text
+2024/06/01 21:34:09 [error] 104#104: *115862644 no live upstreams while connecting to upstream, client: 10.128.0.112, server: cafe.example.com, request: "GET /tea HTTP/1.1", upstream: "http://longevity_tea_80/tea", host: "cafe.example.com"
+2024/06/03 12:01:07 [error] 105#105: *267137988 no live upstreams while connecting to upstream, client: 10.128.0.112, server: cafe.example.com, request: "GET /coffee HTTP/1.1", upstream: "http://longevity_coffee_80/coffee", host: "cafe.example.com"
+```
+
+Eight errors like this occurred at different times, while backend pods were being updated. It is not clear why this happens.
+Because the number of errors is small compared with the total number of handled requests, there is no need to
+investigate further unless we see this again at a larger volume.
+
+### Key Metrics
+
+#### Containers memory
+
+![plus-memory.png](plus-memory.png)
+
+Drop in NGF memory usage at the beginning corresponds to the nginx-gateway container restarts.
+
+Drop in NGINX memory usage corresponds to the end of traffic generation.
+
+#### NGF Container Memory
+
+![plus-ngf-memory.png](plus-ngf-memory.png)
+
+### Containers CPU
+
+![plus-cpu.png](plus-cpu.png)
+
+Drop in NGINX CPU usage corresponds to the end of traffic generation.
+
+### NGINX Plus metrics
+
+![plus-status.png](plus-status.png)
+
+Drop in requests corresponds to the end of traffic generation.
+
+### Reloads
+
+Rate of reloads - successful and errors:
+
+![plus-reloads.png](plus-reloads.png)
+
+Note: compared to the NGINX OSS run, there are not as many reloads here, because NGF uses the NGINX Plus API to
+reconfigure NGINX for endpoint changes (see the illustrative sketch at the end of this file).
+
+No reloads finished with an error.
+
+Reload time distribution - counts:
+
+![plus-reload-time.png](plus-reload-time.png)
+
+Reload-related metrics at the end:
+
+![plus-final-reloads.png](plus-final-reloads.png)
+
+All successful reloads took less than 1 second, with most under 0.5 seconds.
+
+## Comparison with previous runs
+
+Graphs look similar to 1.2.0 results.
+As https://github.com/nginxinc/nginx-gateway-fabric/issues/1112 was fixed, we no longer see the corresponding
+reload spikes.
+
+Memory usage is flat, but ~1 MB higher than in 1.2.0.
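+
+### Appendix: endpoint updates via the NGINX Plus API (illustrative)
+
+The note in the Reloads section says that NGF applies endpoint changes through the NGINX Plus API instead of
+reloading NGINX. The sketch below only illustrates what such an API call looks like; it was not run as part of this
+test. The API address, API version, and the server being added are assumptions; the upstream name is taken from the
+nginx error log above.
+
+```go
+// Illustrative sketch only: add one server to an upstream through the
+// NGINX Plus REST API, which is how endpoint changes can be applied
+// without an NGINX reload. The API address (127.0.0.1:8765), API version
+// (9), and the server address are assumptions, not values from this test;
+// the upstream name comes from the nginx error log above.
+package main
+
+import (
+	"bytes"
+	"fmt"
+	"net/http"
+)
+
+func main() {
+	body := bytes.NewBufferString(`{"server": "10.0.0.42:8080"}`)
+	resp, err := http.Post(
+		"http://127.0.0.1:8765/api/9/http/upstreams/longevity_coffee_80/servers",
+		"application/json",
+		body,
+	)
+	if err != nil {
+		panic(err)
+	}
+	defer resp.Body.Close()
+	// Expect a 2xx status when the server is added successfully.
+	fmt.Println("status:", resp.Status)
+}
+```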