| 1 | +# Results for v1.1.0 |
| 2 | + |
| 3 | +<!-- TOC --> |
| 4 | + |
| 5 | +- [Results for v1.1.0](#results-for-v110) |
| 6 | + - [Versions](#versions) |
| 7 | + - [Traffic](#traffic) |
| 8 | + - [NGF](#ngf) |
| 9 | + - [Error Log](#error-log) |
| 10 | + - [NGINX](#nginx) |
| 11 | + - [Error Log](#error-log-1) |
| 12 | + - [Access Log](#access-log) |
| 13 | + - [Key Metrics](#key-metrics) |
| 14 | + - [Containers memory](#containers-memory) |
| 15 | + - [Containers CPU](#containers-cpu) |
| 16 | + - [NGINX metrics](#nginx-metrics) |
| 17 | + - [Reloads](#reloads) |
| 18 | + - [Existing Issues still relevant](#existing-issues-still-relevant) |
| 19 | + |
| 20 | +<!-- TOC --> |
| 21 | + |
| 22 | +## Versions |
| 23 | + |
| 24 | +NGF version: |
| 25 | + |
| 26 | +```text |
| 27 | +commit: "21a2507d3d25ac0428384dce2c042799ed28b856" |
| 28 | +date: "2023-12-06T23:47:17Z" |
| 29 | +version: "edge" |
| 30 | +``` |
| 31 | + |
| 32 | +with NGINX: |
| 33 | + |
| 34 | +```text |
| 35 | +nginx/1.25.3 |
| 36 | +built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r10) |
| 37 | +OS: Linux 5.15.109+ |
| 38 | +``` |
| 39 | + |
| 40 | +Kubernetes: |
| 41 | + |
| 42 | +```text |
| 43 | +Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.5-gke.200", GitCommit:"f9aad8e51abb509136cb82b4a00cc3d77d3d70d9", GitTreeState:"clean", BuildDate:"2023-08-26T23:26:22Z", GoVersion:"go1.20.7 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"} |
| 44 | +``` |
| 45 | + |
| 46 | +## Traffic |
| 47 | + |
| 48 | +HTTP: |
| 49 | + |
| 50 | +```text |
| 51 | +wrk -t2 -c100 -d96h http://cafe.example.com/coffee |
| 52 | + |
| 53 | +Running 5760m test @ http://cafe.example.com/coffee |
| 54 | + 2 threads and 100 connections |
| 55 | + Thread Stats Avg Stdev Max +/- Stdev |
| 56 | + Latency 182.30ms 146.48ms 2.00s 82.86% |
| 57 | + Req/Sec 306.26 204.19 2.18k 65.75% |
| 58 | + 207104807 requests in 5760.00m, 72.17GB read |
| 59 | + Socket errors: connect 0, read 362418, write 218736, timeout 19693 |
| 60 | +Requests/sec: 599.26 |
| 61 | +Transfer/sec: 218.97KB |
| 62 | +``` |
| 63 | + |
| 64 | +HTTPS: |
| 65 | + |
| 66 | +```text |
| 67 | +wrk -t2 -c100 -d96h https://cafe.example.com/tea |
| 68 | + |
| 69 | +Running 5760m test @ https://cafe.example.com/tea |
| 70 | + 2 threads and 100 connections |
| 71 | + Thread Stats Avg Stdev Max +/- Stdev |
| 72 | + Latency 172.15ms 118.59ms 2.00s 68.16% |
| 73 | + Req/Sec 305.26 203.43 2.33k 65.34% |
| 74 | + 206387831 requests in 5760.00m, 70.81GB read |
| 75 | + Socket errors: connect 44059, read 356656, write 0, timeout 126 |
| 76 | +Requests/sec: 597.19 |
| 77 | +Transfer/sec: 214.84KB |
| 78 | +``` |
| 79 | + |
| 80 | +While there are socket errors in the output, there are no connection-related errors in NGINX logs. |
| 81 | +Further investigation is out of scope of this test. |
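To put the socket errors in proportion, the figures from the two `wrk` summaries above can be compared against the number of completed requests. The following is a minimal sketch, not part of the test tooling: the dictionary layout and the `summarize` helper are hypothetical names introduced for illustration, and the ratio is only a rough scale, since a socket error does not necessarily map to exactly one failed request.

```python
# Figures copied from the wrk output above; names here are illustrative only.
DURATION_S = 96 * 3600  # -d96h, reported by wrk as 5760 minutes

runs = {
    "http": {"requests": 207104807, "connect": 0, "read": 362418,
             "write": 218736, "timeout": 19693},
    "https": {"requests": 206387831, "connect": 44059, "read": 356656,
              "write": 0, "timeout": 126},
}


def summarize(run):
    """Return (total socket errors, errors per completed request, requests/sec)."""
    errors = run["connect"] + run["read"] + run["write"] + run["timeout"]
    return errors, errors / run["requests"], run["requests"] / DURATION_S


for name, run in runs.items():
    errors, ratio, rps = summarize(run)
    print(f"{name}: {errors} socket errors, {ratio:.3%} of requests, {rps:.2f} req/s")
```

Under these assumptions, socket errors amount to roughly 0.3% of completed requests for HTTP and roughly 0.2% for HTTPS, and the recomputed request rates agree with the `Requests/sec` values reported by `wrk`.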
| 82 | + |
| 83 | +### NGF |
| 84 | + |
| 85 | +#### Error Log |
| 86 | + |
| 87 | +```text |
| 88 | +resource.type="k8s_container" |
| 89 | +resource.labels.cluster_name="ciara-1" |
| 90 | +resource.labels.namespace_name="nginx-gateway" |
| 91 | +resource.labels.container_name="nginx-gateway" |
| 92 | +severity=ERROR |
| 93 | +SEARCH("error") |
| 94 | +``` |
| 95 | + |
| 96 | +There were 36 error logs across 2 pod instances. They came in 2 almost identical batches: the first approximately |
| 97 | +6 hours into the first day of the test, and the second 2.5 days later. Both batches related to leader election loss |
| 98 | +and the subsequent restart (see https://github.com/nginxinc/nginx-gateway-fabric/issues/1100). |
| 99 | + |
| 100 | +Both error batches caused the pod to restart, but not to terminate. However, the first pod was terminated about |
| 101 | +10 minutes after the first error batch and its subsequent restart. Exactly why this pod was terminated is not currently |
| 102 | +clear, but it looks like a cluster event (perhaps an upgrade), as the coffee and tea pods were also terminated at that time. |
| 103 | +Strangely, the second pod reported 6 restarts in total, but no errors other than those described above were observed |
| 104 | +over the test period, and grepping the logs for start-up entries found only the 2 known restarts relating to |
| 105 | +the leader election loss, plus the initial start-up of both pods (4 in total). |
| 106 | + |
| 107 | +```console |
| 108 | +kubectl get pods -n nginx-gateway |
| 109 | +NAME READY STATUS RESTARTS AGE |
| 110 | +my-release-nginx-gateway-fabric-78d4b84447-4hss5 2/2 Running 6 (31h ago) 3d22h |
| 111 | +``` |
| 112 | + |
| 113 | +### NGINX |
| 114 | + |
| 115 | +#### Error Log |
| 116 | + |
| 117 | +Errors: |
| 118 | + |
| 119 | +```text |
| 120 | +resource.type=k8s_container AND |
| 121 | +resource.labels.pod_name="my-release-nginx-gateway-fabric-78d4b84447-4hss5" AND |
| 122 | +resource.labels.container_name="nginx" AND |
| 123 | +severity=ERROR AND |
| 124 | +SEARCH("`[warn]`") OR SEARCH("`[error]`") |
| 125 | +``` |
| 126 | + |
| 127 | +No entries found. |
| 128 | + |
| 129 | +#### Access Log |
| 130 | + |
| 131 | +Non-200 response codes in NGINX access logs: |
| 132 | + |
| 133 | +```text |
| 134 | +resource.type=k8s_container AND |
| 135 | +resource.labels.pod_name="my-release-nginx-gateway-fabric-78d4b84447-4hss5" AND |
| 136 | +resource.labels.container_name="nginx" |
| 137 | +"GET" "HTTP/1.1" -"200" |
| 138 | +``` |
| 139 | + |
| 140 | +No such responses. |
| 141 | + |
| 142 | +## Key Metrics |
| 143 | + |
| 144 | +### Containers memory |
| 145 | + |
| 146 | + |
| 147 | + |
| 148 | +Memory usage dropped twice, which appears to correspond with the restarts seen above relating to leader election. |
| 149 | +Interestingly, before the first restart and after the second restart, memory usage sat at about 8.5MiB, but for the |
| 150 | +majority of the test run it was about 9.5-10MiB. The previous release test run (v1.0.0) also had memory usage at |
| 151 | +about 9-10MiB, with more stable usage across the duration of the test, but no restarts were observed in that |
| 152 | +run. I don't think there is anything to investigate here. |
| 153 | + |
| 154 | +### Containers CPU |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | +No unexpected spikes or drops. |
| 159 | + |
| 160 | +### NGINX metrics |
| 161 | + |
| 162 | +In this test, NGINX metrics were not correctly exported, so no dashboards are available for them. |
| 163 | + |
| 164 | +### Reloads |
| 165 | + |
| 166 | +In this test, NGINX metrics were not correctly exported, so no dashboards are available for them. |
| 167 | + |
| 168 | +Reload-related metrics at the end of the test: |
| 169 | + |
| 170 | +```text |
| 171 | +# TYPE nginx_gateway_fabric_nginx_reloads_milliseconds histogram |
| 172 | +nginx_gateway_fabric_nginx_reloads_milliseconds_bucket{class="nginx",le="500"} 1647 |
| 173 | +nginx_gateway_fabric_nginx_reloads_milliseconds_bucket{class="nginx",le="1000"} 4043 |
| 174 | +nginx_gateway_fabric_nginx_reloads_milliseconds_bucket{class="nginx",le="5000"} 4409 |
| 175 | +nginx_gateway_fabric_nginx_reloads_milliseconds_bucket{class="nginx",le="10000"} 4409 |
| 176 | +nginx_gateway_fabric_nginx_reloads_milliseconds_bucket{class="nginx",le="30000"} 4409 |
| 177 | +nginx_gateway_fabric_nginx_reloads_milliseconds_bucket{class="nginx",le="+Inf"} 4409 |
| 178 | +nginx_gateway_fabric_nginx_reloads_milliseconds_sum{class="nginx"} 2.701667e+06 |
| 179 | +nginx_gateway_fabric_nginx_reloads_milliseconds_count{class="nginx"} 4409 |
| 180 | +# HELP nginx_gateway_fabric_nginx_reloads_total Number of successful NGINX reloads |
| 181 | +# TYPE nginx_gateway_fabric_nginx_reloads_total counter |
| 182 | +nginx_gateway_fabric_nginx_reloads_total{class="nginx"} 4409 |
| 183 | +``` |
| 184 | + |
| 185 | +All successful reloads took less than 5 seconds, with most (>90%) under 1 second. |
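The percentages above follow directly from the cumulative histogram buckets. As a minimal sketch (the variable and function names are hypothetical; only the numbers come from the metrics output above), the share of reloads at or below each bucket boundary and the mean reload time can be derived like this:

```python
# Cumulative bucket counts copied from the metrics above; "le" values are milliseconds.
buckets = {500: 1647, 1000: 4043, 5000: 4409}
total_reloads = 4409      # ..._reloads_milliseconds_count
sum_ms = 2.701667e06      # ..._reloads_milliseconds_sum


def fraction_at_or_below(le_ms):
    """Share of reloads that completed within le_ms milliseconds."""
    return buckets[le_ms] / total_reloads


mean_ms = sum_ms / total_reloads
slower_than_1s = total_reloads - buckets[1000]

for le_ms in sorted(buckets):
    print(f"<= {le_ms} ms: {fraction_at_or_below(le_ms):.1%}")
print(f"mean reload time: {mean_ms:.0f} ms, reloads over 1 s: {slower_than_1s}")
```

Read this way, about 92% of reloads finished within 1 second, 366 reloads took between 1 and 5 seconds, and the mean reload time was a little over 600 ms.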
| 186 | + |
| 187 | +## Existing Issues still relevant |
| 188 | + |
| 189 | +- NGF unnecessarily reloads NGINX when it reconciles |
| 190 | + Secrets - https://github.com/nginxinc/nginx-gateway-fabric/issues/1112 |
| 191 | +- Use NGF Logger in Client-Go Library - https://github.com/nginxinc/nginx-gateway-fabric/issues/1101 |