| 1 | +# Results for v1.0.0 |
| 2 | + |
| 3 | +<!-- TOC --> |
| 4 | +- [Results for v1.0.0](#results-for-v100) |
| 5 | + - [Versions](#versions) |
| 6 | + - [Tests](#tests) |
| 7 | + - [Restart nginx-gateway container](#restart-nginx-gateway-container) |
| 8 | + - [Restart NGINX container](#restart-nginx-container) |
| 9 | + - [Restart Node with draining](#restart-node-with-draining) |
| 10 | + - [Restart Node without draining](#restart-node-without-draining) |
| 11 | + - [Future Improvements](#future-improvements) |
| 12 | +<!-- TOC --> |
| 13 | + |
| 14 | + |
| 15 | +## Versions |
| 16 | + |
| 17 | +NGF version: |
| 18 | + |
| 19 | +```text |
| 20 | +commit: 72b6c6ef8915c697626eeab88fdb6a3ce15b8da0 |
| 21 | +date: 2023-10-02T13:13:08Z |
| 22 | +version: edge |
| 23 | +``` |
| 24 | + |
| 25 | +with NGINX: |
| 26 | + |
| 27 | +```text |
| 28 | +nginx/1.25.2 |
| 29 | +built by gcc 12.2.1 20220924 (Alpine 12.2.1_git20220924-r10) |
| 30 | +OS: Linux 5.15.49-linuxkit-pr |
| 31 | +``` |
| 32 | + |
| 33 | + |
| 34 | +Kubernetes: |
| 35 | + |
| 36 | +```text |
| 37 | +Server Version: version.Info{Major:"1", Minor:"28", |
| 38 | +GitVersion:"v1.28.0", |
| 39 | +GitCommit:"855e7c48de7388eb330da0f8d9d2394ee818fb8d", |
| 40 | +GitTreeState:"clean", BuildDate:"2023-08-15T21:26:40Z", |
| 41 | +GoVersion:"go1.20.7", Compiler:"gc", |
| 42 | +Platform:"linux/arm64"} |
| 43 | +``` |
| 44 | + |
| 45 | +## Tests |
| 46 | + |
| 47 | +### Restart nginx-gateway container |
| 48 | +Passes test with no errors. |
| 49 | + |
| 50 | +### Restart NGINX container |
| 51 | +The NGF Pod was unable to recover after a SIGKILL signal was sent to the NGINX master process. |
| 52 | +The following appeared in the NGINX logs: |
| 53 | + |
| 54 | +```text |
| 55 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-config-version.sock failed (98: Address in use) |
| 56 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address in use) |
| 57 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-500-server.sock failed (98: Address in use) |
| 58 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 59 | +2023/10/10 22:46:54 [notice] 141#141: try again to bind() after 500ms |
| 60 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-config-version.sock failed (98: Address in use) |
| 61 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address in use) |
| 62 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-500-server.sock failed (98: Address in use) |
| 63 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 64 | +2023/10/10 22:46:54 [notice] 141#141: try again to bind() after 500ms |
| 65 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-config-version.sock failed (98: Address in use) |
| 66 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address in use) |
| 67 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-500-server.sock failed (98: Address in use) |
| 68 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 69 | +2023/10/10 22:46:54 [notice] 141#141: try again to bind() after 500ms |
| 70 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-config-version.sock failed (98: Address in use) |
| 71 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address in use) |
| 72 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-500-server.sock failed (98: Address in use) |
| 73 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 74 | +2023/10/10 22:46:54 [notice] 141#141: try again to bind() after 500ms |
| 75 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-config-version.sock failed (98: Address in use) |
| 76 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address in use) |
| 77 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/lib/nginx/nginx-500-server.sock failed (98: Address in use) |
| 78 | +2023/10/10 22:46:54 [emerg] 141#141: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 79 | +2023/10/10 22:46:54 [notice] 141#141: try again to bind() after 500ms |
| 80 | +2023/10/10 22:46:54 [emerg] 141#141: still could not bind() |
| 81 | +``` |
| 82 | + |
| 83 | +Issue Filed: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108 |
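The report does not state a root cause for the `bind()` failures. As background only, the sketch below shows general Unix domain socket behavior that produces the same `Address in use` error: a socket file left on disk by a process that never unlinked it blocks the next `bind()` to that path. The helper names and the `demo.sock` path are hypothetical and are not taken from NGF or NGINX.

```python
import errno
import os
import socket
import tempfile
from typing import Optional


def bind_unix(path: str) -> socket.socket:
    """Bind and listen on a Unix domain socket at `path` (hypothetical helper)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def rebind_errno_with_leftover_file(path: str) -> Optional[int]:
    """Return the errno raised when binding over a leftover socket file."""
    first = bind_unix(path)
    # close() releases the descriptor but does not unlink the socket file,
    # which mimics a process that was killed before it could clean up.
    first.close()
    try:
        bind_unix(path).close()
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        code = rebind_errno_with_leftover_file(os.path.join(tmp, "demo.sock"))
        print(errno.errorcode.get(code, code))
```

In this sketch, unlinking the leftover file before the second `bind()` lets it succeed. Whether that matches the fix for the linked issue is not something this report confirms.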
| 84 | + |
| 85 | + |
| 86 | +### Restart Node with draining |
| 87 | +Passes test with no errors. |
| 88 | + |
| 89 | +### Restart Node without draining |
| 90 | +In most runs, the NGF Pod was unable to recover after `docker restart kind-control-plane` was run. |
| 91 | + |
| 92 | +The following appeared in the NGINX logs: |
| 93 | + |
| 94 | +```text |
| 95 | +2023/10/10 22:57:05 [emerg] 140#140: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 96 | +2023/10/10 22:57:05 [notice] 140#140: try again to bind() after 500ms |
| 97 | +2023/10/10 22:57:05 [emerg] 140#140: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 98 | +2023/10/10 22:57:05 [notice] 140#140: try again to bind() after 500ms |
| 99 | +2023/10/10 22:57:05 [emerg] 140#140: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 100 | +2023/10/10 22:57:05 [notice] 140#140: try again to bind() after 500ms |
| 101 | +2023/10/10 22:57:05 [emerg] 140#140: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 102 | +2023/10/10 22:57:05 [notice] 140#140: try again to bind() after 500ms |
| 103 | +2023/10/10 22:57:05 [emerg] 140#140: bind() to unix:/var/run/nginx/nginx-status.sock failed (98: Address in use) |
| 104 | +2023/10/10 22:57:05 [notice] 140#140: try again to bind() after 500ms |
| 105 | +2023/10/10 22:57:05 [emerg] 140#140: still could not bind() |
| 106 | +``` |
| 107 | + |
| 108 | +The following appeared in the NGF logs: |
| 109 | + |
| 110 | +```text |
| 111 | +{"level":"info","ts":"2023-10-10T22:57:05Z","msg":"Starting NGINX Gateway Fabric in static mode","version":"edge","commit":"b3fbf98d906f60ce66d70d7a2373c4b12b7d5606","date":"2023-10-10T22:02:06Z"} |
| 112 | +Error: failed to start control loop: cannot create and register metrics collectors: cannot create NGINX status metrics collector: failed to get http://config-status/stub_status: Get "http://config-status/stub_status": dial unix /var/run/nginx/nginx-status.sock: connect: connection refused |
| 113 | +Usage: |
| 114 | + gateway static-mode [flags] |
| 115 | + |
| 116 | +Flags: |
| 117 | + -c, --config string The name of the NginxGateway resource to be used for this controller's dynamic configuration. Lives in the same Namespace as the controller. (default "") |
| 118 | + --gateway string The namespaced name of the Gateway resource to use. Must be of the form: NAMESPACE/NAME. If not specified, the control plane will process all Gateways for the configured GatewayClass. However, among them, it will choose the oldest resource by creation timestamp. If the timestamps are equal, it will choose the resource that appears first in alphabetical order by {namespace}/{name}. |
| 119 | + --health-disable Disable running the health probe server. |
| 120 | + --health-port int Set the port where the health probe server is exposed. Format: [1024 - 65535] (default 8081) |
| 121 | + -h, --help help for static-mode |
| 122 | + --leader-election-disable Disable leader election. Leader election is used to avoid multiple replicas of the NGINX Gateway Fabric reporting the status of the Gateway API resources. If disabled, all replicas of NGINX Gateway Fabric will update the statuses of the Gateway API resources. |
| 123 | + --leader-election-lock-name string The name of the leader election lock. A Lease object with this name will be created in the same Namespace as the controller. (default "nginx-gateway-leader-election-lock") |
| 124 | + --metrics-disable Disable exposing metrics in the Prometheus format. |
| 125 | + --metrics-port int Set the port where the metrics are exposed. Format: [1024 - 65535] (default 9113) |
| 126 | + --metrics-secure-serving Enable serving metrics via https. By default metrics are served via http. Please note that this endpoint will be secured with a self-signed certificate. |
| 127 | + --update-gatewayclass-status Update the status of the GatewayClass resource. (default true) |
| 128 | + |
| 129 | +Global Flags: |
| 130 | + --gateway-ctlr-name string The name of the Gateway controller. The controller name must be of the form: DOMAIN/PATH. The controller's domain is 'gateway.nginx.org' (default "") |
| 131 | + --gatewayclass string The name of the GatewayClass resource. Every NGINX Gateway Fabric must have a unique corresponding GatewayClass resource. (default "") |
| 132 | + |
| 133 | +failed to start control loop: cannot create and register metrics collectors: cannot create NGINX status metrics collector: failed to get http://config-status/stub_status: Get "http://config-status/stub_status": dial unix /var/run/nginx/nginx-status.sock: connect: connection refused |
| 134 | +``` |
| 135 | + |
| 136 | +Note that the test occasionally passes, with the NGF Pod recovering gracefully. |
| 137 | + |
| 138 | +Related to this issue: https://github.com/nginxinc/nginx-gateway-fabric/issues/1108 |
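The NGF error above is `connection refused`, not "no such file". As a hedged illustration of general Unix domain socket behavior on Linux (not a confirmed diagnosis of this failure), the sketch below shows that a client gets `ECONNREFUSED` when the socket file exists but nothing is listening on it, and `ENOENT` when the file is missing. The function names and the `status.sock` path are hypothetical.

```python
import errno
import os
import socket
import tempfile
from typing import Optional


def connect_errno(path: str) -> Optional[int]:
    """Try to connect to a Unix socket path; return the errno on failure."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
    except OSError as exc:
        return exc.errno
    finally:
        client.close()
    return None


def leave_socket_file_without_listener(path: str) -> None:
    """Create a socket file, then close the listener without unlinking it."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    server.close()  # the file stays on disk, but no process is listening


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sock_path = os.path.join(tmp, "status.sock")
        leave_socket_file_without_listener(sock_path)
        code = connect_errno(sock_path)
        print(errno.errorcode.get(code, code))
```

The distinction can help when reading logs like these: a refused connection suggests the socket path was present when the client dialed it, while a missing path would surface as a different error.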
| 139 | + |
| 140 | +## Future Improvements |
| 141 | + |
| 142 | +- None |