Extend troubleshooting doc #2072

Merged
merged 4 commits into from
Jun 4, 2024
74 changes: 74 additions & 0 deletions site/content/how-to/monitoring/troubleshooting.md
@@ -9,6 +9,80 @@ docs: "DOCS-1419"

This topic describes possible issues users might encounter when using NGINX Gateway Fabric. When possible, suggested workarounds are provided.

### General troubleshooting

When investigating a problem or requesting help, collect the following data points to help identify the issue.

##### Resource status

To check the status of a resource, use `kubectl describe`. This example checks the status of the `coffee` HTTPRoute, which has an error:

```shell
kubectl describe httproutes.gateway.networking.k8s.io coffee [-n namespace]
```

```text
...
Status:
  Parents:
    Conditions:
      Last Transition Time:  2024-05-31T17:20:51Z
      Message:               The route is accepted
      Observed Generation:   4
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-31T17:20:51Z
      Message:               spec.rules[0].backendRefs[0].name: Not found: "bad-backend"
      Observed Generation:   4
      Reason:                BackendNotFound
      Status:                False
      Type:                  ResolvedRefs
    Controller Name:         gateway.nginx.org/nginx-gateway-controller
    Parent Ref:
      Group:         gateway.networking.k8s.io
      Kind:          Gateway
      Name:          gateway
      Namespace:     default
      Section Name:  http
```

If a resource has errors relating to its configuration or relationship to other resources, they are usually reported in its status. The `ObservedGeneration` in each status condition should match the `Generation` in the resource's metadata. If it doesn't, the resource may not have been processed yet, or the status update may have failed.
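
A resource's `metadata.generation` can be compared with the `observedGeneration` values in its status without reading the full `kubectl describe` output. The following is a minimal sketch using `kubectl`'s JSONPath output; the function name `check_generation` is hypothetical, and the status layout (`status.parents[*].conditions[*]`) is assumed from the HTTPRoute example above.

```shell
# Hypothetical helper: compares an HTTPRoute's metadata.generation with the
# observedGeneration recorded in each of its status conditions.
# Usage: check_generation <route-name> [namespace]
check_generation() {
  name="$1"
  ns="${2:-default}"

  # metadata.generation is incremented whenever the spec changes.
  gen="$(kubectl get httproutes.gateway.networking.k8s.io "$name" -n "$ns" \
    -o jsonpath='{.metadata.generation}')"

  # observedGeneration of every condition under every parent, space-separated.
  observed="$(kubectl get httproutes.gateway.networking.k8s.io "$name" -n "$ns" \
    -o jsonpath='{.status.parents[*].conditions[*].observedGeneration}')"

  echo "generation: $gen"
  echo "observed:   $observed"

  if [ -z "$observed" ]; then
    echo "no status conditions reported yet"
    return 1
  fi

  for o in $observed; do
    if [ "$o" != "$gen" ]; then
      echo "status is stale (observed $o, expected $gen)"
      return 1
    fi
  done
  echo "status is up to date"
}
```

For example, `check_generation coffee default` would report whether the `coffee` route's status reflects its latest spec.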

##### Events

Events created by NGINX Gateway Fabric or other Kubernetes components could indicate system or configuration issues. To see events:

```shell
kubectl get events [-n namespace]
```

For example, the following warning event is emitted when the NginxGateway configuration resource is deleted:

```text
kubectl -n nginx-gateway get event
LAST SEEN   TYPE      REASON            OBJECT                    MESSAGE
5s          Warning   ResourceDeleted   nginxgateway/ngf-config   NginxGateway configuration was deleted; using defaults
```
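
In a busy namespace, it may help to narrow the event list to warnings only. The following is a sketch that uses the standard `--field-selector` and `--sort-by` flags of `kubectl get`; the function name `warning_events` is hypothetical, and the default namespace `nginx-gateway` is assumed from the example above.

```shell
# Hypothetical helper: lists only Warning events in a namespace,
# sorted by creation time (oldest first).
# Usage: warning_events [namespace]
warning_events() {
  ns="${1:-nginx-gateway}"
  kubectl get events -n "$ns" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}
```

Sorting by creation time is one reasonable choice; for events that recur, the creation time reflects the first occurrence rather than the most recent one.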

##### Logs

Logs from the NGINX Gateway Fabric control plane and data plane can contain information that isn't available in status or events, such as errors in processing configuration or passing traffic.

To see logs for the control plane container:

```shell
kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx-gateway
```

To see logs for the data plane container:

```shell
kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx
```

You can see logs for a crashed or killed container by adding the `-p` flag to the above commands.
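
When gathering information for a support request, it can be convenient to collect logs from both containers in one step. The sketch below wraps the two commands above; the function name `ngf_logs` is hypothetical, and any extra arguments (such as `-p`) are passed through to `kubectl logs` unchanged.

```shell
# Hypothetical helper: prints logs for both NGINX Gateway Fabric containers.
# Usage: ngf_logs <ngf-pod-name> [extra kubectl logs flags, e.g. -p]
ngf_logs() {
  pod="$1"
  shift
  for container in nginx-gateway nginx; do
    echo "=== $container ==="
    # "$@" forwards optional flags such as -p (previous container instance).
    kubectl -n nginx-gateway logs "$pod" -c "$container" "$@"
  done
}
```

For example, `ngf_logs <ngf-pod-name> -p` requests logs from the previous instance of each container. The pod name can be found with `kubectl -n nginx-gateway get pods`.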

### NGINX fails to reload

#### Description