Skip to content

Commit 8d5f023

Browse files
authored
Remove webhook validation code (#1590)
* Remove webhook validation code
1 parent 6becde2 commit 8d5f023

File tree

12 files changed

+281
-537
lines changed

12 files changed

+281
-537
lines changed

design/resource-validation.md

Lines changed: 1 addition & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -91,36 +91,9 @@ Design a validation mechanism for Gateway API resources.
9191

9292
## Design
9393

94-
We will introduce two validation methods to be run by NGF control plane:
95-
96-
1. Re-run of the Gateway API webhook validation
97-
2. NGF-specific field validation
98-
99-
### Re-run of Webhook Validation
100-
101-
Before processing a resource, NGF will validate it using the functions from
102-
the [validation package](https://github.com/kubernetes-sigs/gateway-api/tree/fa4b0a519b30a33b205ac0256876afc1456f2dd3/apis/v1/validation)
103-
from the Gateway API. This will ensure that the webhook validation cannot be bypassed (it can be bypassed if the webhook
104-
is not installed, misconfigured, or running a different version), and it will allow us to avoid repeating the same
105-
validation in our code.
106-
107-
If a resource is invalid:
108-
109-
- NGF will not process it -- it will treat it as if the resource didn't exist. This also means that if the resource was
110-
updated from a valid to an invalid state, NGF will also ignore any previous valid state. For example, it will remove
111-
the generation configuration for an HTTPRoute resource.
112-
- NGF will report the validation error as a
113-
Warning [Event](https://kubernetes.io/docs/reference/kubernetes-api/cluster-resources/event-v1/)
114-
for that resource. The Event message will describe the error and explain that the resource was ignored. We chose to
115-
report an Event instead of updating the status, because to update the status, NGF first needs to look inside the
116-
resource to determine whether it belongs to it or not. However, since the webhook validation applies to all parts of
117-
the spec of resource, it means NGF has to look inside the invalid resource and parse potentially invalid parts. To
118-
avoid that, NGF will report an Event. The owner of the resource will be able to see the Event.
119-
- NGF will also report the validation error in the NGF logs.
120-
12194
### NGF-specific validation
12295

123-
After re-running the webhook validation, NGF will run NGF-specific validation written in go.
96+
NGF runs NGF-specific validation written in go.
12497

12598
NGF-specific validation will:
12699

@@ -132,7 +105,6 @@ NGF-specific validation will not include:
132105

133106
- *All* validation done by CRDs. NGF will only repeat the validation that addresses (1) and (2) in the list above with
134107
extra rules required by NGINX but missing in the CRDs. For example, NGF will not ensure the limits of field values.
135-
- The validation done by the webhook (because it is done in the previous step).
136108

137109
If a resource is invalid, NGF will report the error in its status.
138110

@@ -146,7 +118,6 @@ following methods in order of their appearance in the table.
146118
| Name | Type | Component | Scope | Feedback loop for errors | Can be bypassed? |
147119
|------------------------------|----------------------------|-----------------------|-------------------------|----------------------------------------------------------------------------------|--------------------------------|
148120
| CRD validation | OpenAPI and CEL validation | Kubernetes API server | Structure, field values | Kubernetes API server returns any errors a response for an API call. | Yes, if the CRDs are modified. |
149-
| Re-run of webhook validation | Go code | NGF control plane | Field values | Errors are reported as Event for the resource. | No |
150121
| NGF-specific validation | Go code | NGF control plane | Field values | Errors are reported in the status of a resource after its creation/modification. | No |
151122

152123

@@ -156,7 +127,6 @@ following methods in order of their appearance in the table.
156127
|------------------------------|---------|-----------------------|-------------------------|----------------------------------------------------------------------------------|--------------------------------------------------------------------------------------|
157128
| CRD validation | OpenAPI | Kubernetes API server | Structure, field values | Kubernetes API server returns any errors a response for an API call. | Yes, if the CRDs are modified. |
158129
| Webhook validation | Go code | Gateway API webhook | Field values | Kubernetes API server returns any errors a response for an API call. | Yes, if the webhook is not installed, misconfigured, or running a different version. |
159-
| Re-run of webhook validation | Go code | NGF control plane | Field values | Errors are reported as Event for the resource. | No |
160130
| NGF-specific validation | Go code | NGF control plane | Field values | Errors are reported in the status of a resource after its creation/modification. | No |
161131

162132

@@ -182,14 +152,6 @@ We will not introduce any NGF webhook in the cluster (it adds operational comple
182152
source of potential downtime -- a webhook failure disables CRUD operations on the relevant resources) unless we find
183153
good reasons for that.
184154

185-
### Upgrades
186-
187-
Since NGF will use the validation package from the Gateway API project, when a new release happens, we will need to
188-
upgrade the dependency and release a new version of NGF, provided that the validation code changed. However, if it did
189-
not change, we do not need to release a new version. Note: other things from a new Gateway API release might prompt us
190-
to release a new version like supporting a new field. See also
191-
[GEP-922](https://gateway-api.sigs.k8s.io/geps/gep-922/#).
192-
193155
### Reliability
194156

195157
NGF processes two kinds of transactions:

internal/mode/static/state/change_processor.go

Lines changed: 1 addition & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,6 @@ import (
1616
"sigs.k8s.io/controller-runtime/pkg/client"
1717
"sigs.k8s.io/controller-runtime/pkg/client/apiutil"
1818
v1 "sigs.k8s.io/gateway-api/apis/v1"
19-
gwapivalidation "sigs.k8s.io/gateway-api/apis/v1/validation"
2019
"sigs.k8s.io/gateway-api/apis/v1alpha2"
2120
"sigs.k8s.io/gateway-api/apis/v1beta1"
2221

@@ -25,13 +24,6 @@ import (
2524
"github.com/nginxinc/nginx-gateway-fabric/internal/mode/static/state/validation"
2625
)
2726

28-
const (
29-
validationErrorLogMsg = "the resource failed validation: Gateway API CEL validation (Kubernetes 1.25+) " +
30-
"by the Kubernetes API server and/or the Gateway API webhook validation (if installed) failed to reject " +
31-
"the resource with the error; make sure Gateway API CRDs include CEL validation and/or (if installed) the " +
32-
"webhook is running correctly."
33-
)
34-
3527
// ChangeType is the type of change that occurred based on a k8s object event.
3628
type ChangeType int
3729

@@ -193,35 +185,8 @@ func NewChangeProcessorImpl(cfg ChangeProcessorConfig) *ChangeProcessorImpl {
193185
},
194186
)
195187

196-
updater := newValidatingUpsertUpdater(
197-
trackingUpdater,
198-
cfg.EventRecorder,
199-
func(obj client.Object) error {
200-
// Add the validation for Gateway API resources which the webhook validates
201-
202-
var err error
203-
switch o := obj.(type) {
204-
// We don't validate GatewayClass or ReferenceGrant, because as of the latest version,
205-
// the webhook doesn't validate them.
206-
// It only validates a GatewayClass update that requires the previous version of the resource,
207-
// which NGF cannot reliably provide - for example, after NGF restarts).
208-
// https://github.com/kubernetes-sigs/gateway-api/blob/v1.0.0/apis/v1/validation/gatewayclass.go#L28
209-
case *v1.Gateway:
210-
err = gwapivalidation.ValidateGateway(o).ToAggregate()
211-
case *v1.HTTPRoute:
212-
err = gwapivalidation.ValidateHTTPRoute(o).ToAggregate()
213-
}
214-
215-
if err != nil {
216-
return fmt.Errorf(validationErrorLogMsg+": %w", err)
217-
}
218-
219-
return nil
220-
},
221-
)
222-
223188
processor.getAndResetClusterStateChanged = trackingUpdater.getAndResetChangedStatus
224-
processor.updater = updater
189+
processor.updater = trackingUpdater
225190

226191
return processor
227192
}

0 commit comments

Comments
 (0)