-
Notifications
You must be signed in to change notification settings - Fork 542
GEP-1742: HTTPRoute Timeouts API #1997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 26 commits
accce11
0525880
7696daa
09ca210
18b6894
2799811
90606a8
f07a259
ad0e9c7
db6688b
224d342
7a01f3e
29cc9f0
26e451a
7b6315b
bef97ed
7e20c3d
650874d
659d478
3c0e8e6
6c01fa8
4e7c538
2a1c178
f9181ec
12c731a
ab8283d
bad35be
f0ac5ac
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,7 +1,7 @@ | ||||||
# GEP-1742: Timeouts | ||||||
# GEP-1742: HTTPRoute Timeouts | ||||||
|
||||||
* Issue: [#1742](https://github.com/kubernetes-sigs/gateway-api/issues/1742) | ||||||
* Status: Provisional | ||||||
* Status: Implementable | ||||||
frankbu marked this conversation as resolved.
Show resolved
Hide resolved
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How did we end up back at There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @shaneutt, it's been a couple of weeks now, and we've changed it to be Extended support, so I'm suggesting that we can make it Implementable now, so that we can also merge the related interfaces PR (#2013). See #1997 (comment) Let me know if you think it's still too soon, and I'll switch it back to Provisional. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok I think I just missed the comment. I can be convinced that we should move forward with /cc @robscott @youngnick There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think going direct to implementable makes sense here, but you'll also need to make a corresponding change to update the status in the navigation as well: Line 82 in 8afce4e
|
||||||
|
||||||
(See status definitions [here](overview.md#status).) | ||||||
|
||||||
|
@@ -12,12 +12,12 @@ timeouts for different types of connection. | |||||
|
||||||
## Goals | ||||||
|
||||||
- Create some method to configure some timeouts | ||||||
- Create some method to configure some timeouts. | ||||||
- Timeout config must be applicable to most if not all Gateway API implementations. | ||||||
|
||||||
## Non-Goals | ||||||
|
||||||
- TBD | ||||||
- A standard API for every possible timeout that implementations may support. | ||||||
|
||||||
## Introduction | ||||||
|
||||||
|
@@ -298,16 +298,182 @@ Could not find any HTTP specific timeouts. PRs welcomed. 😊 | |||||
|
||||||
Could not find any HTTP specific timeouts. PRs welcomed. 😊 | ||||||
|
||||||
|
||||||
## API | ||||||
|
||||||
TBD. | ||||||
The above diagrams show that there are many different kinds of configurable timeouts | ||||||
supported by Gateway implementations: connect, idle, request, upstream, downstream. | ||||||
Although there may be opportunity for the specification of a common API for more of | ||||||
them in the future, this GEP will focus on the L7 timeouts in HTTPRoutes that are | ||||||
most valuable to clients. | ||||||
Comment on lines
+306
to
+307
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think one of the most important questions here is where the timeout config belongs. This proposes HTTPRouteRule, but I think that both HTTPBackendRef and/or a new policy to extend Service would be compelling alternatives. It would be helpful if this GEP described why HTTPRouteRule is proposed and why the others alternatives were not chosen. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added some text in the Alternatives section. |
||||||
|
||||||
From the above analysis, it appears that most implementations are capable of | ||||||
supporting the configuration of simple client downstream request timeouts on HTTPRoute | ||||||
rules. This is a relatively small addition that would benefit many users. | ||||||
|
||||||
Some implementations support configuring a timeout for individual backend requests, | ||||||
separate from the overall client request timeout. This is particularly useful if a | ||||||
client HTTP request to a gateway can result in more than one call from the gateway | ||||||
to the destination backend service, for example, if automatic retries are supported. | ||||||
Adding support for this would also benefit many users. | ||||||
|
||||||
### Timeout values | ||||||
|
||||||
There are 2 kinds of timeouts that can be configured in an `HTTPRouteRule`: | ||||||
|
||||||
1. `timeouts.request` is the timeout for the Gateway API implementation to send a | ||||||
response to a client HTTP request. Whether the gateway starts the timeout before | ||||||
or after the entire client request stream has been received, is implementation dependent. | ||||||
This field is optional `Extended` support. | ||||||
|
||||||
1. `timeouts.backendRequest` is a timeout for a single request from the gateway to a backend. | ||||||
This field is optional `Extended` support. Typically used in conjuction with retry configuration, | ||||||
if supported by an implementation. | ||||||
Note that retry configuration will be the subject of a separate GEP (GEP-1731). | ||||||
|
||||||
```mermaid | ||||||
sequenceDiagram | ||||||
participant C as Client | ||||||
participant P as Proxy | ||||||
participant U as Upstream | ||||||
C->>P: Connection Started | ||||||
note left of P: timeouts.request start time (min) | ||||||
C->>P: Starts sending Request | ||||||
C->>P: Finishes Headers | ||||||
C->>P: Finishes request | ||||||
note left of P: timeouts.request start time (max) | ||||||
P->>U: Connection Started | ||||||
note right of P: timeouts.backendRequest start time | ||||||
P->>U: Starts sending Request | ||||||
P->>U: Finishes request | ||||||
P->>U: Finishes Headers | ||||||
U->>P: Starts Response | ||||||
U->>P: Finishes Headers | ||||||
note right of P: timeouts.backendRequest end time | ||||||
note left of P: timeouts.request end time | ||||||
U->>P: Finishes Response | ||||||
note right of P: Repeat if retry | ||||||
P->>C: Starts Response | ||||||
P->>C: Finishes Headers | ||||||
P->>C: Finishes Response | ||||||
Note right of P: Repeat if connection sharing | ||||||
U->>C: Connection ended | ||||||
``` | ||||||
|
||||||
Both timeout fields are string duration values as specified by | ||||||
[Golang time.ParseDuration](https://pkg.go.dev/time#ParseDuration) and MUST be >= 1ms | ||||||
or 0 to disable (no timeout). | ||||||
|
||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you clarify if the client request covers the entire connection from client to backend, or just the downstream request from client to gateway? There is also at least one tunnel timeout (HAProxy) and perhaps some other timeouts that cover the entire connection, not just the downstream and upstream halves of the connection. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a diagram in the "Timeout values" section that shows the relationships, but you need to look at the rendered markdown to see it. https://github.com/kubernetes-sigs/gateway-api/blob/7e20c3d25368b2c91d0952dfbd842f46e3324b36/geps/gep-1742.md#timeout-values There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I did see the diagram, yet still wanted to get the explanation in words and in the godoc of the implementation section for I don't see any timeout that would cover the entire transaction, unless |
||||||
### GO | ||||||
|
||||||
```go | ||||||
type HTTPRouteRule struct { | ||||||
// Timeouts defines the timeouts that can be configured for an HTTP request. | ||||||
// | ||||||
// Support: Extended | ||||||
// | ||||||
// +optional | ||||||
// <gateway:experimental> | ||||||
Timeouts *HTTPRouteTimeouts `json:"timeouts,omitempty"` | ||||||
|
||||||
// ... | ||||||
} | ||||||
|
||||||
// HTTPRouteTimeouts defines timeouts that can be configured for an HTTPRoute. | ||||||
// Timeout values are formatted like 1h/1m/1s/1ms as parsed by Golang time.ParseDuration | ||||||
// and MUST BE >= 1ms or 0 to disable (no timeout). | ||||||
type HTTPRouteTimeouts struct { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would also add a name and/or type to the HTTPRouteTimeouts in case there are multiple needed for any given route. For example, maybe an idle timeout would be a different duration than a connect timeout. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure I understand what you're suggesting here. If we wanted (in the future) to add another timeout (e.g., idle timeout), wouldn't we just add another field to
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, ok, so maybe you need to clarify in Goal and Non-Goals which timeout values you're addressing. If you call it "request", that doesn't mean much. What transactions are covered in "request"? |
||||||
// Request specifies the duration for processing an HTTP client request after which the | ||||||
// gateway will time out if unable to send a response. | ||||||
// | ||||||
// For example, setting the `rules.timeouts.request` field to the value `10s` in an | ||||||
// `HTTPRoute` will cause a timeout if a client request is taking longer than 10 seconds | ||||||
// to complete. | ||||||
// | ||||||
// This timeout is intended to cover as close to the whole request-response transaction | ||||||
// as possible although an implementation MAY choose to start the timeout after the entire | ||||||
// request stream has been received instead of immediately after the transaction is | ||||||
// initiated by the client. | ||||||
// | ||||||
// When this field is unspecified, request timeout behavior is implementation-dependent. | ||||||
// | ||||||
// Support: Extended | ||||||
// | ||||||
// +optional | ||||||
// +kubebuilder:validation:Format=duration | ||||||
Request *metav1.Duration `json:"request,omitempty"` | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tend to prefer keeping it short-and-sweet (Request) but will switch if others also think ClientRequest would be better. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I generally prefer greater specificity too, but this timeout as designed is broadly similar to what's called a "request timeout" by a few of the proxy implementations at least, so I think in this case, the more general term is okay. |
||||||
|
||||||
// BackendRequest specifies a timeout for an individual request from the gateway | ||||||
// to a backend service. This covers the time from when the request first starts being | ||||||
// sent from the gateway to when the full response has been received from the backend. | ||||||
// | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can see the definition in the diagram below, but this struct needs to have a full definition of what BackendRequest means. That is, there should be something like this here: The BackendRequest timeout covers the time from when the first request starts being sent from the Gateway to when the full response has been received from the backend. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||||||
// An entire client HTTP transaction with a gateway, covered by the Request timeout, | ||||||
// may result in more than one call from the gateway to the destination backend service, | ||||||
// for example, if automatic retries are supported. | ||||||
// | ||||||
// Because the Request timeout encompasses the BackendRequest timeout, | ||||||
// the value of BackendRequest defaults to and must be <= the value of Request timeout. | ||||||
// | ||||||
// Support: Extended | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What happens if you have 2 routes with same backend, and define 2 different backend timeouts ? Will it mean that the backend request will have different timeouts depending on path ? Are all implementations capable of doing this ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, the backend request timeout is client request specific. Envoy supports this (perTryTimeout) but all implementations probably not, so that's why it's Extended (instead of Core) support. |
||||||
// | ||||||
// +optional | ||||||
// +kubebuilder:validation:Format=duration | ||||||
BackendRequest *metav1.Duration `json:"backendRequest,omitempty"` | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should document it should be LET Request timeout? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @hzxuzhonghu I don't quite understand what you are suggesting. Can you clarify please? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Oh, did you mean to say LESS? If so, good idea. Done. |
||||||
} | ||||||
``` | ||||||
|
||||||
### YAML | ||||||
|
||||||
```yaml | ||||||
apiVersion: gateway.networking.k8s.io/v1beta1 | ||||||
kind: HTTPRoute | ||||||
metadata: | ||||||
name: timeout-example | ||||||
spec: | ||||||
... | ||||||
rules: | ||||||
- backendRefs: | ||||||
- name: some-service | ||||||
port: 8080 | ||||||
timeouts: | ||||||
request: 10s | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed to say format is as time.ParseDuration There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. thats probably my fault, I added a link to the POC PR mentioning ISO 8601 but didnt read much further into the actual format apimachinery metav1.Duration is pretty light on documentation too, can maybe just rip the field documentation from go if we need to be specific: https://pkg.go.dev/maze.io/x/duration#ParseDuration |
||||||
backendRequest: 2s | ||||||
``` | ||||||
|
||||||
## Alternatives | ||||||
|
||||||
(List other design alternatives and why we did not go in that | ||||||
direction) | ||||||
Timeouts could be configured using policy attachments or in objects other than `HTTPRouteRule`. | ||||||
|
||||||
### Policy Attachment | ||||||
|
||||||
Instead of configuring timeouts directly on an API object, they could be configured using policy | ||||||
attachments. The advantage to this approach would be that timeout policies can be not only | ||||||
configured for an `HTTPRouteRule`, but can also be added/overriden at a more fine | ||||||
(e.g., `HTTPBackendRef`) or course (e.g. `HTTPRoute`) level of granularity. | ||||||
|
||||||
The downside, however, is complexity introduced for the most common use case, adding a simple | ||||||
timeout for an HTTP request. Setting a single field in the route rule, instead of needing to | ||||||
create a policy resource, for this simple case seems much better. | ||||||
|
||||||
In the future, we could consider using policy attachments to configure less common kinds of | ||||||
timeouts that may be needed, but it would probably be better to instead extend the proposed API | ||||||
to support those timeouts as well. | ||||||
|
||||||
The default values of the proposed timeout fields could also be overriden | ||||||
using policy attachments in the future. For example, a policy attachment could be used to set the | ||||||
default value of `rules.timeouts.request` for all routes under an `HTTPRoute` or `Gateway`. | ||||||
|
||||||
### Other API Objects | ||||||
|
||||||
The new timeouts field could be added to a different API struct, instead of `HTTPRouteRule`. | ||||||
|
||||||
Putting it on an `HTTPBackendRef`, for example, would allow users to set different timeouts for different | ||||||
backends. This is a feature that we believe has not been requested by existing proxy or service mesh | ||||||
clients and is also not implementable using available timeouts of most proxies. | ||||||
|
||||||
Another alternative is to move the timeouts configuration up a level in the API to `HTTPRoute`. This | ||||||
would be convenient when a user wants the same timeout on all rules, but would be overly restrictive. | ||||||
Using policy attachments to override the default timeout value for all rules, as described in the | ||||||
previous section, is likely a better way to handle timeout configuration above the route rule level. | ||||||
|
||||||
## References | ||||||
|
||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -79,12 +79,13 @@ nav: | |
- Declined: | ||
- geps/gep-735.md | ||
- Provisional: | ||
- geps/gep-1742.md | ||
- geps/gep-1426.md | ||
- geps/gep-1324.md | ||
- geps/gep-1282.md | ||
- Prototyping: | ||
- geps/gep-1709.md | ||
- Implementable: | ||
- geps/gep-1742.md | ||
- geps/gep-1426.md | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nice catch that we forgot to move this one as well previously, thanks. |
||
- Experimental: | ||
- geps/gep-1748.md | ||
- geps/gep-1323.md | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I first read this I wondered why it only focused on HTTPRoute, but later I realized the original GEP specified that focus. It would be a good idea to change the title to "HTTPRoute Timeouts API", or else mention in Non-Goals that other route types are not covered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Title changed