net/http: enhanced ServeMux routing #61410

Closed
@jba

10 October 2023: Updated to clarify escaping: both paths and patterns are unescaped segment by segment, not as a whole. We found during implementation that this gives the behavior we would expect.


7 August 2023: updated with two changes:

  • added Request.SetPathValue
  • GET also matches HEAD

We propose to expand the standard HTTP mux's capabilities by adding two features: distinguishing requests based on HTTP method (GET, POST, ...) and supporting wildcards in the matched paths.

See the top post of this discussion for background and motivation.

Proposed Changes

Methods

A pattern can start with an optional method followed by a space, as in GET /codesearch or GET codesearch.google.com/. A pattern with a method is used only to match requests with that method, with one exception: the method GET also matches HEAD. It is possible to have the same path pattern registered with different methods:

GET /foo
POST /foo

Wildcards

A pattern can include wildcard path elements of the form {name} or {name...}. For example, /b/{bucket}/o/{objectname...}. The name must be a valid Go identifier; that is, it must fully match the regular expression [_\pL][_\pL\p{Nd}]*.
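
To make the naming rule concrete, here is a small sketch that checks candidate wildcard names against the regular expression quoted above. The function validWildcardName is hypothetical, and this is not necessarily how net/http validates names internally.

```go
package main

import (
	"fmt"
	"regexp"
)

// wildcardName is the expression from the text, anchored at both ends so that
// the whole name must match.
var wildcardName = regexp.MustCompile(`^[_\pL][_\pL\p{Nd}]*$`)

// validWildcardName reports whether name could be used inside {name}.
// (Hypothetical helper for illustration only.)
func validWildcardName(name string) bool {
	return wildcardName.MatchString(name)
}

func main() {
	for _, name := range []string{"bucket", "_id2", "2fast", "a-b", ""} {
		fmt.Printf("%q valid: %t\n", name, validWildcardName(name))
	}
}
```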

These wildcards must be full path elements, meaning they must be preceded by a slash and followed by either a slash or the end of the string. For example, /b_{bucket} is not a valid pattern. Cases like these can be resolved by additional logic in the handler itself. Here, one can write /{bucketlink} and parse the actual bucket name from the value of bucketlink. Alternatively, using other routers will continue to be a good choice.

Normally a wildcard matches only a single path element, ending at the next literal slash (not %2F) in the request URL. If the ... is present, then the wildcard matches the remainder of the URL path, including slashes. (Therefore it is invalid for a ... wildcard to appear anywhere but at the end of a pattern.) Although wildcard matches occur against the escaped path, wildcard values are unescaped. For example, if a wildcard matches a%2Fb, its value is a/b.

There is one last, special wildcard: {$} matches only the end of the URL, allowing writing a pattern that ends in slash but does not match all extensions of that path. For example, the pattern /{$} matches the root page / but (unlike the pattern / today) does not match a request for /anythingelse.

Precedence

There is a single precedence rule: if two patterns overlap (have some requests in common), then the more specific pattern takes precedence. A pattern P1 is more specific than P2 if P1 matches a (strict) subset of P2’s requests; that is, if P2 matches all the requests of P1 and more. If neither is more specific, then the patterns conflict.

There is one exception to this rule, for backwards compatibility: if two patterns would otherwise conflict and one has a host while the other does not, then the pattern with the host takes precedence.

These Venn diagrams illustrate the relationships between two patterns P1 and P2 in terms of the requests they match:

[Figure: relationships between two patterns]

Here are some examples where one pattern is more specific than another:

  • example.com/ is more specific than / because the first matches only requests with host example.com, while the second matches any request.
  • GET / is more specific than / because the first matches only GET and HEAD requests while the second matches any request.
  • HEAD / is more specific than GET / because the first matches only HEAD requests while the second matches both GET and HEAD requests.
  • /b/{bucket}/o/default is more specific than /b/{bucket}/o/{noun} because the first matches only paths whose fourth element is the literal “default”, while in the second, the fourth element can be anything.

In contrast to the last example, the patterns /b/{bucket}/{verb}/default and /b/{bucket}/o/{noun} conflict with each other:

  • They overlap because both match the path /b/k/o/default.
  • The first is not more specific because it matches the path /b/k/a/default while the second doesn’t.
  • The second is not more specific because it matches the path /b/k/o/n while the first doesn’t.

Using specificity for matching is easy to describe and preserves the order-independence of the original ServeMux patterns. But it can be hard to see at a glance which of two patterns is the more specific, or why two patterns conflict. For that reason, the panic messages that are generated when conflicting patterns are registered will demonstrate the conflict by providing example paths, as in the previous paragraph.

The reference implementation for this proposal includes a DescribeRelationship method that explains how two patterns are related. That method is not a part of the proposal, but can help in understanding it. You can use it in the playground.

More Examples

This section illustrates the precedence rule for a complete set of routing patterns.

Say the following patterns are registered:

/item/
POST /item/{user}
/item/{user}
/item/{user}/{id}
/item/{$}
POST alt.com/item/{user}

In the examples that follow, the host in the request is example.com and the method is GET unless otherwise specified.

  1. "/item/jba" matches "/item/{user}". The pattern "/item/" also matches, but "/item/{user}" is more specific.
  2. A POST to "/item/jba" matches "POST /item/{user}" because that pattern is more specific than "/item/{user}" due to its explicit method.
  3. A POST to "/item/jba/17" matches "/item/{user}/{id}". As in the first case, the only other candidate is the less specific "/item/".
  4. "/item/" matches "/item/{$}" because it is more specific than "/item/".
  5. "/item/jba/17/line2" matches "/item/". Patterns that end in a slash match entire subtrees, and no other more specific pattern matches.
  6. A POST request with host "alt.com" and path "/item/jba" matches "POST alt.com/item/{user}". That pattern is more specific than "POST /item/{user}" because it has a host.
  7. A GET request with host "alt.com" and path "/item/jba" matches "/item/{user}". Although the pattern with a host is more specific, in this case it doesn't match, because it specifies a different method.

API

To give handlers access to wildcard values, the net/http package adds two new methods to Request:

package http

func (*Request) PathValue(wildcardName string) string
func (*Request) SetPathValue(name, value string)

PathValue returns the part of the path associated with the wildcard in the matching pattern, or the empty string if there was no such wildcard in the matching pattern. (Note that a successful match can also be empty, for a "..." wildcard.)

SetPathValue sets the value of name to value, so that subsequent calls to PathValue(name) will return value.

Response Codes

If no pattern matches a request, ServeMux typically serves a 404 (Not Found). But if there is a pattern that matches with a different method, then it serves a 405 (Method Not Allowed) instead. This is not a breaking change, since patterns with methods did not previously exist.

Backwards Compatibility

As part of this proposal, we would change the way that ServeMux matches paths to use the escaped path (fixing #21955). That means that an escaped slash or brace in an incoming URL would remain escaped during matching and so would not affect it. We will provide the GODEBUG setting httpmuxgo121=1 to enable the old behavior.

More precisely: both patterns and paths are unescaped segment by segment. For example, "/%2F/%61", whether it is a pattern or an incoming path to be matched, is treated as having two segments containing "/" and "a". This is a breaking change for both patterns, which were not unescaped at all, and paths, which were unescaped in their entirety.

Performance

There are two situations where questions of performance arise: matching requests, and detecting conflicts during registration.

The reference implementation for this proposal matches requests about as fast as the current ServeMux on Julien Schmidt’s static benchmark. Faster routers exist, but there doesn't seem to be a good reason to try to match their speed. Evidence that routing time is not important comes from gorilla/mux, which is still quite popular despite being unmaintained until recently, and about 30 times slower than the standard ServeMux.

Using the specificity precedence rule, detecting conflicts when a pattern is registered seems to require checking all previously registered patterns in general. This makes registering a set of patterns quadratic in the worst case. Indexing the patterns as they are registered can significantly speed up the common case. See this comment for details. We would like to collect examples of large pattern sets (in the thousands of patterns) so we can make sure our indexing scheme works well on them.
