Update caching docs #1818
Merged
Changes from 3 commits
9 commits
- e422f37 Expand caching page to address performance in general. (mandiwise)
- 5d193eb Add redirect for old /learn/caching/ route. (mandiwise)
- bb8fa6f Disambiguate categories of persisted queries. (mandiwise)
- 7533488 Provide additional comment on N+1 solutions. (mandiwise)
- 6f7fa7b Split content onto separate caching and performance pages. (mandiwise)
- 28a67c6 Apply suggestions from code review (mandiwise)
- 955aab0 Fix HTTP spec URL. (mandiwise)
- 19cedbd Modify N+1 example and explanation. (mandiwise)
- ddaf834 Merge branch 'source' into update-caching-docs (benjie)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# Performance | ||
|
||
<p className="learn-subtitle">Optimize the execution and delivery of GraphQL responses</p> | ||
|
||
At first glance, GraphQL requests may seem challenging to cache given that the API is served through a single endpoint and you may not know in advance what fields a client will include in an operation. | ||
|
||
In practice, however, GraphQL is as cacheable as any API that enables parameterized requests, such as a REST API that allows clients to specify different query parameters for a particular endpoint. There are many ways to optimize GraphQL requests on the client and server sides, as well as in the transport layer, and different GraphQL server and client libraries will often have common caching features built directly into them. | ||
|
||
On this page, we'll explore several different tactics that can be leveraged in GraphQL clients and servers to optimize how data is fetched from the API. | ||
|
||
## Globally unique IDs | ||
|
||
In an endpoint-based API, clients can use [HTTP caching](https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching) to avoid refetching resources and to identify when two resources are the same. The URL in these APIs is a _globally unique identifier_ that the client can leverage to build a cache. | ||
|
||
In GraphQL, there's no URL-like primitive that provides this globally unique identifier for a given object. Hence, it's a best practice for the API to expose such an identifier for clients to use as a prerequisite for certain types of caching. | ||
|
||
### Standardize how objects are identified in a schema | ||
|
||
One possible pattern for this is reserving a field, like `id`, to be a globally unique identifier. The example schema used throughout these docs uses this approach: | ||
|
||
```graphql | ||
# { "graphiql": true } | ||
query { | ||
starship(id: "3003") { | ||
id | ||
name | ||
} | ||
droid(id: "2001") { | ||
id | ||
name | ||
friends { | ||
id | ||
name | ||
} | ||
} | ||
} | ||
``` | ||
|
||
This is a powerful tool for client developers. In the same way that the URLs of a resource-based API provide a globally unique key, the `id` field in this system provides a globally unique key. | ||
|
||
If the backend uses something like UUIDs for identifiers, then exposing this globally unique ID may be very straightforward! If the backend doesn't have a globally unique ID for every object already, the GraphQL layer might have to construct one. Oftentimes, that's as simple as appending the name of the type to the ID and using that as the identifier. The server might then make that ID opaque by base64-encoding it. | ||
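As a minimal sketch of the construction described above, a server could combine the type name with the type-specific ID and base64-encode the result. The `Type:id` layout and the helper names `to_global_id` and `from_global_id` are assumptions made for illustration, not part of any particular GraphQL library:

```python
import base64


def to_global_id(type_name: str, local_id: str) -> str:
    """Combine a type name and a type-specific ID into one opaque string."""
    raw = f"{type_name}:{local_id}"
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def from_global_id(global_id: str) -> tuple[str, str]:
    """Recover the (type name, local ID) pair from an opaque global ID."""
    raw = base64.urlsafe_b64decode(global_id.encode("ascii")).decode("utf-8")
    type_name, _, local_id = raw.partition(":")
    return type_name, local_id


droid_id = to_global_id("Droid", "2001")
print(droid_id)
print(from_global_id(droid_id))
```

Because the type name is part of the encoded value, two objects of different types that share the same type-specific ID still receive different global IDs.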
|
||
Optionally, this ID can then be used to work with the `node` pattern when using [global object identification](/learn/global-object-identification). | ||
|
||
### Compatibility with existing APIs | ||
|
||
One concern with using the `id` field for this purpose is how a client using the GraphQL API would work with existing APIs. For example, if our existing API accepted a type-specific ID, but our GraphQL API uses globally unique IDs, then using both at once can be tricky. | ||
|
||
In these cases, the GraphQL API can expose the previous API's IDs in a separate field. This gives us the best of both worlds: | ||
|
||
- GraphQL clients can continue to rely on a consistent mechanism to get a globally unique ID. | ||
- Clients that need to work with our previous API can also fetch `previousApiId` from the object, and use that. | ||
|
||
### Alternatives | ||
|
||
While globally unique IDs have proven to be a powerful pattern in the past, they are not the only pattern that can be used, nor are they right for every situation. The critical functionality that the client needs is the ability to derive a globally unique identifier for their caching. While having the server derive that ID simplifies the client, the client can also derive the identifier. This could be as simple as combining the type of the object (queried with `__typename`) with some type-unique identifier. | ||
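The client-side derivation mentioned above can be sketched as follows. This assumes every cached object was queried with both `__typename` and `id`; the `NormalizedCache` class and the `Type:id` key layout are hypothetical and only illustrate the idea:

```python
def cache_key(obj: dict) -> str:
    """Derive a cache key from an object's __typename and its type-unique id."""
    return f"{obj['__typename']}:{obj['id']}"


class NormalizedCache:
    """Stores each object once, merging fields seen in different responses."""

    def __init__(self) -> None:
        self._records: dict[str, dict] = {}

    def write(self, obj: dict) -> str:
        key = cache_key(obj)
        # Merge new fields into any record already stored under this key.
        self._records.setdefault(key, {}).update(obj)
        return key

    def read(self, typename: str, object_id: str):
        return self._records.get(f"{typename}:{object_id}")


cache = NormalizedCache()
cache.write({"__typename": "Droid", "id": "2001", "name": "R2-D2"})
cache.write({"__typename": "Droid", "id": "2001", "primaryFunction": "Astromech"})
print(cache.read("Droid", "2001"))
```

Two responses that mention the same object end up updating a single record, which is what lets a client recognize that two results refer to the same thing.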
|
||
Additionally, if replacing an existing API with a GraphQL API, then it may be confusing if all of the fields in GraphQL are the same **except** `id`, which changed to be globally unique. This would be another reason why one might choose not to use `id` as the globally unique field. | ||
|
||
## `GET` requests for queries | ||
|
||
GraphQL implementations that adhere to the [GraphQL over HTTP specification](https://github.com/graphql/graphql-over-http/blob/main/spec/GraphQLOverHTTP.md) will support the `POST` HTTP method by default, but may also support `GET` requests for query operations. | ||
|
||
Using `GET` can improve query performance because responses to requests made with this HTTP method are typically considered cacheable by default. This facilitates HTTP caching or the use of a content delivery network (CDN) when the server response includes caching-related headers. | ||
|
||
However, because browsers and CDNs impose size limits on URLs, it may not be possible to send a large document for a complex operation in the query string of the URL. Using _persisted queries_, either in the form of _trusted documents_ or _automatic persisted queries_, allows the client to send a hash of the query instead. The server then uses that hash to retrieve the full document from a server-side store before validating and executing the operation. | ||
|
||
Sending hashed queries instead of their plaintext versions has the additional benefit of reducing the amount of data sent by the client in the network request. | ||
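The hash-and-lookup flow can be sketched roughly as below. This is an illustration under stated assumptions rather than the protocol of any specific server: the in-memory `PERSISTED_DOCUMENTS` store, the helper names, the `documentId` query parameter, and the `example.com` endpoint are all hypothetical.

```python
import hashlib
from urllib.parse import urlencode

# Server-side store mapping a document hash to the full query document.
PERSISTED_DOCUMENTS: dict[str, str] = {}


def document_hash(document: str) -> str:
    """Return a fixed-length SHA-256 hex digest of the query document."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


def register_document(document: str) -> str:
    """Store a document under its hash, for example at build time."""
    digest = document_hash(document)
    PERSISTED_DOCUMENTS[digest] = document
    return digest


def build_get_url(endpoint: str, digest: str) -> str:
    # "documentId" is a hypothetical parameter name chosen for this sketch.
    return f"{endpoint}?{urlencode({'documentId': digest})}"


def lookup_document(digest: str) -> str:
    """Resolve a hash back to the full document before validation."""
    try:
        return PERSISTED_DOCUMENTS[digest]
    except KeyError:
        raise LookupError(f"unknown document hash: {digest}") from None


QUERY = "query HeroName { hero { name } }"
digest = register_document(QUERY)
url = build_get_url("https://example.com/graphql", digest)
print(url)
print(lookup_document(digest))
```

The digest has the same length no matter how large the document is, so the URL stays within size limits even for complex operations.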
|
||
## The N+1 Problem | ||
|
||
GraphQL is designed in a way that allows you to write clean code on the server, where every field on every type has a focused single-purpose function for resolving that value. However, without additional consideration, a naive GraphQL service could be very "chatty" or repeatedly load data from your databases. | ||
|
||
Consider the following query—to fetch a hero along with a list of their friends, we can imagine that as the field resolvers execute there will be one request to the underlying data source to get the character object, and then three subsequent requests to load the friends' character objects: | ||
|
||
```graphql | ||
# { "graphiql": true } | ||
query HeroWithFriends { | ||
# 1 request for the hero | ||
hero { | ||
name | ||
# 3 more requests, one for each friend | ||
friends { | ||
name | ||
} | ||
} | ||
} | ||
``` | ||
|
||
This is known as the N+1 problem, where the first request to an underlying data source leads to N subsequent requests to resolve the data for all of the requested fields. | ||
|
||
|
||
This is commonly solved by a batching technique, where multiple requests for data from a backend are collected over a short period and then dispatched in a single request to an underlying database or microservice by using a tool like Facebook's [DataLoader](https://github.com/facebook/dataloader). | ||
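To illustrate the batching idea, here is a minimal sketch written with Python's `asyncio`. It is not DataLoader's implementation: the `BatchLoader` class, the `fetch_characters` function, and the in-memory `CHARACTERS` data are assumptions made for this example. Keys requested during the same event-loop tick are collected and sent to the backend in one call:

```python
import asyncio

# Illustrative in-memory "database"; the records are assumptions for this sketch.
CHARACTERS = {
    "2001": {"name": "R2-D2", "friend_ids": ["1000", "1002", "1003"]},
    "1000": {"name": "Luke Skywalker", "friend_ids": []},
    "1002": {"name": "Han Solo", "friend_ids": []},
    "1003": {"name": "Leia Organa", "friend_ids": []},
}
batch_calls = []  # records the keys sent in each backend request


async def fetch_characters(ids):
    """Stand-in for one backend request that returns many rows at once."""
    batch_calls.append(list(ids))
    return [CHARACTERS[i] for i in ids]


class BatchLoader:
    """Collects keys requested in the same event-loop tick into one batch."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn
        self._pending = {}  # key -> Future that will receive that key's value
        self._task = None

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._task is None:
            # The task starts on the next loop iteration, after the current
            # synchronous run of load() calls has finished queuing keys.
            self._task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self):
        pending, self._pending, self._task = self._pending, {}, None
        keys = list(pending)
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            pending[key].set_result(value)


async def hero_with_friends():
    loader = BatchLoader(fetch_characters)
    hero = await loader.load("2001")
    friends = await asyncio.gather(*(loader.load(i) for i in hero["friend_ids"]))
    return hero["name"], [friend["name"] for friend in friends]


print(asyncio.run(hero_with_friends()))
print(batch_calls)
```

In this sketch the three friend lookups are dispatched as a single backend call, so the query costs two requests instead of four. A production loader would typically also cache results per request, which this example leaves out.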
|
||
## Demand control | ||
|
||
Depending on how a GraphQL schema has been designed, it may be possible for clients to request highly complex operations that place excessive load on the underlying data sources during execution. These kinds of operations may be sent inadvertently by a known client, but they may also be sent by malicious actors. | ||
|
||
Certain demand control mechanisms can help guard a GraphQL API against these operations, such as paginating list fields, limiting operation depth and breadth, and query complexity analysis. [You can read more about demand control on the Security page](/learn/security/#demand-control). | ||
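As a rough illustration of one of these mechanisms, the sketch below limits operation depth. It assumes the selection set has already been parsed into nested dictionaries (a leaf field maps to an empty dictionary). The `MAX_DEPTH` value and the function names are hypothetical, and a real server would walk the parsed document provided by its GraphQL library instead:

```python
MAX_DEPTH = 3  # hypothetical limit chosen for this sketch


def selection_depth(selection_set: dict) -> int:
    """Return how many levels of fields are nested in a selection set."""
    if not selection_set:
        return 0  # a leaf field has no sub-selections
    return 1 + max(selection_depth(child) for child in selection_set.values())


def enforce_depth_limit(selection_set: dict, max_depth: int = MAX_DEPTH) -> int:
    """Reject an operation before execution if it nests too deeply."""
    depth = selection_depth(selection_set)
    if depth > max_depth:
        raise ValueError(f"operation depth {depth} exceeds limit {max_depth}")
    return depth


# hero { name friends { name } }
shallow = {"hero": {"name": {}, "friends": {"name": {}}}}
# hero { friends { friends { friends { name } } } }
deep = {"hero": {"friends": {"friends": {"friends": {"name": {}}}}}}

print(enforce_depth_limit(shallow))
try:
    enforce_depth_limit(deep)
except ValueError as error:
    print(error)
```

Running the check before execution means a deeply nested operation is rejected without touching the underlying data sources.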
|
||
## JSON (with GZIP) | ||
|
||
GraphQL services typically respond using JSON even though the GraphQL spec [does not require it](http://spec.graphql.org/draft/#sec-Serialization-Format). JSON may seem like an odd choice for an API layer promising better network performance. However, because it is mostly text, it compresses exceptionally well with GZIP. | ||
|
||
|
||
It's recommended that production GraphQL services enable GZIP and encourage their clients to send the following header: | ||
|
||
```text | ||
Accept-Encoding: gzip | ||
``` | ||
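To get a feel for the effect, the following sketch compresses a JSON payload shaped like a GraphQL response using Python's standard `gzip` module. The payload is made up for illustration, and actual compression ratios depend on the data:

```python
import gzip
import json

# A response shaped like a GraphQL result with many similar objects.
response = {
    "data": {
        "hero": {
            "name": "R2-D2",
            "friends": [
                {"id": str(1000 + i), "name": f"Friend {i}"} for i in range(200)
            ],
        }
    }
}

raw = json.dumps(response).encode("utf-8")
compressed = gzip.compress(raw)

# Compare the payload size before and after compression.
print(len(raw), len(compressed))

# A client reverses the process when the response is gzip-encoded.
restored = json.loads(gzip.decompress(compressed))
print(restored == response)
```

Repeated field names such as `id` and `name` are a large part of why JSON-formatted GraphQL responses tend to compress well.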
|
||
JSON is also very familiar to client and API developers, and is easy to read and debug. In fact, the GraphQL syntax is partly inspired by JSON syntax. | ||
|
||
## Performance monitoring | ||
|
||
Monitoring a GraphQL API over time can provide insight into how certain operations impact API performance and help you determine what adjustments to make to maintain its health. For example, you may find that certain fields take a long time to resolve due to under-optimized requests to a backing data source, or you may find that other fields routinely raise errors during execution. | ||
|
||
Observability tooling can provide insight into where bottlenecks exist in the execution of certain GraphQL operations by allowing you to instrument the API service to collect metrics, traces, and logs during requests. For example, [OpenTelemetry](https://opentelemetry.io/) provides a suite of vendor-agnostic tools that can be used in many different languages to support instrumentation of GraphQL APIs. | ||
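As a simplified illustration of resolver-level instrumentation, the sketch below records how long each resolver call takes using only Python's standard library. It does not use the OpenTelemetry API. The `timed_resolver` decorator, the `resolver_timings` store, and the `resolve_hero` function are hypothetical names for this example:

```python
import time
from collections import defaultdict
from functools import wraps

# Field name -> list of resolver durations in seconds.
resolver_timings = defaultdict(list)


def timed_resolver(field_name):
    """Record how long each call to the wrapped resolver takes."""

    def decorator(resolver):
        @wraps(resolver)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return resolver(*args, **kwargs)
            finally:
                # Record the duration even if the resolver raises an error.
                elapsed = time.perf_counter() - start
                resolver_timings[field_name].append(elapsed)

        return wrapper

    return decorator


@timed_resolver("Query.hero")
def resolve_hero(parent, info):
    time.sleep(0.01)  # stand-in for a slow data source call
    return {"name": "R2-D2"}


resolve_hero(None, None)
slowest = max(resolver_timings, key=lambda name: max(resolver_timings[name]))
print(slowest, f"{max(resolver_timings[slowest]) * 1000:.1f} ms")
```

Collecting durations per field makes it easier to spot which resolvers are responsible for slow operations. A dedicated observability tool would add traces and error metrics on top of this.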
|
||
|
||
## Recap | ||
|
||
To recap these recommendations for improving GraphQL API performance: | ||
|
||
- Defining a globally unique ID field for an Object type can facilitate various types of caching | ||
- Using `GET` for GraphQL query operations can support HTTP caching and CDN usage, particularly when used in conjunction with hashed query documents | ||
- Because an operation's selection set can express relationships between different kinds of objects, the N+1 problem can be mitigated during field execution by batching and caching requests to underlying data sources | ||
- Field pagination, limiting operation depth and breadth, and rate-limiting API requests can help prevent individual GraphQL operations from placing excessive load on server resources | ||
- GZIP can be used to compress the size of GraphQL JSON-formatted responses when servers and clients support it | ||
- The overall health of a GraphQL API can be maintained over time by using performance monitoring tools like OpenTelemetry to collect metrics, logs, and traces related to request execution |