Webhooks for Operations in MCP #523

pantanurag555 · 2025-05-13T23:07:43Z

pantanurag555
May 13, 2025

Pre-submission Checklist

I have verified this would not be more appropriate as a feature request in a specific repository
I have searched existing discussions to avoid duplicates

Your Idea

Motivation and Context

As MCP expands to support asynchronous communication (#491), there will be scenarios in which an async execution takes anywhere from a few hours to a few days (or even longer in some cases). The current discussion revolves around modeling async tool executions as an asynchronous resource that can be tracked for progress updates and read from the server once the async execution has completed. The server initially responds with Accepted (HTTP status 202), and the client can then disconnect from the server (instead of keeping an SSE connection open for an extended period of time). The onus then lies with the client to query the server and fetch the response.

To free up the client and reduce load when handling multiple async operations, the recommended approach for such async scenarios is often to offer support for webhooks (similar discussion: #266). Webhooks offer a lightweight approach that allows the server to send results to the client when the async execution of a tool has completed. Webhooks can be hosted by the client or can point to a server that notifies the client and/or stores the response.

A webhook can act as a mechanism that notifies the client once the async execution/resource has completed, before the client proceeds to retrieve it.
The pattern of async execution + webhooks can also prove useful in scenarios where it makes sense to store the response before the client interacts with it.
Passing webhooks for async operations to different servers that offer webhook support would allow the client to disconnect after the 202 response and just track responses sent to the webhook, instead of querying each server for its async responses.

For example, a user interacts with a client and tasks it with performing time series forecasting on stock market prices for select securities. The client would interact with one server to fetch the stock price history and another server to perform time series forecasting using ML algorithms. The time series forecasting tool on the server would be modeled as asynchronous. The final response could be a large dataset, since the forecasting granularity might be in minutes over a period of a few months. In such a scenario, it would make sense to provide the server with a webhook that points to a separate datastore, so the response is stored before the user asks the client to interact with the results.

Proposed Solution

Supporting webhooks should be a server-level capability even though this capability would only be supported for the asynchronous tools within the server. The server’s capability to support webhooks should be communicated to the client during initialization when capability negotiation takes place.

class ToolsCapability(BaseModel):
    """Capability for tools operations."""

    listChanged: bool | None = None
    """Whether this server supports notifications for changes to the tool list."""
    webhooksSupported: bool | None = None
    """Capability for transmitting tool responses to webhooks."""
    model_config = ConfigDict(extra="allow")
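
On the client side, checking for this flag after initialization could look like the minimal sketch below. It assumes the initialize result is available as a plain dict that mirrors the JSON payload, and that `webhooksSupported` is the field proposed above (it is not part of the current spec). `server_supports_webhooks` is a hypothetical helper name.

```python
from typing import Any


def server_supports_webhooks(initialize_result: dict[str, Any]) -> bool:
    """Return True only if the server explicitly advertised webhook support."""
    capabilities = initialize_result.get("capabilities") or {}
    tools = capabilities.get("tools") or {}
    # A missing or null flag is treated as "not supported".
    return tools.get("webhooksSupported") is True
```

A client would only attach webhooks to a tool call when this returns True.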

If the client wishes to make use of webhook(s) when interacting with a tool, it needs to add the webhooks to the call tool request. The client can base this decision on the tool annotations, which can provide details about whether the tool is async, supports streaming, etc. (#489).

The call tool request should be able to accept multiple webhooks from the client. Each webhook should provide details about its authentication mechanism (if any) and the credentials it needs. The auth mechanisms listed below mostly involve passing credential values in some form of header.

The current solution skips the JWT/JWKS strategy. For one, the server would need to expose a JWKS endpoint that publishes the public key. Currently OAuthClientMetadata has a field for jwks but calls out that these are not supported/are meant for future use (in the python sdk). Once support for jwks is added to OAuthClientMetadata, the bearer auth strategy can be extended to support signing with a private key and exposing the public key at the jwks uri.

# AuthenticationInfo and Webhook are defined first so that the
# annotations that refer to them resolve at class-definition time.
class AuthenticationInfo(BaseModel):
    """Used to specify authentication mechanism"""

    strategy: Literal["bearer", "apiKey", "basic", "customHeader"]
    """Authentication strategy that the server will follow"""
    credentials: str | None = None
    """
    Static credentials in the case of bearer, apiKey or basic.
    In case of basic and customHeader, this can also be a parsable JSON.
    """

class Webhook(BaseModel):
    """Used to specify a webhook and authentication method to communicate with it"""

    url: str
    """Url to which the response will be transmitted"""
    authentication: AuthenticationInfo | None = None
    """Authentication required to communicate with the webhook"""

class CallToolRequest(Request[CallToolRequestParams, Literal["tools/call"]]):
    """Used by the client to invoke a tool provided by the server."""

    method: Literal["tools/call"]
    params: CallToolRequestParams
    webhooks: list[Webhook] | None = None

If the server detects webhooks provided as part of a CallToolRequest and the tool is asynchronous, it will return an HTTP status of 202 to indicate that the request has been accepted and is being executed asynchronously. It will also return a unique identifier for the async execution/resource, since the server might be performing multiple async executions for the client in the background. The client should be able to recognize which response corresponds to which request when the result is transmitted to the webhook.

async def call_tool(
    self,
    name: str,
    arguments: dict[str, Any],
    webhooks: list[Webhook] | None = None,
    context: Context[ServerSessionT, LifespanContextT] | None = None,
):
    tool = self.get_tool(name)
    if not tool:
        raise ToolError(f"Unknown tool: {name}")

    if tool.async_supported:
        # Async tool execution is modeled as an async resource
        resource = self._resource_manager.create_async_resource(
            name=tool.name,
            description=tool.description,
            operation_type="tool_execution",
        )
        if webhooks:
            # Start a background task to process the tool call and send results to webhooks
            task = asyncio.create_task(
                self._process_tool_call_with_webhooks(
                    tool, resource, arguments, context, webhooks
                )
            )
            await resource.start(task)
            # Return 202 Accepted immediately
            return {
                "type": "resource",
                "uri": str(resource.uri),
                "status": resource.status.value,
            }
        else:
            # Process asynchronously
            ...

    # Process synchronously
    ...


async def _process_tool_call_with_webhooks(
    self, 
    tool,
    resource,
    arguments, 
    context,
    webhooks,
):
    """
    Process a tool call asynchronously and send the results to the provided webhooks.
    """       
    try:
        result = await tool.run(arguments, context=context)
        await resource.complete(result)
    except Exception as e:
        # Mark the resource as failed and forward the error to the webhooks
        await resource.fail(str(e))
        result = ToolError(str(e))
    
    # Send results to each webhook
    for webhook in webhooks:
        await self._send_to_webhook(webhook, result)

      
async def _send_to_webhook(self, webhook, result):
    """
    Send tool call results to a webhook.
    
    This would:
    1. Format the payload according to the webhook's requirements
    2. Send an HTTP request to the webhook URL with appropriate headers
    3. Handle authentication if specified in the webhook
    4. Implement retry logic for failed requests
    5. Log success/failure of webhook delivery
    """
    pass
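
The retry step listed in the docstring could be sketched as below. This is only an illustration under assumptions: `send` stands in for whatever HTTP call the server would make and is expected to return True on success, and the exponential backoff schedule and parameter names are not from the proposal.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def deliver_with_retries(
    send: Callable[[dict[str, Any]], Awaitable[bool]],
    payload: dict[str, Any],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> bool:
    """Call `send` until it reports success or the attempts run out."""
    for attempt in range(max_attempts):
        try:
            if await send(payload):
                return True
        except Exception:
            # Treat transport errors like a failed delivery and retry.
            pass
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * (2 ** attempt))
    return False
```

Note that the payload has to stay in server memory between attempts, so a bounded `max_attempts` also bounds how long the server holds on to the result.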

The result transmitted to the webhook will be of type CallToolResult and will contain the resource uri to identify which async execution it pertains to.

The above solution is written with the idea that an async tool execution can be modeled as an async resource. There is some conversation on the discussion that indicates that async tool executions can instead be modeled as async+streaming. In that case, the server should stream the response to the webhooks. However, there would need to be some stream handling logic that is built-in at the webhook to be able to deal with the mergeable results.

Scope

jonathanhefner · 2025-05-14T01:24:21Z

jonathanhefner
May 14, 2025

The result transmitted to the webhook will be of type CallToolResult and will contain the resource uri to identify which async execution it pertains to.

The above solution is written with the idea that an async tool execution can be modeled as an async resource. There is some conversation on the discussion that indicates that async tool executions can instead be modeled as async+streaming. In that case, the server should stream the response to the webhooks. However, there would need to be some stream handling logic that is built-in at the webhook to be able to deal with the mergeable results.

Why transmit the results to the webhook? Why not store the results server side, and then call the webhook with just a reference (e.g. a task ID) that can be used to fetch the results?

5 replies

pantanurag555 May 14, 2025
Author

I can think of two benefits of transmitting the results instead of a reference:

The server can avoid storing responses and instead consider the async tool execution completed once the result is transmitted to the webhook. Even in case of modeling async tool execution as an async resource, there should be some expiry period that ensures that the resource gets cleaned up. Otherwise we are just burdening the server by storing responses to all async tool executions it performs.
It ensures that if the webhook is hosted/can be accessed by the client, the only interaction that needs to happen after the async tool execution is the communication between the client and the webhook. If the webhook only receives the reference, the client will need to further interact with the server to fetch the results. The use case where the webhook stores the responses in a data store also becomes a bit more tedious.

However, transmitting the reference does provide an extra layer of security, since the client needs to connect back to the server to fetch the results. The server can ensure that the client has the correct auth to access the results. I am not partial to either approach and would like the community to weigh in on what makes sense.

jonathanhefner May 15, 2025

The server can avoid storing responses and instead consider the async tool execution completed once the result is transmitted to the webhook. Even in case of modeling async tool execution as an async resource, there should be some expiry period that ensures that the resource gets cleaned up. Otherwise we are just burdening the server by storing responses to all async tool executions it performs.

Consider, though, what if a webhook is down or unreachable? The server would need to store the results until the webhook becomes reachable again or the results expire. In other words, there would still be a need for server-side storage infrastructure.

There's also the possibility that a webhook is slow. It could become a bottleneck for the server and cause performance degradation. (Possibly a DDoS vector?)

Storing results in infrastructure that the server host controls seems more robust.

I agree that there should be an expiry period — that should help limit storage costs. (You probably saw, but I talked about an "expiry interval" in my proposal.)

It ensures that if the webhook is hosted/can be accessed by the client, the only interaction that needs to happen after the async tool execution is the communication between the client and the webhook. If the webhook only receives the reference, the client will need to further interact with the server to fetch the results. The use case where the webhook stores the responses in a data store also becomes a bit more tedious.

I'm not sure I follow. Is the benefit that the client no longer needs to be able to reach the server?

If the server sends a request (e.g. a sampling request), then the client would still need to reach the server in order to respond. Of course, not every long-running task will send such a request, but there is still the possibility. So, for the sake of implementers, I think the spec should set the expectation that a client can reach a server while a task is ongoing.

pantanurag555 May 20, 2025
Author

Consider, though, what if a webhook is down or unreachable? The server would need to store the results until the webhook becomes reachable again or the results expire. In other words, there would still be a need for server-side storage infrastructure.

This is not an issue that should concern the server. The client should specify a webhook that is resilient to downtime. When we think in terms of the webhook propped up in a cloud architecture, it is less likely that the server runs into these issues when trying to relay the response. However, retries and timeouts should be configurable on the server end to avoid endless retrying.

Server-side storage should not be a fallback. Webhooks will act as a replacement for the client when receiving a response. If the client goes down during communication or is slow for some reason, the server should not need to store the response to retry at a later stage.

If the server sends a request (e.g. a sampling request), then the client would still need to reach the server in order to respond. Of course, not every long-running task will send such a request, but there is still the possibility. So, for the sake of implementers, I think the spec should set the expectation that a client can reach a server while a task is ongoing.

That could be a scenario that may arise and I agree that the client should have the ability to communicate with the server 'if required'. However unless the need arises to further communicate with the server due to a sampling request, the client will not have the need to interact further just to obtain the results for its previous request. The response would have been transmitted to the webhook and would be available for use as soon as the client communicates with it.

The above scenarios make much more sense in a cloud architecture setup. If the user makes use of a client that is linked with their cloud service (Amazon Q, Microsoft Copilot, etc.), the webhook would ideally be hosted in their personal account/enterprise account/service account. The user may shut down their local machines but the accounts along with the webhooks will always be up and running. Say the user opens their local machine and uses the local client. The local client connects to a server and uses it to trigger a long running task. The user then proceeds to shut down their local client/machine. The long running task will complete on the server and the results will be transmitted to the webhook. Now the user logs back in after some time. As soon as the local client reconnects to the upstream service (automatically happens on start up), it will have the response waiting for it. This takes away the need for the client to keep pinging the server from time to time about the task status, having to deal with streamed responses or merging the results.

jonathanhefner May 21, 2025

This is not an issue that should concern the server. The client should specify a webhook that is resilient to downtime. When we think in terms of the webhook propped up in a cloud architecture, it is less likely that the server runs into these issues when trying to relay the response. However, retries and timeouts should be configurable on the server end to avoid endless retrying.

Server-side storage should not be a fallback. Webhooks will act as a replacement for the client when receiving a response. If the client goes down during communication or is slow for some reason, the server should not need to store the response to retry at a later stage.

When a webhook is unreachable, what happens to the results data in between retries?

The above scenarios make much more sense in a cloud architecture setup. If the user makes use of a client that is linked with their cloud service (Amazon Q, Microsoft Copilot, etc.), the webhook would ideally be hosted in their personal account/enterprise account/service account. The user may shut down their local machines but the accounts along with the webhooks will always be up and running. Say the user opens their local machine and uses the local client. The local client connects to a server and uses it to trigger a long running task. The user then proceeds to shut down their local client/machine. The long running task will complete on the server and the results will be transmitted to the webhook. Now the user logs back in after some time. As soon as the local client reconnects to the upstream service (automatically happens on start up), it will have the response waiting for it. This takes away the need for the client to keep pinging the server from time to time about the task status, having to deal with streamed responses or merging the results.

You seem to be comparing sending results to a webhook versus having no webhook at all. My assertion is that webhooks could indeed be a nice feature, but sending results to a webhook is not worth the complexity cost. Instead, we should just send an opaque reference for the results.

In your scenario, all three entities have significant added complexity:

The server has to be capable of both storing results and sending results to a webhook, including handling additional security and reliability concerns.
The webhook service has to be capable of receiving, storing, and serving results.
The client has to be capable of both getting results from the server in the normal way, and fetching results from the webhook service.

I envision webhooks as an extra, optional notification channel. The client and server function as they normally would, including support for disconnects and resuming streams. But, when the server emits a result, if the client has provided a webhook and is currently not connected, the server would send an opaque reference for the result to the webhook. The webhook could then notify the client using an appropriate protocol.

By the way, it would still be possible to achieve the outcome you describe with the opaque reference approach. To do so, the client would provide credentials to the webhook service such that the webhook service could connect to the server on the client's behalf. Then, when the webhook receives an opaque reference, the webhook service would get the result from the server just as the client would, and store it for the client to fetch later.

pantanurag555 May 28, 2025
Author

What you are saying makes sense. I have been giving more thought to the idea of supporting webhooks in MCP servers. Rather than having webhook transmission work at the messaging layer like I had mentioned at the start of the discussion, I think it would make more sense to integrate it at the transport layer.

Additionally, rather than just working atop async tool calls like suggested above, sending responses/references to webhooks would be supported for all tool calls. This would make it simpler for the client to make use of webhooks across tools.

I understand your vision for webhooks as an extra optional notification channel. I would like to leave that choice up to the server owner. If they decide that the webhook should receive an opaque reference for the result, they should be able to transmit it. Otherwise the server should also be capable of transmitting the result of a tool call to a webhook in a single message, or be able to stream it over multiple messages. The streamed result(s) should be as close as possible to what would have been sent directly to the client.

To allow further flexibility to the servers:

Webhook support should still be a server capability. The owner should be able to decide whether transmission to webhook makes sense for their tools or not.
Call tool implementations should have information about whether the client has passed a webhook or not. This would help with the async/streaming case discussed above, wherein the tool might decide to return an opaque reference to the webhook.

Streamable http should be the first transport to add support for webhooks. Stdio and sse do not seem to be right choices for webhook support in my opinion, but support can be expanded to them later.

seuros · 2025-05-14T19:16:20Z

seuros
May 14, 2025

The MCP protocol is fundamentally session-based. When a task spans several hours or even days, it’s no longer just a "task"—it’s effectively a long-running job.

Introducing webhooks on the client side implies that the client must expose the IP of the server receiving the callback. This isn’t always feasible, especially when servers are behind load balancers or services like Cloudflare.

A more robust approach is to have the server return a token representing the long-running task. That token can be used in any future session to query status, retrieve results, or check progress. Tooling can easily be built around this mechanism.

Additionally, the updated MCP specification makes it clear that SSE is no longer a requirement. Streamable HTTP with resumability is now the preferred method, offering both flexibility and better alignment with modern HTTP patterns.

Webhooks, much like batching, are appealing in theory but rarely practical. This is precisely why batching was deprecated in the 2025-03 specification and is planned for removal in the next release.

Moreover, requiring clients to host webhook endpoints assumes they have a stable IP or DNS and are always online. But what happens if the MCP client is installed on a mobile device? What if the server completes the task while the device is offline?

These are critical edge cases that webhooks fail to handle gracefully.

1 reply

pantanurag555 May 14, 2025
Author

I think there are a few different arguments here so I will address them in bullet points.

I don't see a reason why a client cannot act as a server as well. Agent-to-agent is/can be a supported use case in MCP.
I also think there are scenarios where the data can be stored in a different server that the webhook points to, rather than being routed back to just the client. I don't think the idea is that it is only the client that can act as host of the webhook. Instead there can be a different infra that hosts the webhook, which the client can interact with. This would make sense in the case of the edge cases you mentioned. A portable client that can be disconnected from the internet for long periods of time should not be used as a webhook.
The idea of webhook works with streamable HTTP as well as SSE. Even in the case of an HTTP session that can be resumed using a session id, the onus still lies with the client to continue polling while the server needs to store the response/remaining streaming notifications. The way I see webhooks is that they offer support for a push architecture instead of only relying on a pull architecture. It has the ability to free up both the server from the load of storing results of async tools as well as the client from maintaining track of multiple async resources/responses it has received.

pantanurag555 · 2025-05-28T18:54:42Z

pantanurag555
May 28, 2025
Author

I have given more thought to the idea of supporting webhooks in MCP servers. Rather than having webhook transmission work at the messaging layer like I had mentioned at the start of the discussion, I think it makes more sense to integrate it at the transport layer. Streamable HTTP would ideally be the first transport that is fitted for webhook support. Revised spec changes:

class ToolsCapability(BaseModel):
    """Capability for tools operations."""

    listChanged: bool | None = None
    """Whether this server supports notifications for changes to the tool list."""
    webhooksSupported: bool | None = None
    """Capability for transmitting tool responses to webhooks."""
    model_config = ConfigDict(extra="allow")

class AuthenticationInfo(BaseModel):
    """Used to specify authentication mechanism"""

    strategy: Literal["bearer", "apiKey", "basic", "customHeader"]
    """Authentication strategy that the server will follow"""
    credentials: str | None = None
    """
    Static credentials in the case of bearer, apiKey or basic.
    In case of basic and customHeader, this can also be a parsable JSON.
    """

class Webhook(BaseModel):
    """Used to specify a webhook and authentication method to communicate with it"""

    url: str
    """Url to which the response will be transmitted"""
    authentication: AuthenticationInfo | None = None
    """Authentication required to communicate with the webhook"""

class CallToolRequestParams(RequestParams):
    """Parameters for calling a tool."""

    name: str
    arguments: dict[str, Any] | None = None
    webhooks: list[Webhook] | None = None
    model_config = ConfigDict(extra="allow")

Servers will still treat webhooks as a capability and can choose whether or not to support them. This decision could be based on whether the server includes asynchronous tools, whether the server wants to establish connections to the webhook URIs provided by the client (security concerns), whether the server wants to risk the additional latency of transmitting results to a webhook (reliability), etc.

If a server supports the webhook capability, webhooks will be supported for all tools rather than being restricted to asynchronous tools. The client will have the option to pass webhooks as part of the CallToolRequest. Any request with webhooks will have its final results transmitted to the webhooks instead of back to the client. The client will receive an acknowledgement response once the tool call with webhooks has been received by the server. The response will be 200 instead of 202 (202 is only returned by the server when the client sends a notification/response).

The tool call implementations should be modifiable depending on whether the tool call consists of a webhook or not.
For example, an asynchronous tool exists on a server that supports webhooks. The tool should be able to do the following:

When webhook provided by client: Send tool call acknowledgement to the client. Transmit a reference to a resource/streaming task when the tool call is completed.
When webhook is NOT provided by client: Stream/return result(s) to the client.

In order to do this, the tool should be able to access whether a webhook has been passed by the client or not in its implementation. This can be achieved by using the request context.

@dataclass
class RequestContext(Generic[SessionT, LifespanContextT, RequestT]):
    request_id: RequestId
    meta: RequestParams.Meta | None
    session: SessionT
    lifespan_context: LifespanContextT
    has_webhook: bool = False
    request: RequestT | None = None

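# The decorator below belongs to the low-level Server class, not to RequestContext.
class Server:
    ...  # other members elided

    def call_tool(self):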
        def decorator(
            func: Callable[
                ...,
                Awaitable[
                    Iterable[
                        types.TextContent | types.ImageContent | types.EmbeddedResource
                    ]
                ],
            ],
        ):
            logger.debug("Registering handler for CallToolRequest")

            async def handler(req: types.CallToolRequest):
                try:
                    if req.params.webhooks is not None and len(req.params.webhooks) > 0:
                        self.request_context.has_webhook = True
                    results = await func(req.params.name, (req.params.arguments or {}))
                    return types.ServerResult(
                        types.CallToolResult(content=list(results), isError=False)
                    )
                except Exception as e:
                    return types.ServerResult(
                        types.CallToolResult(
                            content=[types.TextContent(type="text", text=str(e))],
                            isError=True,
                        )
                    )
            self.request_handlers[types.CallToolRequest] = handler
            return func
        return decorator
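
With that flag available, a tool implementation could branch on it roughly as follows. This is a self-contained sketch, not SDK code: `ToolCallContext` is a hypothetical stand-in for the request context, and the `task://` reference format and return shapes are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCallContext:
    """Hypothetical stand-in for the request context with the proposed flag."""

    has_webhook: bool = False


def run_forecast_tool(ctx: ToolCallContext, arguments: dict[str, Any]) -> dict[str, Any]:
    """Return an acknowledgement when a webhook was passed, else the result."""
    if ctx.has_webhook:
        # Acknowledge now; the reference is sent to the webhook on completion.
        return {"type": "ack", "reference": f"task://forecast/{arguments['symbol']}"}
    # No webhook: return (or stream) the result directly to the client.
    return {"type": "result", "content": f"forecast for {arguments['symbol']}"}
```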

0 replies

AchintyaAshok · 2025-05-29T16:28:10Z

AchintyaAshok
May 29, 2025

Authenticating webhooks is a tricky proposition, and it also looks like authentication itself is moving towards some centralized interfacing mechanism in the spec. Can I suggest that an initial implementation of a webhook callback target a publicly callable webhook endpoint, without guarantees of authorization handling on the part of the client/MCP server?

Another way to simplify this is to have the webhook URL provider (e.g. some service outside the client) provide its own authentication criteria on the URL, such as a token. The responsibility then lies with that provider to validate that token/reference.

Example flow:

Client calls Webhook Server to generate an endpoint to receive calls, e.g. {url}?id={some_uuid} (the webhook endpoint). [Optional step; the endpoint could also be static.]
The Webhook Server generates the UUID to track the response handling.
Client calls MCP server with the webhook endpoint
MCP Server calls Webhook Server using webhook endpoint
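
The flow above could be sketched on the Webhook Server side like this. It is an in-memory illustration only: `WebhookRegistry`, its method names, and the example base URL are hypothetical, and the UUID in the query string is the only thing that validates an incoming call.

```python
import uuid
from typing import Any
from urllib.parse import parse_qs, urlparse


class WebhookRegistry:
    """Issue per-call webhook endpoints and accept deliveries only for known IDs."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url
        self._inbox: dict[str, list[Any]] = {}

    def create_endpoint(self) -> str:
        """Generate a URL of the form {url}?id={some_uuid} for the client."""
        endpoint_id = str(uuid.uuid4())
        self._inbox[endpoint_id] = []
        return f"{self._base_url}?id={endpoint_id}"

    def receive(self, url: str, payload: Any) -> bool:
        """Store a delivery from the MCP server; reject unknown or missing IDs."""
        ids = parse_qs(urlparse(url).query).get("id", [])
        if len(ids) != 1 or ids[0] not in self._inbox:
            return False
        self._inbox[ids[0]].append(payload)
        return True
```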

0 replies
