docs/explainer.md
7 additions & 7 deletions
@@ -65,12 +65,12 @@ There are several advantages to using the web to connect agents to services:
 
 WebMCP is a proposal for a web API that enables web pages to provide agent-specific paths in their UI. With WebMCP, agent-service interaction takes place _via app-controlled UI_, providing a shared context available to app, agent, and user. In contrast to backend integrations, WebMCP tools are available to an agent only once it has loaded a page and they execute on the client. Page content and actuation remain available to the agent (and the user) but the agent also has access to tools which it can use to achieve its goal more directly.
 
-
+
 
 In contrast, in a backend integration, the agent-service interaction takes place directly, without an associated UI. If
 a UI is required it must be provided by the agent itself or somehow connected to an existing UI manually:
 
-The expected flow using browser agents and Script Tools:
+The expected flow using browser agents and WebMCP:
 
 
 
@@ -91,7 +91,7 @@ The expected flow using browser agents and Script Tools:
 
 ## Use Cases
 
-The use cases for script tools are ones in which the user is collaborating with the agent, rather than completely
+The use cases for WebMCP are ones in which the user is collaborating with the agent, rather than completely
 delegating their goal to it. They can also be helpful where interfaces are highly specific or complicated.
 
 ### Example - Creative
@@ -105,7 +105,7 @@ which to choose from so she asks her browser agent for help._
 **Jen**: Show me templates that are spring themed and that prominently feature the date and time. They should be on a
 white background so I don't have to print in color.
 
-_The current document has registered a script tool that the agent notices may be relevant to this query:_
+_The current document has registered a WebMCP tool that the agent notices may be relevant to this query:_
 
 ```js
 /**
@@ -180,7 +180,7 @@ _The agent takes this action using a sequence of tool calls which might look som
 * `AddPage("DUPLICATE")`
 * `EditDesign("Change the call-to-action text to 'Come for the bargains, stay for the cookies'")`
 
-_Jen now has 3 versions of the same yard sale flyer. Easely implements these script tools using AI-based techinques on
+_Jen now has 3 versions of the same yard sale flyer. Easely implements these WebMCP tools using AI-based techinques on
 their backend to allow a natural language interface. Additionally, the UI presents these changes to Jen as an easily
 reversible batch of "uncommitted" changes, allowing her to easily review the agent's actions and make changes or undo as
 necessary. While the site could also implement a chat interface to expose this functionality with their own agent, the
@@ -414,7 +414,7 @@ quickly review the agent's actions and accept/modify/reject them._
 
 ## Assumptions
 
-* For many sites wanting to integrate with agents quickly - augmenting their existing UI with script tools will be
+* For many sites wanting to integrate with agents quickly - augmenting their existing UI with WebMCP tools will be
 easier vs. backend integration
 * Agents will perform quicker and more successfully with specific tools compared to using a human interface.
 * Users might use an agent for a direct action query (e.g. “create a 30 minute meeting with Pat at 3:00pm”), complex
@@ -492,7 +492,7 @@ for remote execution.
 
 ### WebMCP (MCP-B)
 
-[MCP-B](https://mcp-b.ai/), or Model Context Protocol for the Browser, is an open source project found on GitHub [here](https://github.com/MiguelsPizza/WebMCP) and has much the same motivation and solution as described in this proposal. MCP-B's underlying protocol, also named WebMCP, extends MCP with tab transports that allow in-page communicate between a website's MCP server and any client in the same tab. It also extends MCP with extension transports that use Chromium's runtime messaging to make a website's MCP server available to other extension components within the browser (background, sidebar, popup), and to other external MCP clients running on the same machine. MCP-B enables tools from different sites to work together, and for sites to cache tools so that they are discoverable even if the browser isn't currently navigated to the site.
+[MCP-B](https://mcp-b.ai/), or Model Context Protocol for the Browser, is an open source project found on GitHub [here](https://github.com/MiguelsPizza/WebMCP) and has much the same motivation and solution as described in this proposal. MCP-B's underlying protocol, also named WebMCP, extends MCP with tab transports that allow in-page communication between a website's MCP server and any client in the same tab. It also extends MCP with extension transports that use Chromium's runtime messaging to make a website's MCP server available to other extension components within the browser (background, sidebar, popup), and to other external MCP clients running on the same machine. MCP-B enables tools from different sites to work together, and for sites to cache tools so that they are discoverable even if the browser isn't currently navigated to the site.
docs/proposal.md
6 additions & 6 deletions
@@ -14,11 +14,11 @@ Only a top-level browsing context, such as a browser tab can be a model context
 * A natural language description of the parameter
 * The expected type (e.g. Number, String, Enum, etc)
 * Any restrictions on the parameter (e.g. integers greater than 0)
-* A JS callback function that implementings the tool and returns a result
+* A JS callback function that implements the tool and returns a result
 
-When an agent that is connected to the page sends a tool call, the JavaScript callback is invoked, where the page can handle the tool call and respond to the agent. Simple applications can handle tool calls entirely in page script, but more complex applications may choose to delegate computationally heavy operations to workers and respond to the agent asynchronously.
+When an agent that is connected to the page sends a tool call, the JavaScript callback is invoked, where the page can handle the tool call and respond to the agent. The function can be asynchronous and return a promise, in which case the agent will receive the result once the promise is resolved. Simple applications can handle tool calls entirely in page script, but more complex applications may choose to delegate computationally heavy operations to workers and respond to the agent asynchronously.
 
-Handling tool cools in the main thread with the option of delegating to workers serves a few purposes:
+Handling tool calls in the main thread with the option of delegating to workers serves a few purposes:
 
 - Ensures tool calls run one at a time and sequentially.
 - The page can update UI to reflect state changes performed by tools.
@@ -68,7 +68,7 @@ window.agent.provideContext({
 });
 ```
 
-The `provideContext` method can be called multiple times. Subsequent calls clear any pre-existing tools and other context before registering the new ones. This is useful for single-page web apps that frequently change UI state and could benefit from presenting different tools depending on which state the UI is currently in.
+The `provideContext` method can be called multiple times. Subsequent calls clear any pre-existing tools and other context before registering the new ones. This is useful for single-page web apps that frequently change UI state and could benefit from presenting different tools depending on which state the UI is currently in. For a list of tools passed to `provideContext`, each tool name in the list is expected to be unique.
 
 **Advantages:**
 
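To illustrate the `provideContext` behavior changed in the hunk above (later calls replace earlier tools, and names within one call are expected to be unique), here is a minimal in-memory sketch. It is not the browser's implementation: `createModelContext`, `toolNames`, the example tool names, and the choice to throw on a duplicate name are all assumptions made for illustration. The proposal text only says names are "expected to be unique" and does not say what happens otherwise.

```js
// Hypothetical stand-in for window.agent; models registration semantics only.
function createModelContext() {
  let tools = new Map();
  return {
    provideContext({ tools: nextTools = [] } = {}) {
      const next = new Map();
      for (const tool of nextTools) {
        if (next.has(tool.name)) {
          // Assumption: reject the whole call when a name repeats.
          throw new Error(`Duplicate tool name: ${tool.name}`);
        }
        next.set(tool.name, tool);
      }
      // Replace, rather than merge with, whatever was registered before.
      tools = next;
    },
    // Hypothetical helper so the registered state can be inspected.
    toolNames() {
      return [...tools.keys()];
    },
  };
}

// A single-page app swapping its tools as the UI state changes.
const agent = createModelContext();
agent.provideContext({
  tools: [
    { name: "search-templates", description: "Find templates", execute: () => [] },
  ],
});
agent.provideContext({
  tools: [
    { name: "edit-design", description: "Edit the open design", execute: () => "ok" },
  ],
});
console.log(agent.toolNames()); // only the tools from the latest call remain
```

Building the new map before swapping it in means a rejected call leaves the previously registered tools untouched, which is one reasonable reading of "clear, then register".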
@@ -132,7 +132,7 @@ Tool calls are handled as events. Since event handler functions can't respond to
 
 ### Recommendation
 
-A **hybrid** approach of both of the examples above is recommended as this would make it easy for web developers to get started adding tools to their page, while leaving open the possibility of manifest-based approaches in the future. To implement this hybrid approach, a `"toolcall"` event is dispatched on every incoming tool call _before_ executing the tool's `execute` function. The event handler can handle the tool call by calling the event's `preventDefault()` method, and then responding to the agent with `respondWith()` as shown above. If the event handle does not call `preventDefault()` then the browser's default behavior for tool calls will occur. The `execute` function for the requested tool is called. If a tool with the requested name does not exist, then the browser responds to the agent with an error.
+A **hybrid** approach of both of the examples above is recommended as this would make it easy for web developers to get started adding tools to their page, while leaving open the possibility of manifest-based approaches in the future. To implement this hybrid approach, a `"toolcall"` event is dispatched on every incoming tool call _before_ executing the tool's `execute` function. The event handler can handle the tool call by calling the event's `preventDefault()` method, and then responding to the agent with `respondWith()` as shown above. If the event handler does not call `preventDefault()` then the browser's default behavior for tool calls will occur. The `execute` function for the requested tool is called. If a tool with the requested name does not exist, then the browser responds to the agent with an error.
 
 ## Example of WebMCP API usage
 
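The dispatch order in the Recommendation paragraph of the hunk above can be sketched with the standard `EventTarget` and `Event` classes. This is a rough model under stated assumptions, not the proposed browser behavior: `ToolCallEvent`, its `toolName`, `input`, and `response` fields, and the `handleToolCall` function are hypothetical names. Only the `"toolcall"` event type, `preventDefault()`, `respondWith()`, and `execute` come from the text.

```js
// Hypothetical event class; the field names are assumptions.
class ToolCallEvent extends Event {
  constructor(toolName, input) {
    super("toolcall", { cancelable: true }); // cancelable so preventDefault() takes effect
    this.toolName = toolName;
    this.input = input;
    this.response = undefined;
  }
  respondWith(result) {
    this.response = result; // may be a plain value or a promise
  }
}

// Models what the text describes the browser doing for one incoming tool call.
async function handleToolCall(page, tools, toolName, input) {
  // 1. Dispatch the event before touching the tool's execute function.
  const event = new ToolCallEvent(toolName, input);
  page.dispatchEvent(event);
  if (event.defaultPrevented) {
    // 2a. A handler took over; use whatever it passed to respondWith().
    return await event.response;
  }
  // 2b. Default behavior: look up the tool by name and call execute.
  const tool = tools.get(toolName);
  if (!tool) {
    throw new Error(`Unknown tool: ${toolName}`); // stands in for the error sent to the agent
  }
  return await tool.execute(input); // execute may return a promise
}

const page = new EventTarget();
const tools = new Map([
  ["add", { execute: async ({ a, b }) => a + b }],
  ["echo", { execute: ({ text }) => text }],
]);

// This handler intercepts only "echo"; every other call falls through.
page.addEventListener("toolcall", (event) => {
  if (event.toolName === "echo") {
    event.preventDefault();
    event.respondWith(`handled by event: ${event.input.text}`);
  }
});

const results = Promise.all([
  handleToolCall(page, tools, "add", { a: 2, b: 3 }),
  handleToolCall(page, tools, "echo", { text: "hi" }),
  handleToolCall(page, tools, "missing", {}).catch((e) => e.message),
]);
results.then((r) => console.log(r));
```

Awaiting both `event.response` and the return value of `execute` lets either path answer synchronously or with a promise, in line with the asynchronous callbacks described earlier in this file's diff.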
@@ -259,4 +259,4 @@ For long-running, batched, or expensive tool calls, we expect web developers wil
 
 ## Acknowledgments
 
-Many thanks to [Alex Nahas](https://github.com/MiguelsPizza) for sharing related [implementation experience](https://github.com/MiguelsPizza/WebMCP).
+Many thanks to [Alex Nahas](https://github.com/MiguelsPizza) and [Jason McGhee](https://github.com/jasonjmcghee/) for sharing related [implementation](https://github.com/MiguelsPizza/WebMCP) [experience](https://github.com/jasonjmcghee/WebMCP).