4 changes: 2 additions & 2 deletions docs/advanced/multi-round-trip.md
Expand Up @@ -19,7 +19,7 @@ That's the whole protocol. Every leg is an ordinary request from the client to t

## The server side

The high-level `@mcp.tool()` decorator has no sugar for this yet. Today you write it on the **low-level** `Server`, whose `on_call_tool` handler is allowed to return either result type:
On `@mcp.tool()` you rarely build this by hand: declare a dependency that asks the user and the SDK returns the `InputRequiredResult` for you - that form is the **[Dependencies](../tutorial/dependencies.md)** tutorial. The manual form is the **low-level** `Server`, whose `on_call_tool` handler is allowed to return either result type:

```python title="server.py" hl_lines="44-47"
--8<-- "docs_src/mrtr/tutorial001.py"
Expand Down Expand Up @@ -93,6 +93,6 @@ Drop to the underlying session, where `allow_input_required=True` hands you the
* `input_requests` is what it needs. `request_state` is an opaque resume token only the server reads.
* `Client` runs the retry loop for you: register `elicitation_callback` / `sampling_callback` / `list_roots_callback` and `call_tool` returns a plain `CallToolResult`. `input_required_max_rounds` (default 10) bounds it.
* To inspect or persist rounds, use `client.session.call_tool(..., allow_input_required=True)` and own the `while isinstance(result, InputRequiredResult)` loop yourself.
* The server side is the **low-level** `Server` only; `@mcp.tool()` has no sugar for this yet.
* On `@mcp.tool()`, a dependency that asks the user produces this result for you (**[Dependencies](../tutorial/dependencies.md)**); the **low-level** `Server` is the manual form.

This is the mechanism that replaces server-initiated sampling and the rest of the push-style back-channel; see **[Deprecated features](deprecated.md)**.
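The manual loop named in the recap bullets can be sketched without the SDK. Everything below is a stand-in: the two result classes and `fake_call_tool` are toy definitions written for this illustration, not the SDK's types or signatures. Only the loop shape and the round bound mirror the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CallToolResult:  # stand-in, not the SDK type
    text: str


@dataclass
class InputRequiredResult:  # stand-in, not the SDK type
    input_requests: dict[str, str]  # question id -> prompt
    request_state: str  # opaque to the client; echoed back unchanged


def fake_call_tool(answers: dict[str, str], request_state: str | None):
    """Toy server: asks one question, then completes once it is answered."""
    if "confirm" not in answers:
        return InputRequiredResult({"confirm": "Place the backorder?"}, request_state="round-1")
    return CallToolResult(f"ordered={answers['confirm']}")


def drive(max_rounds: int = 10) -> tuple[CallToolResult, int]:
    answers: dict[str, str] = {}
    state: str | None = None
    rounds = 1
    result = fake_call_tool(answers, state)
    while isinstance(result, InputRequiredResult):
        if rounds >= max_rounds:
            raise RuntimeError("too many input_required rounds")
        # Answer every outstanding question, then retry with the echoed state.
        for question_id in result.input_requests:
            answers[question_id] = "yes"
        state = result.request_state
        result = fake_call_tool(answers, state)
        rounds += 1
    return result, rounds


result, rounds = drive()
print(result.text, rounds)
```

Owning the loop is what makes each round observable: the body of the `while` is the place to log or persist `input_requests` before answering them.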
2 changes: 2 additions & 0 deletions docs/migration.md
Expand Up @@ -786,6 +786,8 @@ Positional calls (`await ctx.info("hello")`) are unaffected.

`Context.elicit()` (and `elicit_with_validation()`) now render the schema first and validate each property against the spec's `PrimitiveSchemaDefinition`, raising `TypeError` at the call site for anything outside it. `Optional[T]` fields render as `{"type": ...}` with the field omitted from `required` (previously the non-spec `anyOf` shape). A bare `list[str]` field is rejected because it renders without the required enum items; use `list[Literal[...]]` or `list[str]` with `json_schema_extra` supplying the items. Unions of multiple primitives (e.g. `int | str`) and nested models are rejected.

A schema-mismatched *accepted* answer also fails differently: the call now raises `ValueError` with a stable message ("Received an accepted elicitation whose content does not match the requested schema") instead of letting pydantic's `ValidationError` escape with its internals. Code that caught `ValidationError` around `ctx.elicit()` should catch `ValueError` (or rely on the tool's error result).
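A minimal sketch of the caller-side change, assuming a synchronous stand-in: `elicit_stub` is a hypothetical replacement for `ctx.elicit()` (the real call is async and takes a message and a schema), and a `TypeError` stands in for pydantic's `ValidationError` as the chained cause. Only the exception type and the message come from the text above.

```python
MISMATCH = "Received an accepted elicitation whose content does not match the requested schema"


def elicit_stub(content: dict) -> dict:
    """Hypothetical stand-in for `ctx.elicit()`: rejects a non-bool `confirm`."""
    if not isinstance(content.get("confirm"), bool):
        try:
            # Stands in for the pydantic error the SDK chains as __cause__.
            raise TypeError("confirm: expected bool")
        except TypeError as e:
            raise ValueError(MISMATCH) from e
    return content


def book(content: dict) -> str:
    try:
        answer = elicit_stub(content)
    except ValueError as exc:  # previously: except ValidationError
        return f"rejected: {exc}"
    return "booked" if answer["confirm"] else "skipped"


print(book({"confirm": "maybe"}))
print(book({"confirm": True}))
```

The original validation failure stays reachable through `exc.__cause__` for logging, while the message the caller sees is the stable one.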

### Replace `RootModel` by union types with `TypeAdapter` validation

The following union types are no longer `RootModel` subclasses:
Expand Down
17 changes: 16 additions & 1 deletion docs/tutorial/dependencies.md
Expand Up @@ -116,11 +116,26 @@

That's the right default for a precondition: no answer, no order. When declining is an outcome your tool wants to handle - skip the backorder but still suggest another title - annotate `ElicitationResult[Backorder]` instead and the tool receives the full accept/decline/cancel outcome to branch on. **[Elicitation](elicitation.md)** shows that form, and everything else about asking: the schema rules, the three answers, the client's side of the conversation.

!!! info
The framework picks the question's transport from the negotiated protocol version; the code
above is identical on both. On **2026-07-28** and later the question rides inside a
multi-round-trip `tools/call` - the server returns it, the client's `elicitation_callback`
answers it, and the `Client` retries the call for you (**[Multi-round-trip requests](../advanced/multi-round-trip.md)**). On
**2025-11-25** and earlier it is a synchronous elicitation request mid-call. Each question is
asked exactly once per call - a guarantee about the question, not the resolver. In the
multi-round-trip form an eliciting resolver runs again to consume its answer, so code before
its `return Elicit(...)` runs on the asking round and again on the answering one; a resolver
that answered *without* asking, like `check_stock`, may run again whenever the call resumes
after a question. When it resumes, each answer is matched back to its question, so an
eliciting resolver must derive its question deterministically from the tool's arguments and
earlier answers - a per-call generated value (a `default_factory` id, a timestamp) is
re-derived on each round and must not appear in a question the answer is meant to bind to.
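The determinism requirement can be illustrated with a toy model. This page does not say how the SDK binds an answer to its question, so the sketch assumes the binding key is derived from the question text; `question_key`, `stable_question`, `unstable_question`, and `survives_resume` are hypothetical names introduced here.

```python
import hashlib
import itertools

# Stands in for a per-call generated value (a default_factory id, a timestamp).
_ids = itertools.count(1)


def question_key(message: str) -> str:
    """Toy binding key: assumes an answer is bound to its question's text."""
    return hashlib.sha256(message.encode()).hexdigest()[:12]


def stable_question(title: str) -> str:
    # Derived only from the tool's arguments: identical on every round.
    return f"'{title}' is out of stock. Place a backorder?"


def unstable_question(title: str) -> str:
    # Embeds a freshly generated value: differs on every round.
    return f"[request {next(_ids)}] '{title}' is out of stock. Place a backorder?"


def survives_resume(make_question, title: str) -> bool:
    asked = question_key(make_question(title))  # asking round
    rederived = question_key(make_question(title))  # answering round re-runs the resolver
    return asked == rederived


print(survives_resume(stable_question, "Dune"))
print(survives_resume(unstable_question, "Dune"))
```

In this model the stable question is re-derived identically on the answering round, so its answer can be bound; the one carrying a generated id is not.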

## Recap

* `Annotated[T, Resolve(fn)]` on a tool parameter: the SDK runs `fn` and injects its return value.
* A resolved parameter is invisible to the model and cannot be supplied by a client. Values the model must not invent - prices, identities, permissions - belong here.
* A resolver's parameters are resolved the same way: the `Context`, another `Resolve(...)`, or a tool argument by name. The graph runs each resolver at most once per call.
* A resolver's parameters are resolved the same way: the `Context`, another `Resolve(...)`, or a tool argument by name. The graph runs each resolver at most once per round, however many consumers it has; each question is asked exactly once, an eliciting resolver runs again to consume its answer, and a resolver that never asked may run again when a call resumes.

Check warning on line 138 in docs/tutorial/dependencies.md

Claude / Claude Code Review

dependencies.md still claims resolvers run at most once per call alongside the new per-round contract

Comment on lines 136 to +138

🟡 Earlier sections of this page still state the old per-call guarantee — line 61 ("the SDK runs the resolver at most once per call, no matter how many declare it") and line 73 ("it runs once per call. One inventory lookup, two consumers") — which now contradicts the per-round contract this PR introduces in the new !!! info box and the rewritten Recap bullet on the same page. Reword those earlier statements (and the "Don't take once-per-call on faith" / "Once per call means exactly that" tips) to the per-round / asked-once phrasing, or add a forward reference to the info box, so an author doesn't write a non-idempotent resolver relying on once-per-call.

Extended reasoning...

What the issue is. docs/tutorial/dependencies.md is now internally inconsistent about the resolver run-once guarantee. This PR added the !!! info box (lines ~119-132: "a resolver that answered without asking, like check_stock, may run again whenever the call resumes after a question") and rewrote the Recap bullet (line ~138: "at most once per round ... a resolver that never asked may run again when a call resumes"). But the earlier sections of the same page were left with the old contract: line 61 says "every tool that needs stock declares the same parameter, and the SDK runs the resolver at most once per call, no matter how many declare it", line 73 says "it runs once per call. One inventory lookup, two consumers", and the nearby tips reinforce it ("Don't take once-per-call on faith ... one line per call"; "Once per call means exactly that").

Why the old wording is no longer universally true. Under the >= 2026-07-28 input_required flow, only elicited outcomes are persisted in request_state; a resolver that resolves without eliciting is pure and re-runs on every retry round whenever the tool also has an eliciting resolver. The PR's own tests assert exactly this: test_auto_driver_answers_independent_questions_in_a_single_round counts rounds == 2 via a pure resolver that re-runs each round, and test_input_required_resolver_asks_and_consumes_then_never_reruns shows an eliciting resolver running twice (ask + consume). The page's own tutorial003 example (the eliciting confirm_backorder) is now exercised on mode="auto" by tests/docs_src/test_dependencies.py, so the multi-round behaviour applies to the very examples this page teaches.

Step-by-step proof. Take the page's tutorial003 shape extended with the tutorial002 dependency: a tool with stock: Annotated[Stock, Resolve(check_stock)] and backorder: Annotated[Backorder, Resolve(confirm_backorder)] where the title is out of stock, on a 2026-07-28 connection. Round 1: check_stock runs (lookup #1), confirm_backorder returns Elicit(...) with no answer yet, so the server returns an InputRequiredResult and only the (empty) elicited-outcome map goes into request_state; check_stock's value is not persisted. Round 2: the client retries with the answer; check_stock runs again (lookup #2), confirm_backorder consumes its answer, the body runs. One logical tools/call, two executions of the non-eliciting resolver, directly contradicting "at most once per call" / "one inventory lookup".
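The two rounds can be reproduced in a toy model. This is a sketch under stated assumptions, not the SDK's driver: `call_round` and the tuple-based ask/value protocol are hypothetical, the resolvers borrow the tutorial's names only for readability, and `request_state` is modelled as a plain dict holding elicited answers.

```python
runs = {"check_stock": 0, "confirm_backorder": 0}


def check_stock(title: str) -> int:
    runs["check_stock"] += 1  # pure: resolves without asking
    return 0  # out of stock


def confirm_backorder(title: str, answers: dict):
    runs["confirm_backorder"] += 1
    if "backorder" not in answers:
        return ("ask", "backorder")  # stands in for `return Elicit(...)`
    return ("value", answers["backorder"])


def call_round(title: str, request_state: dict) -> dict:
    """One leg of the toy tools/call; request_state carries elicited answers only."""
    stock = check_stock(title)  # not persisted, so it runs on every round
    kind, payload = confirm_backorder(title, request_state)
    if kind == "ask":
        return {"input_required": payload, "request_state": dict(request_state)}
    return {"result": f"stock={stock} backorder={payload}"}


# Round 1 asks; round 2 retries with the answer merged into the echoed state.
first = call_round("Dune", {})
second = call_round("Dune", {**first["request_state"], first["input_required"]: True})
print(second["result"], runs)
```

Both counters end at 2: the eliciting resolver ran once to ask and once to consume, and the pure resolver ran once per round, which is the per-round contract rather than the per-call one.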

Why this looks like an oversight rather than intent. The author reworded the analogous sentences elsewhere in this PR — examples/stories/refund_desk/README.md ("ask each question at most once per call"), refund_desk/client.py comments, the resolve.py module docstring, and this page's own Recap bullet — and the follow-up commit 1a15cb6 ("Scope the resolver run-once guarantee to questions, not resolver bodies") shows this exact wording class is considered worth correcting. Lines 61/73/77 and the "Once per call means exactly that" tip were simply missed. The earlier-flagged stale wording in docs/migration.md is a different file and was already addressed; this is a distinct remaining location.

Impact. Documentation-only, no runtime effect, and the statements remain literally true for the tutorial001/002 examples in their immediate context (no eliciting resolver, so the call completes in one round). But line 61 is phrased as a general design claim, and an author reading it could hang side effects or expensive non-idempotent work on a once-per-call assumption that the same page later retracts — exactly the confusion the new info box exists to prevent.

How to fix. Reword line 61 and line 73 (and the two reinforcing tips) to the per-round / asked-once phrasing the rest of the page now uses — e.g. "the SDK runs the resolver at most once per round, however many declare it" and "each question is asked once; a resolver that never asks may re-run when a call resumes" — or add a short forward reference to the !!! info box where the full contract is stated.

* Bad graphs fail at registration with `InvalidSignature`, not mid-call.
* Return `Elicit(message, Model)` to ask the user, only when you have to. Unwrapped annotations abort on decline; `ElicitationResult[T]` lets the tool branch.

Expand Down
4 changes: 2 additions & 2 deletions docs/tutorial/elicitation.md
Expand Up @@ -76,8 +76,8 @@ A refusal is not an error. The tool decides what declining means (here, no booki

!!! tip
The answer is validated against your model before your code sees it. A client that sends
`"maybe"` for a `bool` doesn't corrupt your booking: the call fails with the
`ValidationError`, your `if` never runs.
`"maybe"` for a `bool` doesn't corrupt your booking: the call fails with a
schema-mismatch error, your `if` never runs.

## Ask before the tool runs

Expand Down
4 changes: 2 additions & 2 deletions examples/stories/legacy_elicitation/README.md
Expand Up @@ -68,6 +68,6 @@ uv run python -m stories.legacy_elicitation.client --http --legacy --server serv
## See also

`sampling/` (same push-request shape, deprecated per SEP-2577), `mrtr/`
(planned — the 2026-era carrier), `error_handling/`
(the 2026-era carrier), `error_handling/`
(`UrlElicitationRequiredError`), `refund_desk/` (resolver DI rides this push
mechanism today).
mechanism on handshake-era connections).
5 changes: 2 additions & 3 deletions examples/stories/manifest.toml
Expand Up @@ -40,9 +40,8 @@ era = "legacy"
status = "legacy"

[story.refund_desk]
# Resolver DI rides push elicitation (ctx.elicit) today; era flips to "dual" once
# the SDK carries resolver elicitation over the 2026 input_required round-trip.
era = "legacy"
# Resolver elicitation picks its transport per era: input_required round-trips on
# the modern leg, push elicitation (ctx.elicit) on the legacy one.
lowlevel = false

[story.sampling]
Expand Down
2 changes: 1 addition & 1 deletion examples/stories/mrtr/README.md
Expand Up @@ -46,7 +46,7 @@ uv run python -m stories.mrtr.client --http --server server_lowlevel

## Spec

[Multi-round results — server features](https://modelcontextprotocol.io/specification/draft/server/tools#multi-round-results)
[Input required tool results — server features](https://modelcontextprotocol.io/specification/draft/server/tools#input-required-tool-results)

## See also

Expand Down
42 changes: 30 additions & 12 deletions examples/stories/refund_desk/README.md
Expand Up @@ -7,9 +7,10 @@ reason)` refunds what the order record says — `cents` is resolver-computed and
does not appear in the input schema at all, so the model cannot supply or
inflate the amount. Resolvers form a DAG (`load_order` → `refund_scope` →
`refund_amount` / `ask_restock`), may return `Elicit[...]` to ask the human,
and run at most once per call. A resolver's own plain parameters are filled
from the tool's arguments by name — `load_order(order_id)` receives the
`order_id` the model passed to `refund_order`.
and ask each question at most once per call. A resolver's own plain
parameters are filled from the tool's arguments by name —
`load_order(order_id)` receives the `order_id` the model passed to
`refund_order`.

## Run it

Expand All @@ -18,9 +19,9 @@ from the tool's arguments by name — `load_order(order_id)` receives the
uv run python -m stories.refund_desk.client

# HTTP — the client self-hosts the server on a free port, runs, then tears it
# down (--legacy: resolver elicitation rides the push request today; the
# manifest pins this era, so bare --http runs the same leg)
uv run python -m stories.refund_desk.client --http --legacy
# down (2026 protocol: the questions ride embedded input_required round-trips;
# add --legacy to ride synchronous push elicitation instead)
uv run python -m stories.refund_desk.client --http
```

## What to look at
Expand All @@ -47,21 +48,38 @@ uv run python -m stories.refund_desk.client --http --legacy

## Caveats

- **Transport per era.** The framework picks the elicitation transport from
the negotiated protocol: at >= 2026-07-28 the questions ride embedded
`input_required` round-trips (a resolver that depends on another's answer is
asked in a later round); at <= 2025-11-25 each is a synchronous
`elicitation/create` push request mid-call. Author code is identical on
both — this client runs unchanged on either era.
- **Decline order.** A declined unwrapped dependency aborts resolution in
tool-signature order — `cents` resolves before `restock`, so `ask_restock`
never runs. Don't rely on a later resolver's side effects after an earlier
consumer can abort.
- **Memoization scope.** Each resolver runs at most once per `tools/call`,
keyed by function identity; nothing is cached across calls or connections.
- **Memoization scope.** Each question is asked at most once per call, and
within a round each resolver runs at most once, keyed by function identity.
Across 2026 rounds only *elicited* outcomes persist (in `requestState`); a
resolver that resolves without eliciting is pure and may re-run each round.
An eliciting resolver's body runs again too — once to ask, once more to
consume its answer.
An answer is matched back to its question when the call resumes, so an
eliciting resolver must derive its question deterministically from the
tool's arguments and earlier answers; a per-call generated value (a
`default_factory` id, a timestamp) is re-derived each round and must not
appear in a question the answer is meant to bind to. Nothing is cached
across calls or connections.
- **Validate elicited values.** Elicited answers are human-typed; check them
against your records (as `_scoped` does) before acting on them.
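The per-era choice in the first caveat amounts to a version comparison. The selector below is a hypothetical sketch: the PR does not show the SDK's actual comparison, and treating a missing version as the legacy leg is an assumption made here. It relies only on the fact that `YYYY-MM-DD` strings order correctly under plain string comparison.

```python
from __future__ import annotations

INPUT_REQUIRED_SINCE = "2026-07-28"  # boundary taken from the caveat above


def elicitation_transport(protocol_version: str | None) -> str:
    """Toy selector; a missing version falls back to the legacy leg (assumption)."""
    if protocol_version is not None and protocol_version >= INPUT_REQUIRED_SINCE:
        return "input_required"
    return "elicitation/create"


for version in ("2025-11-25", "2026-07-28", None):
    print(version, elicitation_transport(version))
```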

## Spec

[Elicitation — client features](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation)
[Elicitation — client features](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation),
[Input required tool results — server features](https://modelcontextprotocol.io/specification/draft/server/tools#input-required-tool-results)

## See also

`legacy_elicitation/` (the push mechanism resolver elicitation rides on today),
`mrtr/` (the 2026 `input_required` carrier; resolver DI will ride it once the
SDK wires them together).
`mrtr/` (the 2026 `input_required` carrier these questions ride at
>= 2026-07-28), `legacy_elicitation/` (the push mechanism they ride on
handshake-era connections).
6 changes: 4 additions & 2 deletions examples/stories/refund_desk/client.py
Expand Up @@ -41,7 +41,9 @@ async def on_elicit(context: ClientRequestContext, params: types.ElicitRequestPa
assert counts == {"scope": 0, "restock": 0}, counts

# Full refund of a three-line order. The scope question fires exactly ONCE even though
# both refund_amount and ask_restock consume it — memoized within the call.
# both refund_amount and ask_restock consume it — asked at most once per call on either
# era. ask_restock needs the scope ANSWER, so at 2026 the two questions land in
# successive rounds, never one concurrent batch: counts and order are era-independent.
receipt = await client.call_tool("refund_order", {"order_id": "ORD-7002", "reason": "arrived broken"})
assert receipt.structured_content == {
"order_id": "ORD-7002",
Expand All @@ -53,7 +55,7 @@ async def on_elicit(context: ClientRequestContext, params: types.ElicitRequestPa

# Declining restock still refunds: the tool keeps the ElicitationResult union for
# `restock`, sees the decline, and just skips the restock. The scope counter moves
# again — the memo cache is per tools/call, not per connection.
# again — questions are deduped per call, not per connection.
declines.add("restock")
answers["scope"] = {"full": False, "sku": "canvas-tote"}
receipt = await client.call_tool("refund_order", {"order_id": "ORD-7002", "reason": "wrong colour"})
Expand Down
39 changes: 28 additions & 11 deletions src/mcp/server/elicitation.py
Expand Up @@ -87,6 +87,18 @@ def _validate_rendered_properties(json_schema: dict[str, Any]) -> None:
) from None


def render_elicitation_schema(schema: type[BaseModel]) -> dict[str, Any]:
"""Render a model as the spec-valid `requested_schema` for an elicitation.

Raises:
TypeError: If a field renders as something the spec's
`PrimitiveSchemaDefinition` does not accept.
"""
json_schema = schema.model_json_schema(schema_generator=_ElicitationJsonSchema)
_validate_rendered_properties(json_schema)
return json_schema


async def elicit_with_validation(
session: ServerSession,
message: str,
Expand All @@ -102,27 +114,32 @@ async def elicit_with_validation(
the user or automatically generating a response.

For sensitive data like credentials or OAuth flows, use elicit_url() instead.

Raises:
ValueError: If the client accepted the elicitation without supplying
content, or with content that does not match the requested schema.
"""
json_schema = schema.model_json_schema(schema_generator=_ElicitationJsonSchema)
_validate_rendered_properties(json_schema)
json_schema = render_elicitation_schema(schema)

result = await session.elicit_form(
message=message,
requested_schema=json_schema,
related_request_id=related_request_id,
)

if result.action == "accept" and result.content is not None:
# Validate and parse the content using the schema
validated_data = schema.model_validate(result.content)
if result.action == "accept":
if result.content is None:
raise ValueError("Received an accepted elicitation with no content")
try:
validated_data = schema.model_validate(result.content)
except ValidationError as e:
raise ValueError(
"Received an accepted elicitation whose content does not match the requested schema"
) from e
return AcceptedElicitation(data=validated_data)
elif result.action == "decline":
if result.action == "decline":
return DeclinedElicitation()
elif result.action == "cancel":
return CancelledElicitation()
else: # pragma: no cover
# This should never happen, but handle it just in case
raise ValueError(f"Unexpected elicitation action: {result.action}")
return CancelledElicitation()


async def elicit_url(
Expand Down
5 changes: 5 additions & 0 deletions src/mcp/server/mcpserver/context.py
Expand Up @@ -232,6 +232,11 @@ def request_id(self) -> str:
"""Get the unique ID for this request."""
return str(self.request_context.request_id)

@property
def protocol_version(self) -> str | None:
"""The negotiated protocol version, or `None` outside of an active request."""
return self._request_context.protocol_version if self._request_context is not None else None

@property
def input_responses(self) -> InputResponses | None:
"""Client responses to a prior `InputRequiredResult.input_requests`.
Expand Down