38 changes: 8 additions & 30 deletions docs/advanced/middleware.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,33 +79,12 @@ In increasing order of how much you should hesitate:
an elicitation) while handling `initialize` therefore **deadlocks the connection**: the
response you are waiting for can never be read. Fire-and-forget notifications are fine.

## `OpenTelemetryMiddleware`

The SDK ships one middleware: `OpenTelemetryMiddleware`. Construct it and append it
(`server.middleware.append(OpenTelemetryMiddleware())`), exactly the line you already wrote
for `log_timing`.

Every inbound message becomes a `SERVER` span named after the method and its target, so a
`tools/call` for `search_books` is the span `tools/call search_books`.

* Every span carries `mcp.method.name` and `mcp.protocol.version`; a request's span also
carries its JSON-RPC request id (a notification has none).
* A `tools/call` span gets OpenTelemetry's GenAI semantic conventions,
`gen_ai.operation.name` (`"execute_tool"`) and `gen_ai.tool.name`, so a tracing UI groups
your tool calls the way it groups any other agent's. A `prompts/get` span gets
`gen_ai.prompt.name`. The list methods carry no `gen_ai.*` keys.
* A handler that raises sets the span's status to error. So does a tool result with
`is_error=True`.

!!! tip
    The SDK depends only on `opentelemetry-api`. With no exporter installed those spans are
    no-ops, so appending this middleware costs you nothing. Install `opentelemetry-sdk` plus an
    exporter and everything lights up, with no server change.

The import is the catch. The class lives at `from mcp.server._otel import OpenTelemetryMiddleware`
today, and the leading underscore is not an accident: it is the same provisional flag this whole
page opened with. The SDK has not given it a public spelling yet, so the import path is the one
line here you should expect to change.
## The one middleware that ships on by default

The SDK ships exactly one middleware, and it is already on your server's list: the one that
emits an OpenTelemetry span for every message. You don't append it, and most of the time you
don't think about it. It is a no-op until you install an exporter, and it has its own page:
**OpenTelemetry**.

!!! info
If you have written ASGI middleware, you already know this shape. Starlette's
Expand All @@ -121,9 +100,8 @@ line here you should expect to change.
unknown methods) and runs outermost-first.
* `ctx.request_id is None` is how you tell a notification from a request.
* Raise instead of calling `call_next` to refuse one message; the connection survives.
* `OpenTelemetryMiddleware` turns each message into a span (with GenAI attributes on tool
calls and prompt gets) for the price of one `append`, and costs nothing until you install
an exporter.
* The SDK's own OpenTelemetry tracing is a middleware too, already on the list. See
**OpenTelemetry**.
* The whole surface is provisional. Observe with it; don't build on it.

That is everything that wraps a request. **Authorization** is what decides whether the request
107 changes: 107 additions & 0 deletions docs/advanced/opentelemetry.md
@@ -0,0 +1,107 @@
# OpenTelemetry

Your server is already traced. You don't have to add anything.

Every server you create emits an [OpenTelemetry](https://opentelemetry.io/) span for every
message it handles. You didn't write that, and you don't import it. It is there the moment you
call `MCPServer(...)`.

```python title="server.py"
--8<-- "docs_src/opentelemetry/tutorial001.py"
```

That is a complete, traced server. Call `search_books` and a span is created for it. The same is
true for the low-level `Server`: the tracing lives on both.

## What you get

Every inbound message becomes a `SERVER` span named after the method and its target. So a
`tools/call` for `search_books` is the span `tools/call search_books`, and a bare `tools/list`
is just `tools/list`.

Each span carries a few attributes, plus an error status when something fails:

* `mcp.method.name` and `mcp.protocol.version`, on every span.
* `jsonrpc.request.id`, on a request (a notification has none).
* A handler that raises sets the span status to error. So does a tool result with `is_error=True`.

And because tracing a tool call is such a common thing to want, `tools/call` spans speak
OpenTelemetry's [GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/):

* `gen_ai.operation.name`, set to `"execute_tool"`.
* `gen_ai.tool.name`, set to the tool being called.

A `prompts/get` span gets `gen_ai.prompt.name` in the same spirit. The list methods carry no
`gen_ai.*` keys, because there is nothing to name.

!!! tip
    Those GenAI attributes are the reason a tracing UI groups your tool calls the way it groups
    any other agent's. You get that grouping for free, with no extra code.

## It costs nothing until you want it

Here is the part that makes "on by default" a comfortable default.

The SDK depends only on `opentelemetry-api`, the lightweight half of OpenTelemetry. With no SDK
and no exporter installed, creating a span is a no-op. So the spans your server is emitting right
now cost you almost nothing, and nobody is collecting them.

The day you want to *see* them, you install the other half and point it somewhere:

```console
uv add opentelemetry-sdk opentelemetry-exporter-otlp
```

Configure an exporter the usual OpenTelemetry way, and every span the SDK has been quietly
creating lights up. Your server code does not change. Not one line.
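
As one possible version of "the usual OpenTelemetry way", the sketch below registers a tracer provider that prints spans to the console. `configure_tracing` and the `"bookshop"` service name are hypothetical; `TracerProvider`, `BatchSpanProcessor`, `ConsoleSpanExporter`, and `Resource` come from `opentelemetry-sdk`. The import guard is only there so the snippet also runs where the SDK is not installed.

```python
def configure_tracing(service_name: str = "bookshop") -> bool:
    """Register a global tracer provider; return False if the SDK is absent."""
    try:
        from opentelemetry import trace
        from opentelemetry.sdk.resources import Resource
        from opentelemetry.sdk.trace import TracerProvider
        from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
    except ImportError:
        # Only opentelemetry-api (or nothing) is installed: spans stay no-ops.
        return False

    provider = TracerProvider(resource=Resource.create({"service.name": service_name}))
    # ConsoleSpanExporter prints finished spans to stdout; swap in an OTLP
    # exporter to send them to a collector instead.
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    return True


configured = configure_tracing()
```

Run this once at startup, in your entry point rather than in the server module, so the server code itself stays unchanged.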

!!! info
    [Pydantic Logfire](https://logfire.pydantic.dev/) is one such backend, and it does the
    configuration for you: `pip install logfire`, `logfire.configure()`, and your MCP spans show
    up in the live view. It is built on OpenTelemetry, so anything below applies to it too.

## Traces that cross the wire

A trace is most useful when it follows a request from the client into the server, in one
connected picture.

When the client and the server both run the SDK, that connection is automatic. The client injects
the [W3C trace context](https://www.w3.org/TR/trace-context/) into the request, and the server
reads it back out, so the server span nests under the client span in the same trace. This is
[SEP-414](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/414), and you get it without
asking.

If the inbound message carries no trace context, for example a request from a client that is not
the SDK, the server span simply parents to whatever span is already current on the server, rather
than starting a brand-new orphan trace.

## Turning it off

Tracing is a middleware, the first one on your server's list. If you really want a server that
emits no spans, take it off:

```python
from mcp.server._otel import OpenTelemetryMiddleware

mcp._lowlevel_server.middleware[:] = [
    m for m in mcp._lowlevel_server.middleware if not isinstance(m, OpenTelemetryMiddleware)
]
```

!!! warning
    That import has a leading underscore, and that is on purpose. The class is provisional, the
    same way [`Server.middleware`](middleware.md) is provisional, so the import path is something
    you should expect to change. You almost never need this: with no exporter installed the spans
    are free, so the usual answer is to leave them on and not install an exporter.

## Recap

* Every `MCPServer` and every low-level `Server` emits one `SERVER` span per inbound message, out
of the box. You write nothing.
* Spans carry `mcp.method.name` and `mcp.protocol.version`; `tools/call` and `prompts/get` also
carry GenAI attributes so your tool calls group like any other agent's.
* It costs nothing until you install an OpenTelemetry SDK and an exporter, and then it lights up
with no change to your server.
* Client-to-server trace context propagates automatically when both sides run the SDK.

Next, the thing that decides whether a request runs at all: **Authorization**.
4 changes: 2 additions & 2 deletions docs/tutorial/logging.md
Expand Up @@ -64,8 +64,8 @@ went to standard error: the terminal, not the wire.

!!! info
    If what you actually want is *tracing* (every request, how long it took, whether it failed), you
    don't want log lines, you want spans. The SDK ships an `OpenTelemetryMiddleware` for exactly that.
    See **Middleware**.
    don't want log lines, you want spans. Your server already emits them: the SDK traces every
    message with OpenTelemetry out of the box. See **OpenTelemetry**.

## Recap

Empty file.
9 changes: 9 additions & 0 deletions docs_src/opentelemetry/tutorial001.py
@@ -0,0 +1,9 @@
from mcp.server import MCPServer

mcp = MCPServer("Bookshop")


@mcp.tool()
def search_books(query: str) -> str:
    """Search the catalog by title or author."""
    return f"Found 3 books matching {query!r}."
1 change: 1 addition & 0 deletions mkdocs.yml
Expand Up @@ -42,6 +42,7 @@ nav:
- The low-level Server: advanced/low-level-server.md
- Pagination: advanced/pagination.md
- Middleware: advanced/middleware.md
- OpenTelemetry: advanced/opentelemetry.md
- Authorization: advanced/authorization.md
- OAuth clients: advanced/oauth-clients.md
- Session groups: advanced/session-groups.md
6 changes: 5 additions & 1 deletion src/mcp/server/lowlevel/server.py
Expand Up @@ -52,6 +52,7 @@ async def main():
from starlette.routing import Mount, Route
from typing_extensions import TypeVar

from mcp.server._otel import OpenTelemetryMiddleware
from mcp.server.auth.middleware.auth_context import AuthContextMiddleware
from mcp.server.auth.middleware.bearer_auth import BearerAuthBackend, RequireAuthMiddleware
from mcp.server.auth.provider import OAuthAuthorizationServerProvider, TokenVerifier
Expand Down Expand Up @@ -231,10 +232,13 @@ def __init__(
        # Context-tier middleware: wraps every inbound request (including
        # `initialize`, lookup, validation, handler) with
        # `(ctx, call_next)`. Applied in `ServerRunner._on_request`.
        # `OpenTelemetryMiddleware` ships on by default so every server emits a
        # SERVER span per message; it is a no-op until an OTel exporter is
        # installed. Drop it from this list to opt out.
        # TODO(L54): provisional - signature and semantics change with the
        # Context/middleware rework (covariant `Context[L]`, outbound seam) before
        # v2 final.
        self.middleware: list[ServerMiddleware[LifespanResultT]] = []
        self.middleware: list[ServerMiddleware[LifespanResultT]] = [OpenTelemetryMiddleware()]
        logger.debug("Initializing server %r", name)

_spec_requests: list[tuple[str, type[BaseModel], RequestHandler[LifespanResultT, Any] | None]] = [
106 changes: 32 additions & 74 deletions src/mcp/server/runner.py
Expand Up @@ -37,15 +37,13 @@
)
from mcp_types import methods as _methods
from mcp_types.version import HANDSHAKE_PROTOCOL_VERSIONS, LATEST_HANDSHAKE_VERSION, LATEST_MODERN_VERSION
from opentelemetry.trace import SpanKind, StatusCode
from pydantic import BaseModel, ValidationError
from typing_extensions import TypeVar

from mcp.server.connection import Connection
from mcp.server.context import CallNext, HandlerResult, ServerMiddleware, ServerRequestContext
from mcp.server.models import InitializationOptions
from mcp.server.session import ServerSession
from mcp.shared._otel import extract_trace_context, otel_span
from mcp.shared._stream_protocols import ReadStream, WriteStream
from mcp.shared.dispatcher import DispatchContext, Dispatcher, DispatchMiddleware, OnNotify, OnRequest
from mcp.shared.exceptions import MCPError
Expand All @@ -62,7 +60,6 @@
"ServerRunner",
"aclose_shielded",
"modern_on_request",
"otel_middleware",
"serve_connection",
"serve_loop",
"serve_one",
Expand Down Expand Up @@ -91,58 +88,6 @@ def _extract_meta(params: Mapping[str, Any] | None) -> RequestParamsMeta | None:
return None


def otel_middleware(call_next: OnRequest) -> OnRequest:
    """Dispatch-tier middleware that wraps each request in an OpenTelemetry span.

    Mirrors the span shape of the existing `Server._handle_request`: span name
    `"MCP handle <method> [<target>]"`, `mcp.method.name` attribute, W3C
    trace context extracted from `params._meta` (SEP-414), and an ERROR
    status if the handler raises.
    """

    async def wrapped(
        dctx: DispatchContext[TransportContext], method: str, params: Mapping[str, Any] | None
    ) -> dict[str, Any]:
        target: str | None
        match params:
            case {"name": str() as target}:
                pass
            case _:
                target = None
        parent: Any | None
        match params:
            case {"_meta": {**meta}}:
                parent = extract_trace_context(meta)
            case _:
                parent = None
        span_name = f"MCP handle {method}{f' {target}' if target else ''}"
        # `otel_middleware` wraps `on_request` only, so `request_id` is always set.
        attributes = {"mcp.method.name": method, "jsonrpc.request.id": str(dctx.request_id)}
        with otel_span(
            span_name,
            kind=SpanKind.SERVER,
            attributes=attributes,
            context=parent,
            record_exception=False,
            set_status_on_exception=False,
        ) as span:
            try:
                return await call_next(dctx, method, params)
            except MCPError as e:
                span.set_status(StatusCode.ERROR, e.error.message)
                raise
            except ValidationError:
                # Mirror the sanitized wire response; pydantic messages carry client input.
                span.set_status(StatusCode.ERROR, "Invalid request parameters")
                raise
            except Exception as e:
                span.record_exception(e)
                span.set_status(StatusCode.ERROR, str(e))
                raise

    return wrapped


def _dump_result(result: Any) -> dict[str, Any]:
if result is None:
return {}
Expand Down Expand Up @@ -196,7 +141,10 @@ class ServerRunner(Generic[LifespanT]):
    _: KW_ONLY
    init_options: InitializationOptions | None = None
    """`InitializeResult` payload. Defaults to `server.create_initialization_options()`."""
    dispatch_middleware: Sequence[DispatchMiddleware] = (otel_middleware,)
    dispatch_middleware: Sequence[DispatchMiddleware] = ()
    """Raw dispatch-tier wrappers `(dctx, method, params) -> dict`, applied outermost-first
    around `_on_request`. Empty by default; OpenTelemetry tracing lives at the context tier
    (`OpenTelemetryMiddleware`, seeded into `Server.middleware`)."""

@cached_property
def on_request(self) -> OnRequest:
Expand All @@ -223,7 +171,6 @@ async def _on_request(
meta = _extract_meta(params)
version = self.connection.protocol_version
ctx = self._make_context(dctx, method, params, meta, version)
is_spec_method = method in _methods.SPEC_CLIENT_METHODS

async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
# Read method/params off `ctx` so a middleware that rewrote them via
Expand All @@ -242,7 +189,7 @@ async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
# the gate become a per-version legacy path then. Initialize runs inline
# (read loop parked), so awaiting the peer anywhere on this path deadlocks.
if method == "initialize":
return self._handle_initialize(params)
return self._serialize(method, version, self._handle_initialize(params))
# Methods without a handler are METHOD_NOT_FOUND regardless of
# initialization state: JSON-RPC 2.0 reserves -32601 for "not
# available on this server", and clients probing a server before
Expand All @@ -261,25 +208,14 @@ async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
if isinstance(result, ErrorData):
# Raise inside the chain so middleware observes the failure.
raise MCPError.from_error_data(result)
return result
# Dump and serialize inside the chain so the OpenTelemetry span (the
# outermost middleware) records a failing handler return shape too.
return self._serialize(method, version, result)

call = self._compose_server_middleware(_inner)
# `_inner` already produced the wire dict; a middleware that short-circuited
# without `call_next` is trusted to return its own well-formed result.
result = _dump_result(await call(ctx))
# TODO(L56): reject resultType values outside {"complete", "input_required"} unless the
# corresponding extension is in this request's _meta clientCapabilities.extensions; the
# explicit MUST-reject is client-side (basic/index.mdx ResultType), this enforces it proactively.
if is_spec_method:
try:
result = _methods.serialize_server_result(method, version, result)
except KeyError:
# Middleware short-circuited a wrong-version spec method without
# calling `call_next`; it owns the result shape.
pass
except ValidationError:
# Server bug, not client fault. Detail stays in the server log:
# pydantic messages echo the result body.
logger.exception("handler for %r returned an invalid result", method)
raise MCPError(code=INTERNAL_ERROR, message="Handler returned an invalid result") from None
if method == "initialize":
# Commit only on chain success, so a middleware veto leaves no state.
# Race-free: the read loop is parked until this call returns.
Expand Down Expand Up @@ -387,6 +323,28 @@ def _make_context(
close_standalone_sse_stream=close_standalone_sse_stream,
)

    @staticmethod
    def _serialize(method: str, version: str, result: HandlerResult) -> dict[str, Any]:
        """Dump a handler result to the wire dict, serializing spec methods.

        Runs inside the middleware chain so the OpenTelemetry span observes a
        failing return shape (unsupported type, malformed spec result) as an
        error rather than closing on a request that the client sees fail.
        """
        dumped = _dump_result(result)
        # TODO(L56): reject resultType values outside {"complete", "input_required"} unless the
        # corresponding extension is in this request's _meta clientCapabilities.extensions; the
        # explicit MUST-reject is client-side (basic/index.mdx ResultType), this enforces it proactively.
        if method not in _methods.SPEC_CLIENT_METHODS:
            return dumped
        try:
            return _methods.serialize_server_result(method, version, dumped)
        except ValidationError:
            # Server bug, not client fault. Detail stays in the server log:
            # pydantic messages echo the result body.
            logger.exception("handler for %r returned an invalid result", method)
            raise MCPError(code=INTERNAL_ERROR, message="Handler returned an invalid result") from None

@staticmethod
def _negotiate_initialize(params: Mapping[str, Any] | None) -> tuple[InitializeRequestParams, str]:
"""Validate `initialize` params and pick the protocol version."""