modelcontextprotocol · Kludex · Jun 26, 2026
diff --git a/docs/advanced/middleware.md b/docs/advanced/middleware.md
@@ -79,33 +79,12 @@ In increasing order of how much you should hesitate:
     an elicitation) while handling `initialize` therefore **deadlocks the connection**: the
     response you are waiting for can never be read. Fire-and-forget notifications are fine.
 
-## `OpenTelemetryMiddleware`
-
-The SDK ships one middleware: `OpenTelemetryMiddleware`. Construct it and append it
-(`server.middleware.append(OpenTelemetryMiddleware())`), exactly the line you already wrote
-for `log_timing`.
-
-Every inbound message becomes a `SERVER` span named after the method and its target, so a
-`tools/call` for `search_books` is the span `tools/call search_books`.
-
-* Every span carries `mcp.method.name` and `mcp.protocol.version`; a request's span also
-  carries its JSON-RPC request id (a notification has none).
-* A `tools/call` span gets OpenTelemetry's GenAI semantic conventions,
-  `gen_ai.operation.name` (`"execute_tool"`) and `gen_ai.tool.name`, so a tracing UI groups
-  your tool calls the way it groups any other agent's. A `prompts/get` span gets
-  `gen_ai.prompt.name`. The list methods carry no `gen_ai.*` keys.
-* A handler that raises sets the span's status to error. So does a tool result with
-  `is_error=True`.
-
-!!! tip
-    The SDK depends only on `opentelemetry-api`. With no exporter installed those spans are
-    no-ops, so appending this middleware costs you nothing. Install `opentelemetry-sdk` plus an
-    exporter and everything lights up, with no server change.
-
-The import is the catch. The class lives at `from mcp.server._otel import OpenTelemetryMiddleware`
-today, and the leading underscore is not an accident: it is the same provisional flag this whole
-page opened with. The SDK has not given it a public spelling yet, so the import path is the one
-line here you should expect to change.
+## The one middleware that ships on by default
+
+The SDK ships exactly one middleware, and it is already on your server's list: the one that
+emits an OpenTelemetry span for every message. You don't append it, and most of the time you
+don't think about it. It is a no-op until you install an exporter, and it has its own page:
+**OpenTelemetry**.
 
 !!! info
     If you have written ASGI middleware, you already know this shape. Starlette's
@@ -121,9 +100,8 @@ line here you should expect to change.
   unknown methods) and runs outermost-first.
 * `ctx.request_id is None` is how you tell a notification from a request.
 * Raise instead of calling `call_next` to refuse one message; the connection survives.
-* `OpenTelemetryMiddleware` turns each message into a span (with GenAI attributes on tool
-  calls and prompt gets) for the price of one `append`, and costs nothing until you install
-  an exporter.
+* The SDK's own OpenTelemetry tracing is a middleware too, already on the list. See
+  **OpenTelemetry**.
 * The whole surface is provisional. Observe with it; don't build on it.
 
 That is everything that wraps a request. **Authorization** is what decides whether the request
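
A minimal sketch of the recap bullets above, assuming nothing beyond the `(ctx, call_next)` shape the page describes. `Ctx`, `compose`, `tracing`, and `refuse_admin` are hypothetical stand-ins rather than SDK names, and the SDK's real composition lives in `ServerRunner` and may differ in detail. It shows the first middleware on the list running outermost, and a middleware refusing one message by raising instead of calling `call_next`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Optional


@dataclass
class Ctx:
    """Hypothetical stand-in for the SDK's request context."""

    method: str
    request_id: Optional[int] = None  # None marks a notification
    trail: list = field(default_factory=list)  # records the call order


CallNext = Callable[[Ctx], Awaitable[Any]]
Middleware = Callable[[Ctx, CallNext], Awaitable[Any]]


def _bind(mw: Middleware, call_next: CallNext) -> CallNext:
    """Fix `call_next` for one middleware so it looks like a plain handler."""

    async def bound(ctx: Ctx) -> Any:
        return await mw(ctx, call_next)

    return bound


def compose(middleware: list, handler: CallNext) -> CallNext:
    """Wrap `handler` so that `middleware[0]` runs outermost."""
    call = handler
    for mw in reversed(middleware):
        call = _bind(mw, call)
    return call


async def tracing(ctx: Ctx, call_next: CallNext) -> Any:
    # Stand-in for the default tracing middleware: it sees every outcome.
    ctx.trail.append("tracing:enter")
    try:
        return await call_next(ctx)
    finally:
        ctx.trail.append("tracing:exit")


async def refuse_admin(ctx: Ctx, call_next: CallNext) -> Any:
    ctx.trail.append("guard:enter")
    if ctx.method.startswith("admin/"):
        # Raising refuses this one message; nothing further down the chain runs.
        raise PermissionError(f"{ctx.method} is not allowed")
    return await call_next(ctx)


async def handler(ctx: Ctx) -> dict:
    ctx.trail.append("handler")
    kind = "notification" if ctx.request_id is None else "request"
    return {"handled": ctx.method, "kind": kind}


chain = compose([tracing, refuse_admin], handler)

ok_ctx = Ctx("tools/list", request_id=1)
ok_result = asyncio.run(chain(ok_ctx))

refused_ctx = Ctx("admin/reset", request_id=2)
try:
    asyncio.run(chain(refused_ctx))
    refused = False
except PermissionError:
    refused = True
```

Because `tracing` is first on the list, it wraps the guard as well as the handler, so it observes the refusal too. That is the property the default-on tracing middleware relies on when it sits at the front of `Server.middleware`.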

diff --git a/docs/advanced/opentelemetry.md b/docs/advanced/opentelemetry.md
@@ -0,0 +1,107 @@
+# OpenTelemetry
+
+Your server is already traced. You don't have to add anything.
+
+Every server you create emits an [OpenTelemetry](https://opentelemetry.io/) span for every
+message it handles. You didn't write that, and you don't import it. It is there the moment you
+call `MCPServer(...)`.
+
+```python title="server.py"
+--8<-- "docs_src/opentelemetry/tutorial001.py"
+```
+
+That is a complete, traced server. Call `search_books` and a span is created for it. The same is
+true for the low-level `Server`: the tracing lives on both.
+
+## What you get
+
+Every inbound message becomes a `SERVER` span named after the method and its target. So a
+`tools/call` for `search_books` is the span `tools/call search_books`, and a bare `tools/list`
+is just `tools/list`.
+
+Each span carries a few attributes, and its status reflects failure:
+
+* `mcp.method.name` and `mcp.protocol.version`, on every span.
+* `jsonrpc.request.id`, on a request (a notification has none).
+* An error status when the handler raises, or when a tool result has `is_error=True`.
+
+And because tracing a tool call is such a common thing to want, `tools/call` spans speak
+OpenTelemetry's [GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/):
+
+* `gen_ai.operation.name`, set to `"execute_tool"`.
+* `gen_ai.tool.name`, set to the tool being called.
+
+A `prompts/get` span gets `gen_ai.prompt.name` in the same spirit. The list methods carry no
+`gen_ai.*` keys, because there is nothing to name.
+
+!!! tip
+    Those GenAI attributes are the reason a tracing UI groups your tool calls the way it groups
+    any other agent's. You get that grouping for free, with no extra code.
+
+## It costs nothing until you want it
+
+Here is the part that makes "on by default" a comfortable default.
+
+The SDK depends only on `opentelemetry-api`, the lightweight half of OpenTelemetry. With no
+OpenTelemetry SDK and no exporter installed, creating a span is a no-op. So the spans your server
+is emitting right now cost you almost nothing, and nobody is collecting them.
+
+The day you want to *see* them, you install the other half and point it somewhere:
+
+```console
+uv add opentelemetry-sdk opentelemetry-exporter-otlp
+```
+
+Configure an exporter the usual OpenTelemetry way, and every span the SDK has been quietly
+creating lights up. Your server code does not change. Not one line.
+
+!!! info
+    [Pydantic Logfire](https://logfire.pydantic.dev/) is one such backend, and it does the
+    configuration for you: `pip install logfire`, `logfire.configure()`, and your MCP spans show
+    up in the live view. It is built on OpenTelemetry, so anything below applies to it too.
+
+## Traces that cross the wire
+
+A trace is most useful when it follows a request from the client into the server, in one
+connected picture.
+
+When the client and the server both run the SDK, that connection is automatic. The client injects
+the [W3C trace context](https://www.w3.org/TR/trace-context/) into the request, and the server
+reads it back out, so the server span nests under the client span in the same trace. This is
+[SEP-414](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/414), and you get it without
+asking.
+
+If the inbound message carries no trace context, for example a request from a client that is not
+the SDK, the server span simply parents to whatever span is already current on the server, rather
+than starting a brand-new orphan trace.
+
+## Turning it off
+
+Tracing is a middleware, the first one on your server's list. If you really want a server that
+emits no spans, take it off:
+
+```python
+from mcp.server._otel import OpenTelemetryMiddleware
+
+mcp._lowlevel_server.middleware[:] = [
+    m for m in mcp._lowlevel_server.middleware if not isinstance(m, OpenTelemetryMiddleware)
+]
+```
+
+!!! warning
+    That import has a leading underscore, and that is on purpose. The class is provisional, the
+    same way [`Server.middleware`](middleware.md) is provisional, so the import path is something
+    you should expect to change. You almost never need this: with no exporter installed the spans
+    are free, so the usual answer is to leave them on and not install an exporter.
+
+## Recap
+
+* Every `MCPServer` and every low-level `Server` emits one `SERVER` span per inbound message, out
+  of the box. You write nothing.
+* Spans carry `mcp.method.name` and `mcp.protocol.version`; `tools/call` and `prompts/get` also
+  carry GenAI attributes so your tool calls group like any other agent's.
+* It costs nothing until you install an OpenTelemetry SDK and an exporter, and then it lights up
+  with no change to your server.
+* Client-to-server trace context propagates automatically when both sides run the SDK.
+
+Next, the thing that decides whether a request runs at all: **Authorization**.
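
The span shape documented in "What you get" can be restated as two small pure functions. This is only an illustration of the documented names and attributes: `span_name` and `span_attributes` are hypothetical helpers, not SDK functions, and reading the target from `params["name"]` is an assumption carried over from the dispatch-tier middleware this PR removes, since the body of `OpenTelemetryMiddleware` is not part of this diff. The protocol version string is an example value.

```python
from typing import Any, Mapping, Optional


def span_name(method: str, params: Optional[Mapping[str, Any]]) -> str:
    """Return `<method> <target>` when the params name a target, else `<method>`."""
    target = params.get("name") if params else None
    return f"{method} {target}" if isinstance(target, str) else method


def span_attributes(
    method: str,
    params: Optional[Mapping[str, Any]],
    request_id: Optional[int],
    protocol_version: str,
) -> dict:
    """Build the attribute set the docs describe for one inbound message."""
    attrs = {
        "mcp.method.name": method,
        "mcp.protocol.version": protocol_version,
    }
    if request_id is not None:
        # Notifications have no id, so the key is simply absent for them.
        attrs["jsonrpc.request.id"] = str(request_id)
    name = params.get("name") if params else None
    if isinstance(name, str):
        if method == "tools/call":
            attrs["gen_ai.operation.name"] = "execute_tool"
            attrs["gen_ai.tool.name"] = name
        elif method == "prompts/get":
            attrs["gen_ai.prompt.name"] = name
    return attrs


call_params = {"name": "search_books", "arguments": {"query": "dune"}}
call_name = span_name("tools/call", call_params)
call_attrs = span_attributes("tools/call", call_params, 7, "2025-06-18")
list_name = span_name("tools/list", None)
list_attrs = span_attributes("tools/list", None, 8, "2025-06-18")
note_attrs = span_attributes("notifications/initialized", None, None, "2025-06-18")
```

Here `call_name` is `tools/call search_books` and `list_name` is the bare `tools/list`, matching the examples in the page. Only the `tools/call` attributes contain `gen_ai.*` keys.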
diff --git a/docs/tutorial/logging.md b/docs/tutorial/logging.md
@@ -64,8 +64,8 @@ went to standard error: the terminal, not the wire.
 
 !!! info
     If what you actually want is *tracing* (every request, how long it took, whether it failed), you
-    don't want log lines, you want spans. The SDK ships an `OpenTelemetryMiddleware` for exactly that.
-    See **Middleware**.
+    don't want log lines, you want spans. Your server already emits them: the SDK traces every
+    message with OpenTelemetry out of the box. See **OpenTelemetry**.
 
 ## Recap
 

diff --git a/docs_src/opentelemetry/__init__.py b/docs_src/opentelemetry/__init__.py
diff --git a/docs_src/opentelemetry/tutorial001.py b/docs_src/opentelemetry/tutorial001.py
@@ -0,0 +1,9 @@
+from mcp.server import MCPServer
+
+mcp = MCPServer("Bookshop")
+
+
+@mcp.tool()
+def search_books(query: str) -> str:
+    """Search the catalog by title or author."""
+    return f"Found 3 books matching {query!r}."
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -42,6 +42,7 @@ nav:
       - The low-level Server: advanced/low-level-server.md
       - Pagination: advanced/pagination.md
       - Middleware: advanced/middleware.md
+      - OpenTelemetry: advanced/opentelemetry.md
       - Authorization: advanced/authorization.md
       - OAuth clients: advanced/oauth-clients.md
       - Session groups: advanced/session-groups.md

diff --git a/src/mcp/server/lowlevel/server.py b/src/mcp/server/lowlevel/server.py
@@ -52,6 +52,7 @@ async def main():
 from starlette.routing import Mount, Route
 from typing_extensions import TypeVar
 
+from mcp.server._otel import OpenTelemetryMiddleware
 from mcp.server.auth.middleware.auth_context import AuthContextMiddleware
 from mcp.server.auth.middleware.bearer_auth import BearerAuthBackend, RequireAuthMiddleware
 from mcp.server.auth.provider import OAuthAuthorizationServerProvider, TokenVerifier
@@ -231,10 +232,13 @@ def __init__(
         # Context-tier middleware: wraps every inbound request (including
         # `initialize`, lookup, validation, handler) with
         # `(ctx, call_next)`. Applied in `ServerRunner._on_request`.
+        # `OpenTelemetryMiddleware` ships on by default so every server emits a
+        # SERVER span per message; it is a no-op until an OTel exporter is
+        # installed. Drop it from this list to opt out.
         # TODO(L54): provisional - signature and semantics change with the
         # Context/middleware rework (covariant `Context[L]`, outbound seam) before
         # v2 final.
-        self.middleware: list[ServerMiddleware[LifespanResultT]] = []
+        self.middleware: list[ServerMiddleware[LifespanResultT]] = [OpenTelemetryMiddleware()]
         logger.debug("Initializing server %r", name)
 
         _spec_requests: list[tuple[str, type[BaseModel], RequestHandler[LifespanResultT, Any] | None]] = [
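
Seeding the tracing middleware at the front of `Server.middleware` is what motivates the `runner.py` change in this PR, which moves result serialization inside the middleware chain. The toy model below is a sketch of that reasoning using only the standard library. `serialize`, `InvalidResult`, and `make_span_middleware` are hypothetical stand-ins, not the SDK's `_serialize`, `MCPError`, or `OpenTelemetryMiddleware`. It compares what an outermost, span-like middleware records when an invalid handler result is rejected after the chain returns versus inside it.

```python
import asyncio


class InvalidResult(Exception):
    """Stand-in for the error raised when a handler result cannot be serialized."""


def serialize(result):
    # Stand-in for wire serialization: only dicts are accepted here.
    if not isinstance(result, dict):
        raise InvalidResult("Handler returned an invalid result")
    return result


def make_span_middleware(statuses):
    """Build an outermost middleware that records 'ok' or 'error', like a span status."""

    async def span_middleware(ctx, call_next):
        try:
            result = await call_next(ctx)
        except Exception:
            statuses.append("error")
            raise
        statuses.append("ok")
        return result

    return span_middleware


async def bad_handler(ctx):
    return object()  # a return value that cannot be serialized


async def serialize_outside(statuses):
    # Old ordering: the chain finishes first, then the result is serialized.
    middleware = make_span_middleware(statuses)
    result = await middleware({}, bad_handler)
    return serialize(result)


async def serialize_inside(statuses):
    # New ordering: serialization happens in the innermost step of the chain.
    middleware = make_span_middleware(statuses)

    async def inner(ctx):
        return serialize(await bad_handler(ctx))

    return await middleware({}, inner)


def run(variant):
    statuses = []
    try:
        asyncio.run(variant(statuses))
        failed = False
    except InvalidResult:
        failed = True
    return statuses, failed


outside = run(serialize_outside)
inside = run(serialize_inside)
```

In both variants the caller sees the request fail. With serialization outside the chain the span-like middleware has already recorded `ok`; with serialization inside the chain it records `error`, which is the behavior the new `_serialize` docstring describes.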

diff --git a/src/mcp/server/runner.py b/src/mcp/server/runner.py
@@ -37,15 +37,13 @@
 )
 from mcp_types import methods as _methods
 from mcp_types.version import HANDSHAKE_PROTOCOL_VERSIONS, LATEST_HANDSHAKE_VERSION, LATEST_MODERN_VERSION
-from opentelemetry.trace import SpanKind, StatusCode
 from pydantic import BaseModel, ValidationError
 from typing_extensions import TypeVar
 
 from mcp.server.connection import Connection
 from mcp.server.context import CallNext, HandlerResult, ServerMiddleware, ServerRequestContext
 from mcp.server.models import InitializationOptions
 from mcp.server.session import ServerSession
-from mcp.shared._otel import extract_trace_context, otel_span
 from mcp.shared._stream_protocols import ReadStream, WriteStream
 from mcp.shared.dispatcher import DispatchContext, Dispatcher, DispatchMiddleware, OnNotify, OnRequest
 from mcp.shared.exceptions import MCPError
@@ -62,7 +60,6 @@
     "ServerRunner",
     "aclose_shielded",
     "modern_on_request",
-    "otel_middleware",
     "serve_connection",
     "serve_loop",
     "serve_one",
@@ -91,58 +88,6 @@ def _extract_meta(params: Mapping[str, Any] | None) -> RequestParamsMeta | None:
         return None
 
 
-def otel_middleware(call_next: OnRequest) -> OnRequest:
-    """Dispatch-tier middleware that wraps each request in an OpenTelemetry span.
-
-    Mirrors the span shape of the existing `Server._handle_request`: span name
-    `"MCP handle <method> [<target>]"`, `mcp.method.name` attribute, W3C
-    trace context extracted from `params._meta` (SEP-414), and an ERROR
-    status if the handler raises.
-    """
-
-    async def wrapped(
-        dctx: DispatchContext[TransportContext], method: str, params: Mapping[str, Any] | None
-    ) -> dict[str, Any]:
-        target: str | None
-        match params:
-            case {"name": str() as target}:
-                pass
-            case _:
-                target = None
-        parent: Any | None
-        match params:
-            case {"_meta": {**meta}}:
-                parent = extract_trace_context(meta)
-            case _:
-                parent = None
-        span_name = f"MCP handle {method}{f' {target}' if target else ''}"
-        # `otel_middleware` wraps `on_request` only, so `request_id` is always set.
-        attributes = {"mcp.method.name": method, "jsonrpc.request.id": str(dctx.request_id)}
-        with otel_span(
-            span_name,
-            kind=SpanKind.SERVER,
-            attributes=attributes,
-            context=parent,
-            record_exception=False,
-            set_status_on_exception=False,
-        ) as span:
-            try:
-                return await call_next(dctx, method, params)
-            except MCPError as e:
-                span.set_status(StatusCode.ERROR, e.error.message)
-                raise
-            except ValidationError:
-                # Mirror the sanitized wire response; pydantic messages carry client input.
-                span.set_status(StatusCode.ERROR, "Invalid request parameters")
-                raise
-            except Exception as e:
-                span.record_exception(e)
-                span.set_status(StatusCode.ERROR, str(e))
-                raise
-
-    return wrapped
-
-
 def _dump_result(result: Any) -> dict[str, Any]:
     if result is None:
         return {}
@@ -196,7 +141,10 @@ class ServerRunner(Generic[LifespanT]):
     _: KW_ONLY
     init_options: InitializationOptions | None = None
     """`InitializeResult` payload. Defaults to `server.create_initialization_options()`."""
-    dispatch_middleware: Sequence[DispatchMiddleware] = (otel_middleware,)
+    dispatch_middleware: Sequence[DispatchMiddleware] = ()
+    """Raw dispatch-tier wrappers `(dctx, method, params) -> dict`, applied outermost-first
+    around `_on_request`. Empty by default; OpenTelemetry tracing lives at the context tier
+    (`OpenTelemetryMiddleware`, seeded into `Server.middleware`)."""
 
     @cached_property
     def on_request(self) -> OnRequest:
@@ -223,7 +171,6 @@ async def _on_request(
         meta = _extract_meta(params)
         version = self.connection.protocol_version
         ctx = self._make_context(dctx, method, params, meta, version)
-        is_spec_method = method in _methods.SPEC_CLIENT_METHODS
 
         async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
             # Read method/params off `ctx` so a middleware that rewrote them via
@@ -242,7 +189,7 @@ async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
             # the gate become a per-version legacy path then. Initialize runs inline
             # (read loop parked), so awaiting the peer anywhere on this path deadlocks.
             if method == "initialize":
-                return self._handle_initialize(params)
+                return self._serialize(method, version, self._handle_initialize(params))
             # Methods without a handler are METHOD_NOT_FOUND regardless of
             # initialization state: JSON-RPC 2.0 reserves -32601 for "not
             # available on this server", and clients probing a server before
@@ -261,25 +208,14 @@ async def _inner(ctx: ServerRequestContext[LifespanT, Any]) -> HandlerResult:
             if isinstance(result, ErrorData):
                 # Raise inside the chain so middleware observes the failure.
                 raise MCPError.from_error_data(result)
-            return result
+            # Dump and serialize inside the chain so the OpenTelemetry span (the
+            # outermost middleware) also records an invalid handler result as an error.
+            return self._serialize(method, version, result)
 
         call = self._compose_server_middleware(_inner)
+        # `_inner` already produced the wire dict; a middleware that short-circuited
+        # without `call_next` is trusted to return its own well-formed result.
         result = _dump_result(await call(ctx))
-        # TODO(L56): reject resultType values outside {"complete", "input_required"} unless the
-        # corresponding extension is in this request's _meta clientCapabilities.extensions; the
-        # explicit MUST-reject is client-side (basic/index.mdx ResultType), this enforces it proactively.
-        if is_spec_method:
-            try:
-                result = _methods.serialize_server_result(method, version, result)
-            except KeyError:
-                # Middleware short-circuited a wrong-version spec method without
-                # calling `call_next`; it owns the result shape.
-                pass
-            except ValidationError:
-                # Server bug, not client fault. Detail stays in the server log:
-                # pydantic messages echo the result body.
-                logger.exception("handler for %r returned an invalid result", method)
-                raise MCPError(code=INTERNAL_ERROR, message="Handler returned an invalid result") from None
         if method == "initialize":
             # Commit only on chain success, so a middleware veto leaves no state.
             # Race-free: the read loop is parked until this call returns.
@@ -387,6 +323,28 @@ def _make_context(
             close_standalone_sse_stream=close_standalone_sse_stream,
         )
 
+    @staticmethod
+    def _serialize(method: str, version: str, result: HandlerResult) -> dict[str, Any]:
+        """Dump a handler result to the wire dict, serializing spec methods.
+
+        Runs inside the middleware chain so the OpenTelemetry span records an
+        invalid result (unsupported type, malformed spec result) as an error,
+        instead of closing without error on a request that the client sees fail.
+        """
+        dumped = _dump_result(result)
+        # TODO(L56): reject resultType values outside {"complete", "input_required"} unless the
+        # corresponding extension is in this request's _meta clientCapabilities.extensions; the
+        # explicit MUST-reject is client-side (basic/index.mdx ResultType), this enforces it proactively.
+        if method not in _methods.SPEC_CLIENT_METHODS:
+            return dumped
+        try:
+            return _methods.serialize_server_result(method, version, dumped)
+        except ValidationError:
+            # Server bug, not client fault. Detail stays in the server log:
+            # pydantic messages echo the result body.
+            logger.exception("handler for %r returned an invalid result", method)
+            raise MCPError(code=INTERNAL_ERROR, message="Handler returned an invalid result") from None
+
     @staticmethod
     def _negotiate_initialize(params: Mapping[str, Any] | None) -> tuple[InitializeRequestParams, str]:
         """Validate `initialize` params and pick the protocol version."""