Figure 1. Boundary Protocol Versus Native Runtime
The architecture is intentionally split. MCP remains the outer compatibility contract; ZCP changes the native execution contract inside the same backend.
1. Problem Statement
MCP is an interoperability protocol. Its job is to let tools, resources, prompts, and transports be described in a shared way across hosts and clients. The official Python SDK reflects that goal directly: it builds tool contracts from Python function signatures and serializes them as JSON Schema.
That solves the boundary problem, but it does not solve the model execution problem. The model still pays for every visible tool, every repeated schema field, every prompt-visible result replay, and every loop where runtime state is simulated inside natural language or repeated tool polling.
The right comparison is therefore not 'which protocol can express a tool call?' Both can. The right comparison is 'what does the model need to reason over per turn?' ZCP wins only when the answer to that question becomes smaller.
2. Why ZCP Uses Fewer Tokens
Token cost comes from four recurring sources. First, the model is shown too many tools. Second, each tool is described with too much schema detail relative to the task at hand. Third, large results are replayed into later turns. Fourth, background or long-running state is not held by the runtime, so the model keeps reconstructing it through repeated calls and explanations.
The MCP default path amplifies those four costs because the public tool contract is also the default model-facing contract. `Tool.from_function(...)` creates `parameters = arg_model.model_json_schema(by_alias=True)`, `list_tools()` returns every registered tool with `input_schema` and `output_schema`, and `_handle_call_tool()` turns outputs back into `CallToolResult(content=..., structured_content=...)`.
ZCP reduces those costs by moving policy into the runtime. Tool discovery can be cut down before the first turn. Result values can be represented as `scalar` or as `handle + summary` rather than replaying full payloads. Task state can live in `TaskManager` instead of being re-encoded into prompt-visible loops. Once that chain is visible in code, the benchmark results stop being mysterious.
- Fewer visible tools means lower branch factor.
- Smaller registry subsets mean less repeated schema payload.
- Handles keep large artifacts out of subsequent turns.
- Tasks keep long-running state out of the prompt.
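The first two bullets can be made concrete with a few lines of arithmetic. This is an illustrative sketch, not ZCP code: the registry contents are invented, and serialized length stands in as a rough proxy for token count.

```python
import json

# Hypothetical registry: 40 tools, each carrying a verbose JSON Schema.
registry = {
    f"sheet.tool_{i}": {
        "type": "object",
        "properties": {
            "range": {"type": "string", "description": "A1-style range"},
            "values": {"type": "array", "description": "cell values"},
        },
        "required": ["range"],
    }
    for i in range(40)
}

def contract_chars(tools: dict) -> int:
    # Serialized length as a rough proxy for per-turn prompt payload.
    return len(json.dumps(tools))

full = contract_chars(registry)                            # every tool, every schema
subset = contract_chars(dict(list(registry.items())[:5]))  # profile-filtered view
assert subset * 4 < full  # a 5-of-40 subset carries far less repeated schema text
```

The exact numbers do not matter; the point is that the schema payload scales with the number of visible tools, and discovery filtering attacks that term directly.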
3. Canonical Runtime And Context Contract
The decisive ZCP move is architectural: the public MCP-compatible surface is not the native runtime. The native runtime is defined around canonical objects such as `ToolDefinition`, `SessionState`, `CallRequest`, `CallResult`, and `HandleRef` in `src/zcp/canonical_protocol.py`.
Those types store information that matters for model execution but is not central in a schema-first design: `output_mode`, `handle_kind`, `defaults`, `flags`, registry hashes, current tool subset, and live handle references. This is not a naming change. It is a different execution contract.
Because the runtime is canonical first, the same backend can be projected outward in two directions. `/mcp` preserves compatibility. `/zcp` preserves the same business logic but changes discovery, calling discipline, result shape, and state handling. That is why ZCP can keep compatibility without forcing native clients to inherit all of the compatibility surface cost.
4. JSON Schema At The Edge, Not At The Center
ZCP does not literally delete JSON Schema. It still validates arguments and can still compile strict schemas for providers such as OpenAI. The key change is that JSON Schema stops being the primary native planning artifact.
In MCP, schema generation is upstream and central. `func_metadata(...)` builds Pydantic models, `model_json_schema()` becomes the tool contract, and the default `list_tools()` response exposes those schemas directly. In other words, the same rich schema object acts as registration metadata, transport payload, and the model-facing description.
In ZCP, schema becomes one field inside a richer canonical object. `ToolDefinition` still keeps `input_schema`, but the native runtime can reason in terms of tool ids, subsets, handles, and output modes. `OpenAIStrictSchemaCompiler` is then used at the adapter boundary to compile the currently selected `RegistryView` into provider-specific strict function tools only when that provider needs it.
That is the precise meaning of 'de-centering JSON Schema'. The schema is retained for validation and adapters, but it is no longer the sole object around which the whole runtime is organized.
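A minimal sketch of what edge compilation means in practice. All names here (`compile_for_openai`, the canonical record shape, the cache) are invented stand-ins, not ZCP's API: the schema rides along as one field, and a provider-shaped tool list is built and cached only when a provider asks for it.

```python
canonical = [
    {
        "tool_id": "sheet.read",
        "alias": "read_range",
        "output_mode": "handle",
        "input_schema": {
            "type": "object",
            "properties": {"range": {"type": "string"}},
            "required": ["range"],
        },
    },
]

_cache: dict[tuple, list] = {}

def compile_for_openai(view: list[dict], strict: bool = True) -> list[dict]:
    # Compile (and cache) provider tools only for the current registry view.
    key = (tuple(t["tool_id"] for t in view), strict)
    if key not in _cache:
        _cache[key] = [
            {
                "type": "function",
                "name": t["alias"],
                "parameters": (
                    {**t["input_schema"], "additionalProperties": False}
                    if strict else t["input_schema"]
                ),
            }
            for t in view
        ]
    return _cache[key]

tools = compile_for_openai(canonical)
assert tools[0]["parameters"]["additionalProperties"] is False
assert compile_for_openai(canonical) is tools  # second call hits the cache
```

The design choice this illustrates: the schema is derived output at the boundary, not the object the native runtime plans over.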
5. How To Read The Figures And Tables
Figure 1 is an architecture boundary diagram. It shows where compatibility lives and where optimization lives. The point of that figure is to make clear that ZCP does not fork business logic; it forks the model-facing execution contract.
Figure 2 is a causal token diagram. It traces where token cost is created: full schema exposure, broad planning, and result replay on the MCP-compatible path; filtered discovery, staged planning, and compact result propagation on the native path.
Table 1 is a token-cost source map. It is not a benchmark table. It tells you which mechanism removes which cost. The code-level table in Section 7 maps official MCP implementation files to ZCP implementation files. Tables 2 and 3 are empirical: they show the overall benchmark and the tier breakdown that follow from those architectural choices.
6. Causal Mechanism
The benchmark only makes sense if the following mechanism chain is true. Each step removes one class of prompt-visible waste.
Step 1. Discovery is narrowed before planning starts
MCP-style servers usually expose a flat tool inventory. ZCP lets the native client request `profile="semantic-workflow"` and also filter by `groups` and `stages`. The model therefore begins planning inside a smaller action space.
Step 2. Call policy matches discovery policy
A filtered `tools/list` is meaningless if `tools/call` can still invoke the whole registry. ZCP keeps `enforce_tool_visibility_on_call`, so the model cannot silently escape the current exposure policy.
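The enforcement rule is simple enough to sketch in full. The names below (`VisibilityError`, `call_tool`, `enforce`) are illustrative stand-ins for ZCP's `enforce_tool_visibility_on_call`, not its actual API.

```python
class VisibilityError(Exception):
    pass

def call_tool(name: str, exposed: set[str], registry: dict, enforce: bool = True):
    if enforce and name not in exposed:
        # Filtered discovery is only meaningful if execution honors the
        # same subset; otherwise the model can quietly widen its surface.
        raise VisibilityError(f"{name} is registered but not exposed")
    return registry[name]()

registry = {"read_cell": lambda: "A1=42", "delete_sheet": lambda: "deleted"}
exposed = {"read_cell"}

assert call_tool("read_cell", exposed, registry) == "A1=42"
blocked = False
try:
    call_tool("delete_sheet", exposed, registry)
except VisibilityError:
    blocked = True
assert blocked
```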
Step 3. Schema compilation is delayed and scoped
In MCP, JSON Schema is generated at registration time and then travels with every tool definition. In ZCP, the OpenAI adapter compiles strict schemas only for the selected `RegistryView` and only when the provider requires them.
Step 4. Results stop replaying whole artifacts
The canonical runtime checks `output_mode`, `inline_ok`, and value size. Small values remain `scalar`; larger values become `HandleRef + summary`. That changes the next prompt turn from 'repeat the full object' to 'continue from a compact reference'.
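A toy version of that decision rule, with an assumed size threshold; the real `HandleRef` and `HandleStore` carry more fields than this sketch.

```python
import uuid

HANDLE_THRESHOLD = 200  # chars; an assumed cutoff for inlining
store: dict[str, object] = {}  # toy stand-in for ZCP's HandleStore

def build_result(value):
    text = str(value)
    if len(text) <= HANDLE_THRESHOLD:
        return {"scalar": value}                 # small: stays prompt-visible
    handle_id = f"h-{uuid.uuid4().hex[:8]}"
    store[handle_id] = value                     # full artifact stays runtime-side
    return {"handle": handle_id, "summary": text[:80] + "..."}

small = build_result(42)
big = build_result([["row", i] for i in range(500)])
assert "scalar" in small
assert "handle" in big and big["handle"] in store
```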
Step 5. Long-running state leaves the prompt loop
Tasks, handles, progress, and status updates become runtime state. The model no longer needs to keep reconstructing partially completed work by re-reading large tool outputs or repeatedly polling generic tools.
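A toy stand-in for the idea (not ZCP's actual `TaskManager` API): detail accumulates runtime-side, while the model only ever sees a short status line.

```python
class TaskManager:
    def __init__(self):
        self.tasks: dict[str, dict] = {}

    def start(self, task_id: str, total: int):
        self.tasks[task_id] = {"done": 0, "total": total, "log": []}

    def advance(self, task_id: str, note: str):
        t = self.tasks[task_id]
        t["done"] += 1
        t["log"].append(note)  # detail stays in the runtime, not the prompt

    def status_line(self, task_id: str) -> str:
        t = self.tasks[task_id]
        return f"{task_id}: {t['done']}/{t['total']}"

tm = TaskManager()
tm.start("import-q3", total=3)
tm.advance("import-q3", "parsed 10_000 rows")
assert tm.status_line("import-q3") == "import-q3: 1/3"
```

The token consequence is the compact status line: each turn carries a reference to progress, not a transcript of it.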
Step 6. Semantic tools compress primitive plans
Once a server also offers workflow-level tools, the model no longer has to plan at the lowest possible mutation granularity. That is why the biggest gains appear in Tier B, C, and D rather than in one-shot Tier A calls.
7. Code-Level Comparison
This table compares the official MCP Python SDK implementation style with the local ZCP runtime implementation. The point is not rhetorical; it is to show where each design places state, schemas, and planning constraints.
| Concern | MCP implementation | ZCP implementation | Token consequence |
|---|---|---|---|
| Primary contract object | `src/mcp/types/_types.py::Tool` centers the public contract on `input_schema`, `output_schema`, and `CallToolResult(content, structured_content)`. | `src/zcp/canonical_protocol.py::ToolDefinition` and `SessionState` center the runtime on tool ids, subset hashes, output modes, handles, defaults, flags, and metadata. | More state is held by the runtime instead of being reconstructed by the model every turn. |
| Schema generation | `src/mcp/server/mcpserver/tools/base.py::Tool.from_function` calls `arg_model.model_json_schema(by_alias=True)` at registration time. | `src/zcp/adapters/openai.py::compile_openai_tools` compiles strict schemas only for the selected `RegistryView`, and only when the adapter needs them. | The model is not forced to see the whole schema-bearing registry on every native turn. |
| Discovery | `src/mcp/server/mcpserver/tool_manager.py::list_tools()` returns the whole tool map; `src/mcp/server/mcpserver/server.py::list_tools()` serializes all tools with schemas. | `src/zcp/server.py::_select_tools(...)` filters by profile, groups, excludeGroups, and stages before returning the list. | Branch factor falls before planning begins. |
| Call discipline | `ToolManager.call_tool(...)` checks only that the name exists and then runs it. | `src/zcp/server.py::_tool_is_exposed(...)` plus `enforce_tool_visibility_on_call` keeps calls inside the active subset. | Filtered discovery does not widen back into a broad execution surface. |
| Result shape | `src/mcp/server/mcpserver/server.py::_handle_call_tool()` wraps outputs into `CallToolResult(content, structured_content)` and keeps those payloads prompt-visible. | `src/zcp/canonical_runtime.py::_build_result()` chooses `scalar` or `handle + summary` via `HandleStore`. | Later turns replay less payload. |
| Native model grammar | The public contract is schema-bearing JSON objects and content blocks. | `src/zcp/profiles/native.py::format_registry()` emits compact `TOOL @id alias(param:type) -> output_mode` lines. | Native planners can operate over compact signatures instead of full JSON Schema trees. |
| Long-running state | Tasks exist, but the generic tool surface still naturally gravitates toward prompt-visible `CallToolResult` loops. | `TaskManager`, `TaskExecutionContext`, progress notifications, and handle refs keep state durable and out of the prompt by default. | Repair loops and polling loops become smaller and less repetitive. |
Table 1. Principle-Level Comparison
| Token cost source | MCP default shape | ZCP countermeasure | Why it matters |
|---|---|---|---|
| Repeated tool-schema exposure | A broad `tools/list` returns full JSON Schema-bearing tool definitions. | Native discovery can return only the active profile/stage subset. | Fewer visible schemas means fewer prompt tokens and less planning entropy. |
| Schema as the planning surface | JSON Schema stays central from registration through transport. | JSON Schema is compiled only at the adapter edge from a selected registry view. | The runtime stops forcing the model to reason over the whole schema object graph. |
| Large result replay | Tool results commonly re-enter the next turn as content or structured content. | Large values become handles plus short summaries. | The next turn carries references instead of full artifacts. |
| Prompt-visible background state | Intermediate state tends to leak back into tool loops and explanations. | Tasks, handles, progress, and session state live in the runtime. | Long-running workflows stay smaller and more stable. |
| Discovery / execution mismatch | A model may list one surface and still wander to any registered tool. | Call visibility is checked against the active exposure policy. | The action space remains narrow after the first decision. |
8. Key Code Snippets
These snippets are the shortest path to the real argument. They compare the official MCP code path to the local ZCP code path without relying on the Excel benchmark implementation itself.
MCP tool registration is schema-first
modelcontextprotocol/python-sdk/src/mcp/server/mcpserver/tools/base.py
The official MCP server path converts Python function metadata into a Pydantic model and immediately serializes it to JSON Schema. That schema becomes the tool contract.
```python
class Tool(BaseModel):
    fn: Callable[..., Any] = Field(exclude=True)
    name: str = Field(description="Name of the tool")
    parameters: dict[str, Any] = Field(description="JSON schema for tool parameters")
    fn_metadata: FuncMetadata = Field(...)

    @classmethod
    def from_function(cls, fn: Callable[..., Any], ...):
        func_arg_metadata = func_metadata(fn, ...)
        parameters = func_arg_metadata.arg_model.model_json_schema(by_alias=True)
        return cls(
            fn=fn,
            name=func_name,
            parameters=parameters,
            fn_metadata=func_arg_metadata,
        )
```
MCP returns prompt-visible content objects
modelcontextprotocol/python-sdk/src/mcp/server/mcpserver/server.py
The default call path converts results into `CallToolResult(content, structured_content)`. This is correct for compatibility, but it keeps large results close to the prompt loop.
```python
async def _handle_call_tool(self, ctx, params) -> CallToolResult:
    result = await self.call_tool(params.name, params.arguments or {}, ctx)
    if isinstance(result, CallToolResult):
        return result
    if isinstance(result, tuple) and len(result) == 2:
        unstructured_content, structured_content = result
        return CallToolResult(
            content=list(unstructured_content),
            structured_content=structured_content,
        )
    return CallToolResult(content=list(result))
```
ZCP canonical contract carries runtime state explicitly
zero-context-protocol-python/src/zcp/canonical_protocol.py
ZCP does not make schema disappear. It makes schema one field inside a richer runtime contract that also tracks subsets, handles, defaults, and output modes.
```python
@dataclass
class ToolDefinition:
    tool_id: str
    alias: str
    description_short: str
    input_schema: dict[str, Any]
    output_schema: dict[str, Any] | None = None
    output_mode: Literal["handle", "scalar"] = "handle"
    handle_kind: str = "generic"
    defaults: dict[str, Any] = field(default_factory=dict)
    flags: frozenset[str] = field(default_factory=frozenset)
    metadata: dict[str, Any] = field(default_factory=dict)

@dataclass
class SessionState:
    session_id: str
    registry_hash: str = ""
    tool_subset: tuple[str, ...] = ()
    handles: dict[str, HandleRef] = field(default_factory=dict)
```
ZCP narrows discovery before the first turn
zero-context-protocol-python/src/zcp/server.py
Profile and stage filtering are runtime rules, not prompt conventions. The subset is enforced before the model plans.
```python
def _select_tools(app: FastZCP, params: dict[str, Any]) -> list[Any]:
    tools = app.tool_registry.subset().tools
    profile = _effective_tool_profile(app, params)
    include_groups = _normalize_filter_values(params.get("groups"))
    stages = _normalize_filter_values(params.get("stages"))
    if profile == app.semantic_workflow_profile:
        workflow_tools = [tool for tool in tools if app.semantic_group in _tool_groups(tool)]
        if workflow_tools:
            tools = workflow_tools
    if include_groups:
        tools = [tool for tool in tools if _tool_groups(tool) & include_groups]
    if stages:
        tools = [tool for tool in tools if _tool_stages(tool) & stages]
    return tools
```
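The same filtering rules can be demonstrated standalone with toy records. Field names mirror `_select_tools`, but everything below is illustrative rather than ZCP's data model.

```python
tools = [
    {"id": "wb.create", "groups": {"semantic-workflow"}, "stages": {"setup"}},
    {"id": "cell.write", "groups": {"primitive"}, "stages": {"edit"}},
    {"id": "report.build", "groups": {"semantic-workflow"}, "stages": {"edit"}},
]

def select(tools, profile=None, groups=None, stages=None):
    out = tools
    if profile == "semantic-workflow":
        narrowed = [t for t in out if "semantic-workflow" in t["groups"]]
        out = narrowed or out  # fall back if the profile matches nothing
    if groups:
        out = [t for t in out if t["groups"] & set(groups)]
    if stages:
        out = [t for t in out if t["stages"] & set(stages)]
    return out

ids = [t["id"] for t in select(tools, profile="semantic-workflow", stages=["edit"])]
assert ids == ["report.build"]
```

Two rules compose here: the profile shrinks the pool first, then stage filters cut it further, so the model starts planning over one tool instead of three.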
ZCP keeps JSON Schema at the adapter boundary
zero-context-protocol-python/src/zcp/adapters/openai.py
Strict JSON Schema is still available, but it is compiled from the current `RegistryView`, not treated as the permanent native planning surface.
```python
def compile_openai_tools(self, session: SessionState, *, tool_subset=None, strict_mode=True):
    subset_tuple = tuple(tool_subset or ())
    registry_view = self.registry.subset(list(subset_tuple) if subset_tuple else None, limit=self.tool_limit)
    session.registry_hash = registry_view.hash
    session.tool_subset = subset_tuple
    key = (registry_view.hash, strict_mode)  # cache key; its construction is abbreviated in this excerpt
    if key not in self._tool_cache:
        tools = self.compiler.compile_registry(registry_view)
        self._tool_cache[key] = tools
    return self._tool_cache[key]
```
ZCP can present a compact native registry grammar
zero-context-protocol-python/src/zcp/profiles/native.py
The native profile compresses each tool to `id + alias + compact param types + output mode`. This is the clearest expression of schema de-centering.
```python
def format_registry(tools: list[ToolDefinition]) -> str:
    entries = []
    for tool in tools:
        params = ",".join(
            f"{name}:{_compact_type(schema)}"
            for name, schema in tool.input_schema.get("properties", {}).items()
        )
        entries.append(f"TOOL @{tool.tool_id} {tool.alias}({params}) -> {tool.output_mode}")
    return "\n".join(entries)
```
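Under assumed type mappings (the real `_compact_type` may differ), the grammar is easy to reproduce end to end and to compare against the schema it replaces:

```python
import json

def _compact_type(schema: dict) -> str:
    # Assumed mapping from JSON Schema types to compact tokens.
    return {"integer": "int", "number": "num", "string": "str",
            "boolean": "bool", "array": "list", "object": "obj"}.get(
                schema.get("type", ""), "any")

tool = {
    "tool_id": "sheet.write",
    "alias": "write_range",
    "output_mode": "handle",
    "input_schema": {"type": "object", "properties": {
        "range": {"type": "string"}, "values": {"type": "array"}}},
}

params = ",".join(
    f"{n}:{_compact_type(s)}" for n, s in tool["input_schema"]["properties"].items()
)
line = f"TOOL @{tool['tool_id']} {tool['alias']}({params}) -> {tool['output_mode']}"
assert line == "TOOL @sheet.write write_range(range:str,values:list) -> handle"
assert len(line) < len(json.dumps(tool))  # one line vs. the full schema object
```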
ZCP compacts results into scalar or handle
zero-context-protocol-python/src/zcp/canonical_runtime.py
This is the second major token-saving mechanism after filtered discovery. Big results stop re-entering every subsequent turn.
```python
if tool.output_mode == "scalar" and (tool.inline_ok or is_scalar_value(value)):
    return CallResult(
        cid=request.cid,
        status="ok",
        scalar=value,
        summary=summary,
        meta=meta,
    )
handle = self.handle_store.create(
    kind=handle_kind,
    data=value,
    summary=summary,
    meta=meta,
)
return CallResult(
    cid=request.cid,
    status="ok",
    handle=handle,
    summary=handle.summary,
    meta=meta,
)
```
Figure 2. Where The Token Savings Come From
The token gain is causal: smaller registry subset, tighter calling discipline, compact result propagation, and runtime-held state.
- MCP schema-first surface: full tool list + full JSON Schema -> broad planning over many branches -> content / structured_content replay into later turns.
- ZCP canonical surface: profile-filtered subset + compact contract -> planning inside a constrained subset -> scalar inline, large values behind handles and task state -> smaller next-turn context and fewer repair loops.
Table 2. Overall Benchmark
| Path | Answer accuracy | Workbook accuracy | Tool accuracy | Avg total tokens | Avg turns |
|---|---|---|---|---|---|
| `zcp_client_to_native_zcp` | 100.0% | 97.3% | 100.0% | 8027.9 | 2.8 |
| `mcp_client_to_zcp_mcp_surface` | 97.3% | 91.9% | 73.0% | 30723.7 | 4.1 |
Table 3. Tier Breakdown
| Tier | What changed structurally | Native ZCP (avg tokens) | MCP surface (avg tokens) | Advantage |
|---|---|---|---|---|
| A | Little room for planning policy to help | 15979.4 | 17613.2 | 1.10x |
| B | Short chains collapse into semantic chain tools | 1826.6 | 29239.4 | 16.01x |
| C | Workflow tools remove long primitive plans | 2091.1 | 72113.9 | 34.49x |
| D | Autonomous planning gets the smallest search space | 2018.3 | 19375.7 | 9.60x |
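The Advantage column and the `3.83x` headline both follow directly from the averages in Tables 2 and 3, which a few lines can verify:

```python
# Per-tier averages from Table 3: {tier: (native_zcp_tokens, mcp_surface_tokens)}.
tiers = {
    "A": (15979.4, 17613.2),
    "B": (1826.6, 29239.4),
    "C": (2091.1, 72113.9),
    "D": (2018.3, 19375.7),
}
advantage = {t: round(mcp / zcp, 2) for t, (zcp, mcp) in tiers.items()}
assert advantage == {"A": 1.10, "B": 16.01, "C": 34.49, "D": 9.60}

# Overall headline from Table 2's avg total tokens.
assert round(30723.7 / 8027.9, 2) == 3.83
```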
9. Why The Tier Results Look Like This
Tier A. Small gain is expected
One-shot tool calls do not contain much planning waste. They are useful as a sanity check, but they should not be the headline for a runtime-efficiency claim.
Tier B. Semantic chains begin to matter
The first large jump appears when the model would otherwise need to plan across several tightly coupled primitive calls. Narrower discovery plus semantic chain tools reduce internal branching sharply.
Tier C. Workflow compression dominates
This tier proves the gain is not mostly wire-format trivia. The model is no longer planning every low-level mutation, so the savings become structural rather than incremental.
Tier D. Autonomous planning is the real stress test
Tier D is where broad surfaces typically explode into repair loops, repeated reads, and status churn. ZCP wins because the runtime constrains the search space and keeps state outside the prompt before those loops expand.
10. Limits And Scope
- The `3.83x` headline is a published result on the current Excel workflow benchmark, not a universal theorem for every domain or model.
- ZCP's largest gains depend on using the native runtime features that make schemas peripheral rather than central: profile-based discovery, handles, tasks, and semantic tools.
- This report argues that ZCP has a stronger architectural position for model execution. It does not argue that MCP becomes useless for ecosystem interoperability.
- The fairest formulation is therefore: MCP remains the compatibility contract; ZCP becomes the more efficient execution contract.
11. Conclusion
ZCP is stronger than MCP on planning-heavy workloads because it changes the model-facing execution contract, not because it changed the transport or rewrote the backend business logic.
The official MCP code path is schema-first and compatibility-first. The ZCP code path is canonical-runtime-first: schemas remain available, but they are compiled at the edge, while the native runtime is organized around subsets, handles, output modes, and task state.
That design directly explains the benchmark. Fewer tools are visible, less schema text is repeated, large payloads stop replaying into later turns, and long-running state stops leaking back into prompt-visible loops. The result is lower token use and lower planning entropy for the same backend logic.
Read next
Semantic Workflow Profile, Benchmark Methodology, Capability Matrix.