I've written about the Model Context Protocol before. Today I shipped a live demo that makes the protocol itself visible — not just the AI using tools, but every JSON-RPC message, every tool call parameter, every raw result, every token cost. Here's what I built and why transparency was the whole point.
The problem with MCP demos
Every AI demo that uses tools looks the same from the outside: the user asks a question, the AI responds. Whether the model is hallucinating, calling a tool correctly, or chaining five tool calls to build a multi-step answer — it all looks identical in a chat interface.
This is a particular problem for the Model Context Protocol, which is increasingly central to enterprise AI architecture. The whole point of MCP is that it standardizes how AI applications discover and use external tools. But if you can't see the protocol messages, you can't evaluate whether someone actually understands it.
I wanted to build a demo that makes the protocol learnable and evaluable — not just impressive-looking.
The architecture: stream trace events alongside chat
The key technical challenge: how do you show protocol events in real time while the chat response is still generating?
I solved this with a line-delimited JSON event stream. The chat API route sends a ReadableStream where each line is either a trace event or a text chunk:
```
// Every line in the stream is one of these types:
{"type": "trace", "event": {"type": "initialize", "request": {...}, "response": {...}}}
{"type": "trace", "event": {"type": "tools_list", "tools": [...]}}
{"type": "trace", "event": {"type": "tool_call", "name": "calculate", "arguments": {...}}}
{"type": "trace", "event": {"type": "tool_result", "name": "calculate", "result": {...}}}
{"type": "trace", "event": {"type": "assistant_message", "inputTokens": 847, "outputTokens": 42}}
{"type": "text", "content": "The answer is "}
{"type": "text", "content": "175,500. "}
{"type": "done"}
```

The client reads each line, parses the JSON, and routes it: trace events go to the protocol panel, text chunks go to the chat display. This multiplexing lets both panels update simultaneously — the trace shows the tool call completing while the response text is still streaming in.
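A minimal sketch of that client-side routing, written as a pure helper so the buffering logic is visible (the type and callback names here are illustrative, not the demo's actual code):

```typescript
type StreamLine =
  | { type: "trace"; event: Record<string, unknown> }
  | { type: "text"; content: string }
  | { type: "done" };

// Split a chunk of the NDJSON stream into complete lines, parse each one,
// and route it to the right panel. Returns any trailing partial line.
function routeLines(
  buffer: string,
  onTrace: (event: Record<string, unknown>) => void,
  onText: (content: string) => void
): string {
  const lines = buffer.split("\n");
  const remainder = lines.pop() ?? ""; // last element may be an incomplete line
  for (const raw of lines) {
    if (!raw.trim()) continue;
    const line = JSON.parse(raw) as StreamLine;
    if (line.type === "trace") onTrace(line.event);
    else if (line.type === "text") onText(line.content);
    // "done" just marks the end of the stream; nothing to route
  }
  return remainder;
}
```

The caller keeps the returned remainder and prepends it to the next decoded chunk, so a JSON line split across network chunks is never parsed half-finished.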
What the protocol trace shows
The trace panel reveals the full MCP session lifecycle:
Initialize. The client sends a handshake announcing its protocolVersion (2024-11-05) and capabilities. The server responds with its serverInfo and what capabilities it supports. Expandable in the UI to show the full JSON-RPC request and response bodies.
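In JSON-RPC terms, that handshake looks roughly like this (the ids, client/server names, and capability contents here are illustrative; the message shape follows the MCP specification):

```json
{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "mcp-demo-client", "version": "1.0.0"}}}
{"jsonrpc": "2.0", "id": 1, "result": {"protocolVersion": "2024-11-05", "capabilities": {"tools": {}}, "serverInfo": {"name": "mcp-demo-server", "version": "1.0.0"}}}
```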
tools/list. The client requests all available tools. The response includes the complete tool definitions — name, description, JSON Schema input schema. The demo displays these as colored badges (each tool has a distinct color), expandable to show the full description and schema.
tools/call. When Claude decides to use a tool, the UI shows the request (tool name, parameters) and then the result (raw tool output, execution time, whether it errored). The call and result appear as separate events so you can see the timing — how long did the tool take? What did it return before the AI processed it?
Token usage. After each LLM call, the trace shows input and output token counts. This makes the cost of multi-step tool chains visible: a question that triggers three tool calls has three assistant messages, each with its own token cost.
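Totaling the cost of a chain is then just a fold over those per-call counts; a sketch, assuming an event shape that mirrors the `assistant_message` trace events:

```typescript
interface AssistantUsage {
  type: "assistant_message";
  inputTokens: number;
  outputTokens: number;
}

// Sum token usage across every LLM call in a multi-step tool chain.
function totalUsage(events: AssistantUsage[]): { input: number; output: number } {
  return events.reduce(
    (acc, e) => ({ input: acc.input + e.inputTokens, output: acc.output + e.outputTokens }),
    { input: 0, output: 0 }
  );
}
```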
Tool description quality is the whole game
Building this demo surfaced something I keep relearning: the quality of tool descriptions determines the quality of agentic behavior more than almost anything else.
For the calculate tool, the first draft description was: “evaluates mathematical expressions.” Claude would call it even for vague questions like “roughly how much is that?” Those questions call for estimation; the tool is built for exact arithmetic.
The final description: “evaluates mathematical expressions; use when exact arithmetic is needed rather than approximate reasoning. Do NOT use for estimation or rough comparisons.” With that addition, Claude correctly distinguishes between “what is 149,000 + 202,000 divided by 2?” (tool call) and “is that a lot of money?” (direct answer).
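A tool definition carrying that description might look like the following (the schema shape follows the MCP `tools/list` format; the exact fields in the demo may differ):

```typescript
// Tool definition as served in the tools/list response.
const calculateTool = {
  name: "calculate",
  description:
    "Evaluates mathematical expressions; use when exact arithmetic is needed " +
    "rather than approximate reasoning. Do NOT use for estimation or rough comparisons.",
  inputSchema: {
    type: "object",
    properties: {
      expression: {
        type: "string",
        description: "Arithmetic expression, e.g. '(149000 + 202000) / 2'",
      },
    },
    required: ["expression"],
  },
};
```

The behavioral guidance (“Do NOT use for estimation”) lives in the description, not the schema: the schema constrains what a call looks like, while the description constrains when a call happens.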
For the search_knowledge_base tool, the description needed to specify when to use it: “Use when questions are asked about Harrison, his work, or his experience.” Without that scoping, Claude would sometimes answer questions about Harrison from training data rather than calling the tool — producing generic answers instead of specific ones.
The tool description is the interface between human intent and model behavior. It's worth more time than most engineers give it.
The five tools
The demo ships with five tools chosen to demonstrate different tool categories:
- get_current_datetime — Real-time data that Claude can't know from training. Tests that the model reaches for the tool instead of guessing or making up a date.
- search_knowledge_base — Retrieval from a curated knowledge base about my background, projects, and research. Tests selective retrieval with category filtering.
- calculate — Exact arithmetic. Tests precision-routing behavior — the model should use the tool for math it could almost get right on its own, not just for complex calculations.
- analyze_text — Heuristic sentiment and tone analysis. Tests tool use for computation that's better framed as a deterministic analysis than a language model judgment.
- get_protocol_info — Meta: the tool returns reference information about MCP itself. Asking the demo “what is MCP?” triggers this tool, which makes the protocol explain itself from within the demo.
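On the server side, dispatching a `tools/call` to the right handler can be as simple as a name-to-function map. A deliberately naive sketch (these handler bodies are assumptions, not the demo's code, and a production calculate tool should use a real expression parser rather than this whitelist-and-eval shortcut):

```typescript
type ToolHandler = (args: Record<string, unknown>) => string;

// Map each tool name from tools/list to its implementation.
const handlers: Record<string, ToolHandler> = {
  get_current_datetime: () => new Date().toUTCString(),
  calculate: (args) => {
    const expr = String(args.expression ?? "");
    // Only digits, whitespace, and basic arithmetic characters are allowed
    // before evaluation, so arbitrary code can never reach the evaluator.
    if (!/^[\d\s+\-*/().]+$/.test(expr)) throw new Error("unsupported expression");
    return String(Function(`"use strict"; return (${expr});`)());
  },
};
```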
Multi-step chaining works naturally
The agentic loop handles arbitrary tool chains without pre-programming the sequence. A question like “What time is it right now, and what's the midpoint between Harrison's two target salaries?” produces:
```
1. tools/call: get_current_datetime {format: "full"}
   → "Wednesday, February 25, 2026 at 07:42 UTC"
2. tools/call: search_knowledge_base {query: "target salary", category: "job_search"}
   → "Clio: $149-202K CAD..."
3. tools/call: calculate {expression: "(149000 + 202000) / 2"}
   → "175,500"
4. assistant: "It's 07:42 UTC on Wednesday...the midpoint between $149K and $202K is $175,500."
```

Three tools, one question, correct chain, no pre-programming. The loop simply continues until Claude decides it has enough information for a final answer.
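The loop itself reduces to a few lines. A sketch, with the model and tool-executor interfaces stubbed out as plain functions (the real demo calls the Claude API and the MCP server here):

```typescript
interface ToolCall { name: string; arguments: Record<string, unknown>; }
interface ModelTurn { text: string; toolCalls: ToolCall[]; }
type Model = (history: string[]) => ModelTurn;
type Executor = (call: ToolCall) => string;

// Keep calling the model and executing any requested tools until the model
// stops asking for tools and produces a final answer.
function agentLoop(model: Model, execute: Executor, question: string): string {
  const history = [question];
  for (let step = 0; step < 10; step++) { // hard cap on chain length
    const turn = model(history);
    if (turn.toolCalls.length === 0) return turn.text;
    for (const call of turn.toolCalls) {
      history.push(`${call.name} -> ${execute(call)}`);
    }
  }
  throw new Error("tool chain exceeded step limit");
}
```

The step cap is the one piece of ceremony worth keeping: without it, a model that keeps requesting tools would loop forever.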
Why this demo matters for enterprise AI
Clio's Senior Developer, Enterprise AI job description lists MCP server configuration as an explicit requirement. Most candidates will talk about MCP. Fewer will have built an MCP server. Almost none will have built a demo that makes the protocol itself the feature.
But beyond the portfolio use case: the architecture of this demo — streaming multiplexed event types, tool definition quality, agentic loop design — is the actual architecture of production MCP systems. Legal document processing, enterprise knowledge retrieval, compliance checking — these systems all use the same patterns at larger scale. Making them visible in a demo is making them learnable.
Full implementation at github.com/matua-agent/mcp-server-demo. Live demo at mcp-server-demo-nu.vercel.app.