
Building MCP Servers: How I Gave My AI Agent Real Tools

The Model Context Protocol is USB for AI capabilities. Here's what I learned building six MCP servers in production — tool description design, access boundaries, and why structured error messages matter.

mcp · ai agents · engineering

Over the past few months I've built several MCP servers and wired them into a production AI agent that I use every day. Here's what I learned — and why the Model Context Protocol is the most important piece of infrastructure in the current AI stack.

What MCP actually is

The Model Context Protocol is Anthropic's answer to a simple question: how do you give an LLM access to tools and data without writing a custom integration for every combination of model and capability?

Before MCP, every AI agent framework invented its own tool-calling convention. LangChain had one way, LlamaIndex had another, raw OpenAI function calling had a third. If you wanted to share a “GitHub search” tool between two different agents or two different models, you were rewriting the glue code by hand.

MCP defines a standard: a server exposes resources (data), tools (actions), and prompts (templated interactions) over a JSON-RPC connection. Any MCP-compatible client can connect. The model doesn't need to know how GitHub works — it just calls the tool, and the server handles the rest.
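On the wire, a tool invocation is a plain JSON-RPC request. Here's a sketch of what the client sends — the `tools/call` method name is from the MCP spec, while the tool name and arguments are made up for illustration:

```typescript
// JSON-RPC 2.0 request for an MCP tool call.
// "tools/call" is the method name defined by the MCP spec;
// the tool name and arguments below are illustrative only.
const toolCallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "github_search",
    arguments: { query: "language:typescript mcp server" },
  },
};

console.log(JSON.stringify(toolCallRequest, null, 2));
```

The server replies with a result on the same connection; neither side needs to know anything about the other beyond this envelope.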

Think of it like USB. Before USB, every peripheral needed its own connector. After USB, you plug in once and the protocol handles device negotiation. MCP is USB for AI capabilities.

The production setup

My AI agent (Matua) runs on a Hetzner VPS, connected to my infrastructure via Tailscale. It talks to Claude through the Anthropic API. The MCP servers I've built extend what it can do:

  • Filesystem MCP — read/write access to the workspace, with path restrictions to prevent the agent from wandering where it shouldn't
  • GitHub MCP — create branches, read file trees, check CI status, merge PRs
  • Notion MCP — query task boards, update statuses, create new tasks from natural language
  • Browser MCP — Playwright-based browser control for anything that requires a real browser session
  • Vercel MCP — deployment status, environment variable management, deployment triggers
  • Next.js DevTools MCP — live runtime errors, route structure, server logs from the running dev server

When the agent decides it needs to check CI status on a PR, it doesn't know how GitHub works — it calls github.get_pull_request_status and gets back a structured response. The implementation detail is entirely in the MCP server. The agent just uses capabilities.
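To make “structured response” concrete, here's a hypothetical sketch of what such a result might look like — this is not the actual GitHub MCP server's schema, just an illustration of the shape:

```typescript
// Hypothetical result shape for a get_pull_request_status call.
// Field names and values are illustrative, not the real server's schema.
interface PullRequestStatus {
  state: "open" | "closed" | "merged";
  checks: { name: string; conclusion: "success" | "failure" | "pending" }[];
  mergeable: boolean;
}

const example: PullRequestStatus = {
  state: "open",
  checks: [
    { name: "ci/build", conclusion: "success" },
    { name: "ci/test", conclusion: "pending" },
  ],
  mergeable: false,
};
```

The agent reasons over fields like these; it never sees a GitHub URL or an HTTP status line.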

Building your first MCP server

The official TypeScript SDK makes this straightforward. Here's a minimal server that exposes a Notion task board as a queryable tool:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Database ID and API key both come from the environment.
const DB_ID = process.env.NOTION_DB_ID;

const server = new McpServer({
  name: "notion-tasks",
  version: "1.0.0",
});

server.tool(
  "query_tasks",
  "Query tasks from the Notion board by status",
  {
    status: z.enum(["To Do", "In Progress", "Done"]),
    limit: z.number().optional().default(10),
  },
  async ({ status, limit }) => {
    const response = await fetch(`https://api.notion.com/v1/databases/${DB_ID}/query`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.NOTION_API_KEY}`,
        "Notion-Version": "2022-06-28",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        filter: { property: "Status", select: { equals: status } },
        page_size: limit,
      }),
    });

    const data = await response.json();
    // The Notion response isn't typed here; a real server would define
    // an interface for the page shape instead of using `any`.
    const tasks = data.results.map((r: any) => ({
      id: r.id,
      title: r.properties.Task.title[0]?.plain_text,
      priority: r.properties.Priority?.select?.name,
    }));

    return {
      content: [{ type: "text", text: JSON.stringify(tasks, null, 2) }],
    };
  }
);

const transport = new StdioServerTransport();
await server.connect(transport);

That's a complete, working MCP server. When connected to Claude, the model can call query_tasks with the appropriate status and get back structured task data. The model decides when to call it; the server handles the implementation.
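To use it from Claude Desktop, you register the server in the client's config. A sketch, assuming the server is compiled to `build/index.js` and using the standard `claude_desktop_config.json` shape (paths and values are placeholders):

```json
{
  "mcpServers": {
    "notion-tasks": {
      "command": "node",
      "args": ["/path/to/notion-tasks/build/index.js"],
      "env": {
        "NOTION_API_KEY": "secret_xxx",
        "NOTION_DB_ID": "your-database-id"
      }
    }
  }
}
```

The client launches the process and speaks JSON-RPC to it over stdio, which is what the `StdioServerTransport` at the bottom of the server is for.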

Three things I learned the hard way

1. Tool descriptions are load-bearing

The tool description string is not documentation for humans — it's the model's only guidance on when and how to use the tool. Bad descriptions lead to the model either never calling the tool (doesn't know it applies) or calling it incorrectly (wrong parameters, wrong context).

I rewrote my read_file tool description three times before it started being used correctly. The original was: “Read a file from the filesystem.” The final version: “Read the complete contents of a file at the given path. Use this before editing any file to understand its current state. Supports text files; binary files return an error.” The addition of the use-case guidance (“before editing any file”) changed the behavior significantly.

2. Resource boundaries matter for trust

MCP servers can be given very broad or very narrow access. My filesystem server originally had write access to the entire VPS. After an agent accidentally wrote a temporary file to /etc/ (a path it shouldn't have been near), I added a workspace root restriction — every path is validated against an allowlist before any operation.

The principle I now use: grant the minimum access surface that lets the agent accomplish its task. You can always add more. You can't un-write a file.
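A minimal sketch of that allowlist check, assuming a single workspace root (the root path here is hypothetical):

```typescript
import path from "node:path";

// Hypothetical workspace root; in the real server this comes from config.
const WORKSPACE_ROOT = "/home/agent/workspace";

// Resolve a requested path and reject anything that escapes the root.
function resolveWithinWorkspace(requested: string): string {
  const resolved = path.resolve(WORKSPACE_ROOT, requested);
  // path.relative yields a ".."-prefixed path when `resolved` is outside the root.
  const rel = path.relative(WORKSPACE_ROOT, resolved);
  if (rel.startsWith("..") || path.isAbsolute(rel)) {
    throw new Error(`Path escapes workspace root: ${requested}`);
  }
  return resolved;
}
```

Every filesystem tool calls this before touching anything, so a path like `../../etc/passwd` fails fast with an error the model can read.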

3. Structured errors beat silent failures

When an MCP tool fails, the model receives the error as part of the tool response. If you return an opaque error like “Request failed”, the model will retry or hallucinate around it. If you return “GitHub API returned 403: this token lacks the repo scope for private repositories”, the model can tell the user exactly what went wrong and suggest how to fix it.

Good MCP tool error messages should answer: what happened, why it happened, and what to do next. This is the same principle as good API error design — except the consumer is a language model, not a developer.
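A sketch of a helper that enforces that shape, assuming the SDK's `{ content, isError }` tool-result convention — the three-part format is this post's convention, not an SDK requirement:

```typescript
// Build a tool error the model can act on: what happened, why, what to do next.
function toolError(what: string, why: string, next: string) {
  return {
    isError: true,
    content: [
      { type: "text" as const, text: `${what}\nWhy: ${why}\nNext: ${next}` },
    ],
  };
}

const err = toolError(
  "GitHub API returned 403",
  "the token lacks the repo scope for private repositories",
  "regenerate the token with repo scope and update the server's environment",
);
console.log(err.content[0].text);
```

Returning this instead of throwing keeps the failure inside the tool response, where the model can read it and relay it to the user.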

The ecosystem today

The MCP server ecosystem is growing fast. Anthropic maintains reference implementations for Brave Search, filesystem access, Git, and a handful of other common integrations. The community has built hundreds more. Claude Desktop, Cursor, and most major AI coding tools now support MCP natively.

What's missing is tooling for enterprise deployment patterns — authentication, access controls at the server level, audit logging, and multi-tenant server architectures. If you're building for production use in a company environment, you're still rolling your own on much of this.

That will change. The protocol is solid. The infrastructure around it is catching up.

Why it matters

The thing that changed for me when I started using MCP properly: my agent went from a capable chat interface to something that could actually do autonomous work. Not because the model got smarter (it didn't), but because the surface area of actions it could take expanded dramatically.

An agent that can read files, query databases, call APIs, and drive a browser is categorically more capable than one that can only generate text. MCP is the mechanism that makes the first kind of agent composable and maintainable — instead of bespoke.

If you're building AI-powered products in 2026 and haven't looked at MCP yet, it's worth an afternoon. Build a server for a tool you use every day, connect it to Claude, and watch what happens when the model can actually reach out and touch your systems.