Live

AI Tool Use Demo

The agentic loop made visible — tool calls, results, and synthesis in one UI

Built in one overnight session
Next.js · Anthropic Claude · Tool Use · Agents · TypeScript

5 real tools

Multi-turn agentic loop

Full step visibility

JSON schema tool definitions

01 · The Problem

Function calling / tool use is the foundational pattern in production AI agents — but most demos either hide it completely or show it as a static diagram. I wanted to build a chat interface where you can watch every step: which tools the AI decides to call, what inputs it sends, what the tools return, and how the model synthesizes it all into a final answer.

02 · The Approach

The API route implements a real multi-turn agentic loop: initial Claude call → if tool_use blocks detected, execute each tool server-side → return tool results to Claude → repeat until stop_reason is 'end_turn'. The entire trace (tool calls, tool results, final response) is returned to the frontend and rendered as a step-by-step timeline. Tool calls and results are collapsible so you can inspect exact inputs/outputs.

03 · Architecture Decisions

Real agentic loop — not a single call

The server loop runs up to 5 iterations. Each iteration: send messages to Claude, check for tool_use blocks, execute tools, append results as tool_result blocks, repeat. This handles multi-step reasoning naturally — the model can call get_weather, then compare results, then call get_exchange_rate to convert costs.
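The loop described above can be sketched as follows. This is a simplified stand-in, not the demo's actual route code: the block shapes loosely mirror the Anthropic Messages API, and `callModel` / `executeTool` are injected placeholders for the real SDK call and server-side tool executors.

```typescript
// Minimal sketch of the multi-turn agentic loop. ContentBlock and
// ModelResponse loosely mirror Anthropic Messages API shapes; callModel
// and executeTool are assumed stand-ins for the SDK call and tool code.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type ModelResponse = {
  stop_reason: "end_turn" | "tool_use";
  content: ContentBlock[];
};
type Message = { role: "user" | "assistant"; content: string | unknown[] };

type CallModel = (messages: Message[]) => Promise<ModelResponse>;
type ToolExecutor = (name: string, input: unknown) => Promise<string>;

async function runAgentLoop(
  callModel: CallModel,
  executeTool: ToolExecutor,
  userPrompt: string,
  maxIterations = 5,
): Promise<ContentBlock[]> {
  const messages: Message[] = [{ role: "user", content: userPrompt }];
  const trace: ContentBlock[] = []; // every step, for the timeline UI

  for (let i = 0; i < maxIterations; i++) {
    const response = await callModel(messages);
    trace.push(...response.content);
    messages.push({ role: "assistant", content: response.content });

    // stop_reason 'end_turn' means the model is done; no tools to run.
    if (response.stop_reason !== "tool_use") break;

    // Execute every tool_use block server-side, then feed the results
    // back as tool_result blocks in a user message.
    const toolUses = response.content.filter(
      (b): b is Extract<ContentBlock, { type: "tool_use" }> =>
        b.type === "tool_use",
    );
    const results = await Promise.all(
      toolUses.map(async (b) => ({
        type: "tool_result",
        tool_use_id: b.id,
        content: await executeTool(b.name, b.input),
      })),
    );
    messages.push({ role: "user", content: results });
  }
  return trace;
}
```

Injecting `callModel` keeps the loop testable without an API key; in the real route it would wrap the Anthropic SDK's messages call.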

Tool definitions as JSON schema

Each tool has a name, description, and input_schema (JSON Schema format). The description is crucial — it tells the model not just what the tool does, but when to use it. Vague descriptions lead to wrong tool selection; specific descriptions with examples lead to reliable behavior.
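An illustrative definition in that format, using the `get_weather` tool mentioned above (the exact fields and example strings here are assumptions, not copied from the demo):

```typescript
// Illustrative tool definition: name, description, and a JSON Schema
// for the inputs. Note the description covers both what the tool does
// and when the model should choose it.
const getWeatherTool = {
  name: "get_weather",
  description:
    "Get the current weather for any city. Use this for any " +
    "weather-related question, e.g. 'Is it raining in Tokyo?'",
  input_schema: {
    type: "object",
    properties: {
      city: { type: "string", description: "City name, e.g. 'Paris'" },
      units: { type: "string", enum: ["celsius", "fahrenheit"] },
    },
    required: ["city"],
  },
} as const;
```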

Visible trace UI

Tool calls render as color-coded blocks with the tool name and inputs. Tool results render as expandable key-value panels. The final response appears after all tool steps. This makes the AI's reasoning process legible — you can see exactly why it called which tool and what it got back.
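One way the trace could be typed for the frontend, with a small formatter for the timeline labels (names like `TraceStep` and `stepLabel` are hypothetical, chosen for this sketch):

```typescript
// A possible shape for the trace the API route returns, so the frontend
// can render each step as its own timeline block. Names are illustrative.
type TraceStep =
  | { kind: "tool_call"; tool: string; input: Record<string, unknown> }
  | { kind: "tool_result"; tool: string; output: unknown }
  | { kind: "final"; text: string };

// Renders one compact label per step; the real UI would render
// collapsible panels instead of strings.
function stepLabel(step: TraceStep): string {
  switch (step.kind) {
    case "tool_call":
      return `→ ${step.tool}(${JSON.stringify(step.input)})`;
    case "tool_result":
      return `← ${step.tool}: ${JSON.stringify(step.output)}`;
    case "final":
      return step.text;
  }
}
```

A discriminated union like this lets the timeline component switch on `kind` exhaustively, so adding a new step type becomes a compile-time error until every renderer handles it.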

04 · Key Insight

Tool description quality determines reliability. The model uses descriptions (not just schemas) to decide which tool to call. 'Get weather' is too vague — 'Get the current weather for any city; use this for any weather-related question' produces reliable tool selection. This is the prompt engineering layer that most demos skip.

05 · Why It Matters

Demonstrates the core pattern in every production AI agent: tool definitions → model reasoning → tool execution → synthesis. Directly maps to Clio's enterprise AI (calling legal databases, billing systems, document stores). The same loop powers AI assistants, customer service bots, coding agents, and research tools.