
Legal Matter AI Workflow

Matter in, intelligence out — a 3-stage AI workflow that mirrors Clio's product thesis

Built in one overnight session
Next.js · Anthropic Claude · Streaming · Legal Tech · Multi-stage Pipeline · TypeScript

3-stage pipeline

5 parallel API calls

Structured JSON extraction

Real-time SSE streaming

01 · The Problem

Legal professionals generate enormous amounts of unstructured text — matter descriptions, activity logs, correspondence, case notes. Converting that into the structured, formatted outputs lawyers actually need (client updates, billing narratives, risk summaries) is time-consuming and pattern-driven. It's exactly the kind of work AI handles well — if the pipeline is designed correctly.

02 · The Approach

Built a 3-stage pipeline with a clear separation of concerns: parse first (structured extraction via JSON schema), then analyze (strategic reasoning via streaming), then generate (multiple specialized outputs running in parallel). Each stage has its own API route, its own Claude prompt, and its own streaming or structured-output contract. The client shows live progress through all three stages.
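The three contracts can be sketched as types. This is an illustrative sketch, not the project's actual code — the type names, the `OUTPUT_KINDS` constant, and the stage signatures are my assumptions based on the description above.

```typescript
// Illustrative type sketch of the 3-stage contract (names are assumptions).

export interface MatterData {
  matterType: string;
  client: string;
  opposingParty: string;
  keyFacts: string[];
  legalIssues: string[];
  urgency: "low" | "medium" | "high";
  stage: string;
  nextSteps: string[];
}

// Stage 1: structured extraction — raw text in, typed object out.
export type ParseStage = (description: string) => Promise<MatterData>;

// Stage 2: streaming analysis — typed object in, token deltas out.
export type AnalyzeStage = (matter: MatterData) => AsyncIterable<string>;

// Stage 3: parallel generation — one call per specialized output.
export const OUTPUT_KINDS = [
  "clientUpdate",
  "billingNarrative",
  "riskAssessment",
] as const;
export type OutputKind = (typeof OUTPUT_KINDS)[number];
export type GenerateStage = (
  matter: MatterData,
  kind: OutputKind,
) => AsyncIterable<string>;
```

Keeping each stage behind its own typed signature is what lets each API route evolve its prompt and streaming contract independently.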

03 · Architecture Decisions

Structured extraction in Stage 1

The parse route uses Claude Haiku with a tight JSON schema prompt — no tool use, no structured output API, just a system prompt that specifies the exact response format and 'respond ONLY with valid JSON.' The result is a typed MatterData object: matterType, client, opposingParty, keyFacts[], legalIssues[], urgency, stage, nextSteps[]. This structured object then powers all subsequent stages.
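A minimal sketch of the parse side of that contract, assuming a helper that strips markdown fences and validates the keys before trusting the model's reply (the function name and error wording are mine; the fields come from the description above):

```typescript
// Sketch: turn Claude's text reply into a typed MatterData, defensively.

export interface MatterData {
  matterType: string;
  client: string;
  opposingParty: string;
  keyFacts: string[];
  legalIssues: string[];
  urgency: string;
  stage: string;
  nextSteps: string[];
}

const REQUIRED_KEYS: (keyof MatterData)[] = [
  "matterType", "client", "opposingParty", "keyFacts",
  "legalIssues", "urgency", "stage", "nextSteps",
];

export function parseMatterJson(raw: string): MatterData {
  // Models sometimes wrap JSON in markdown fences despite "ONLY valid JSON".
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/\s*```$/, "");
  const data = JSON.parse(cleaned);
  for (const key of REQUIRED_KEYS) {
    if (!(key in data)) throw new Error(`Missing key in extraction: ${key}`);
  }
  return data as MatterData;
}
```

Validating at the boundary means every downstream stage can assume a well-formed object rather than re-checking the model's output.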

SSE streaming in Stage 2

The analysis route uses Anthropic's streaming API with server-sent events. The client receives deltas as they arrive and appends them to the UI in real time. The stage prompt instructs Claude to produce exactly 3 investigation angles with a specific markdown structure, so the streaming output is consistently parseable and visually organized.
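One way to frame those deltas as server-sent events is a route handler that returns a `ReadableStream`. This is a sketch of the pattern, not the project's code — `sseFrame`, the `{ delta }` event shape, and the `[DONE]` sentinel are my naming assumptions.

```typescript
// Sketch: wrap an async stream of text deltas in SSE framing.

// Each SSE message is a "data:" line followed by a blank line.
export function sseFrame(delta: string): string {
  return `data: ${JSON.stringify({ delta })}\n\n`;
}

// Assumed route-handler shape: pipe model deltas to the client as they arrive.
export function toSseResponse(deltas: AsyncIterable<string>): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const delta of deltas) {
        controller.enqueue(encoder.encode(sseFrame(delta)));
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```

On the client, an `EventSource` or fetch-reader loop appends each `delta` to the UI as it arrives.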

Parallel output generation in Stage 3

Three API calls run simultaneously via Promise.all: client update (professional, first-person plural), billing narrative (activity-based, invoice-ready), and risk assessment (internal partner review). Each uses a different system prompt, different token budget, and different tone. The UI renders each output as it arrives, so all three stream in parallel rather than sequentially.
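The fan-out can be sketched with a per-output configuration table. The system prompts and token budgets below are placeholders standing in for the project's actual values, and `generateAll` abstracts over the real model call.

```typescript
// Sketch: one config per specialized output, fanned out with Promise.all.

export const OUTPUT_CONFIGS = {
  clientUpdate: {
    system: "You write professional client updates in first-person plural.",
    maxTokens: 600, // placeholder budget
  },
  billingNarrative: {
    system: "You write activity-based, invoice-ready billing narratives.",
    maxTokens: 400, // placeholder budget
  },
  riskAssessment: {
    system: "You write internal risk assessments for partner review.",
    maxTokens: 800, // placeholder budget
  },
} as const;

export type OutputKind = keyof typeof OUTPUT_CONFIGS;

// `generate` stands in for the model API call; all three run concurrently.
export async function generateAll(
  generate: (kind: OutputKind) => Promise<string>,
): Promise<Record<OutputKind, string>> {
  const kinds = Object.keys(OUTPUT_CONFIGS) as OutputKind[];
  const results = await Promise.all(kinds.map((kind) => generate(kind)));
  return Object.fromEntries(
    kinds.map((k, i) => [k, results[i]]),
  ) as Record<OutputKind, string>;
}
```

Because each output has its own config entry, adding a fourth output type is a table edit rather than a pipeline change.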

Matter-aware prompting

Each downstream stage receives the structured MatterData object, not the raw description. This forces precision — the billing narrative knows the legal issues array, the risk assessment knows the urgency level and stage. Prompts are grounded in structured facts, not a blob of text. This is the 'matter-aware' pattern that reduces hallucination and improves output format compliance.
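As a sketch, grounding a downstream prompt in the structured object rather than the raw text might look like this — the prompt wording and the trimmed `MatterData` shape are assumptions drawn from the description above:

```typescript
// Sketch: build a billing-narrative prompt from structured fields, not raw text.

interface MatterData {
  matterType: string;
  client: string;
  legalIssues: string[];
  urgency: string;
  stage: string;
}

export function buildBillingPrompt(matter: MatterData): string {
  // Every claim in the prompt is pinned to an extracted field,
  // which narrows what the model can plausibly invent.
  return [
    `Matter type: ${matter.matterType}`,
    `Client: ${matter.client}`,
    `Stage: ${matter.stage} (urgency: ${matter.urgency})`,
    `Legal issues: ${matter.legalIssues.join("; ")}`,
    "Write an invoice-ready billing narrative covering only the issues listed above.",
  ].join("\n");
}
```

The same pattern applies to the other two outputs: each prompt interpolates only the fields relevant to its audience.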

04 · Key Insight

The parallel output generation in Stage 3 is the key architectural decision. Running three specialized calls simultaneously rather than one general call sequentially produces better results at lower latency. Each output gets a system prompt written specifically for its format and audience — the client update and the billing narrative are completely different communication products and require different language. One prompt cannot optimize for both.

05 · Why It Matters

Directly mirrors Clio's AI product thesis: unstructured input → structured extraction → multiple derived workflow outputs. Their court document → calendar event feature uses the same pattern (extract → validate → write). Their matter activity → client update feature is literally one of the three Stage 3 outputs in this demo. This isn't a coincidence — it's the pattern that makes legal AI practical.