Context Engineering Studio
Six strategies, one task, streaming in parallel — context engineering made visible
6 strategies streaming in parallel
TTFT per strategy
Token count per panel
Auto-detected winner
01 The Problem
The phrase 'prompt engineering' has become synonymous with tricks and hacks — magic phrases, template libraries, cargo-culted patterns. The actual discipline is different: it's about what information goes into the context window, how it's structured, and how each element shapes the output distribution. I wanted to make that difference empirically visible — not as a claim, but as a live experiment you can run yourself.
02 The Approach
Six API calls fire in parallel when you hit Run. Each implements a different context strategy via a different combination of system prompt and user message construction. All six stream independently to their panels. TTFT and token count are tracked per panel. After all streams complete, the 'Full Stack' panel (which combines all strategies) is highlighted as the winner — because it almost always is.
03 Architecture Decisions
Parallel SSE streams from a single API route
The /api/stream route accepts a strategy name and constructs the appropriate system+user message combination. The frontend fires 6 simultaneous fetch calls (one per strategy) and routes each SSE stream to the correct panel via strategy-keyed state.
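The dispatch pattern described above can be sketched as follows. This is a minimal, hypothetical version: the strategy names, the `/api/stream` payload shape, and the SSE event format (`data: {...}` lines with a `type` field) are assumptions based on the description, not the app's actual code.

```typescript
// Assumed strategy list, one per panel.
const STRATEGIES = [
  "baseline", "persona", "grounding", "few-shot", "constraints", "full-stack",
] as const;
type Strategy = (typeof STRATEGIES)[number];

// Parse one SSE "data: {...}" line into a JSON payload, or null if it
// is a comment, a different field, or malformed JSON.
function parseSSEData(line: string): { type: string; text?: string } | null {
  if (!line.startsWith("data: ")) return null;
  try {
    return JSON.parse(line.slice(6));
  } catch {
    return null;
  }
}

// One fetch per strategy; deltas are routed to strategy-keyed state via
// the onDelta callback (in the real app, a React state setter).
async function streamStrategy(
  strategy: Strategy,
  onDelta: (strategy: Strategy, text: string) => void,
): Promise<void> {
  const res = await fetch("/api/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ strategy }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep any incomplete trailing line
    for (const line of lines) {
      const event = parseSSEData(line);
      if (event?.type === "content_block_delta" && event.text) {
        onDelta(strategy, event.text);
      }
    }
  }
}

// All six fire simultaneously; each stream updates its own panel:
// Promise.all(STRATEGIES.map((s) => streamStrategy(s, appendToPanel)));
```

The key design point is that the panels never block each other: each stream writes only to its own strategy key, so a slow strategy cannot delay the others.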
Strategy implementation as prompt construction
Each strategy is a different way of building the prompt, not a different model or temperature. Persona: 'You are a world-class expert.' Grounding: inject relevant facts before the task. Few-shot: prepend 2-3 examples in desired format. Constraints: explicit rules list. Full Stack: all of the above combined.
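A sketch of strategy-as-prompt-construction, following the list above: each strategy is a pure function of the task, varying only the system prompt and user message, never the model or sampling parameters. The prompt strings here are illustrative placeholders, not the app's real prompts.

```typescript
type Prompt = { system: string; user: string };

// Illustrative prompt fragments (placeholders, not the app's actual text).
const PERSONA = "You are a world-class expert.";
const GROUNDING = "Relevant facts:\n- (facts retrieved for this task)\n\n";
const FEW_SHOT = "Example 1:\nQ: (question)\nA: (answer in desired format)\n\n";
const CONSTRAINTS = "Rules:\n1. Be concise.\n2. State your assumptions.\n\n";

function buildPrompt(strategy: string, task: string): Prompt {
  switch (strategy) {
    case "baseline":
      return { system: "", user: task };
    case "persona":
      return { system: PERSONA, user: task };
    case "grounding":
      return { system: "", user: GROUNDING + task };
    case "few-shot":
      return { system: "", user: FEW_SHOT + task };
    case "constraints":
      return { system: "", user: CONSTRAINTS + task };
    case "full-stack": // all of the above combined
      return { system: PERSONA, user: GROUNDING + FEW_SHOT + CONSTRAINTS + task };
    default:
      throw new Error(`unknown strategy: ${strategy}`);
  }
}
```

Because every strategy reduces to string construction, adding a seventh strategy is one more switch case, and the comparison stays apples-to-apples: same model, same temperature, different context.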
TTFT measurement at the token boundary
Time-to-first-token is recorded when the first content_block_delta event arrives on the Anthropic SSE stream; the final done event carries the total token count. This yields empirical data on how much latency each strategy adds (almost none — the time is dominated by model generation, not prompt size).
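The bookkeeping can be sketched as a small per-panel tracker: stamp the clock when the request fires, record the elapsed time when the first content_block_delta arrives, and read the total token count from the final done event. The `now` parameter and the event shape are assumptions for illustration (injecting the clock also makes the logic testable).

```typescript
type Metrics = { ttftMs: number | null; tokens: number | null };

// Hypothetical per-panel tracker; `now` would be performance.now in the app.
function makeMetricsTracker(now: () => number) {
  const start = now(); // stamped when the request fires
  const metrics: Metrics = { ttftMs: null, tokens: null };
  return {
    metrics,
    onEvent(event: { type: string; tokens?: number }) {
      if (event.type === "content_block_delta" && metrics.ttftMs === null) {
        metrics.ttftMs = now() - start; // first token boundary only
      } else if (event.type === "done" && event.tokens !== undefined) {
        metrics.tokens = event.tokens; // total count from the done event
      }
    },
  };
}
```

Guarding on `ttftMs === null` means only the first delta sets TTFT; every later delta is generation time, which is exactly the quantity the comparison wants to exclude.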
04 Key Insight
The gap between baseline and full-stack scales with task complexity. For simple tasks (arithmetic, definitions), all strategies produce near-identical output — the baseline is fine. For complex tasks (architecture decisions, analysis, explanation), the gap is dramatic. The practical lesson: invest in context engineering proportionally to task difficulty.
05 Why It Matters
The most timely demo in the portfolio — context engineering is the hottest concept in production AI (March 2026). Directly demonstrates the discipline that separates AI demos from production systems. Relevant for every AI engineering role: Clio (legal document context), Cohere (agentic context management), Giga (enterprise AI pipelines).