Prompt Lab
Four prompting techniques, one question, running simultaneously — see the difference live
4 techniques in parallel
Live TTFT measurement
Temperature control
Token counting per panel
01 · The Problem
Prompt engineering advice is almost always written in the abstract. 'Add chain-of-thought for better reasoning.' But how much better? For which tasks? I wanted an interactive tool that makes the comparison concrete — run any question through all four techniques at once and watch the outputs appear side by side.
02 · The Approach
Four parallel SSE connections, each carrying a different prompt variant to Claude Haiku. The frontend initiates all four streams simultaneously. Time to first token (TTFT) is measured with performance.now() from request initiation to first token arrival.
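A minimal sketch of the parallel-streaming pattern described above. The `/api/${technique}` route paths and the request payload shape are assumptions for illustration; the TTFT stopwatch uses performance.now() as the writeup states.

```typescript
// Sketch: four parallel streaming requests, one per technique, each timing
// its own TTFT. Endpoint paths and payload shape are assumed.
type Technique = "zero-shot" | "few-shot" | "chain-of-thought" | "system-prompt";

const techniques: Technique[] = [
  "zero-shot",
  "few-shot",
  "chain-of-thought",
  "system-prompt",
];

async function streamTechnique(technique: Technique, question: string) {
  const start = performance.now();
  let ttft: number | null = null;

  // Each technique has its own API route (assumed path).
  const res = await fetch(`/api/${technique}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let text = "";

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // First chunk arrival marks TTFT.
    if (ttft === null) ttft = performance.now() - start;
    text += decoder.decode(value, { stream: true });
  }
  return { technique, ttft, text };
}

// Usage (in the browser): fire all four simultaneously so the race is real.
// const results = await Promise.all(
//   techniques.map((t) => streamTechnique(t, "What causes tides?"))
// );
```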
03 · Architecture Decisions
Parallel streaming with independent SSE connections
Each technique variant has its own API route. Four requests go out simultaneously so the race between techniques is visible and real.
Technique-specific prompt construction
Zero-shot: just the question. Few-shot: three example Q&A pairs + the question. Chain-of-thought: question + 'Think step by step.' System-prompt: a full system prompt with role, format, and constraint specifications.
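The four variants described above can be sketched as a single prompt builder. The example Q&A pairs and the system-prompt wording are illustrative assumptions, not the app's actual strings.

```typescript
// Sketch of technique-specific prompt construction. Example pairs and the
// system prompt text are assumptions for illustration.
interface PromptSpec {
  system?: string;
  user: string;
}

const FEW_SHOT_EXAMPLES = [
  { q: "What is the capital of France?", a: "Paris." },
  { q: "Who wrote Hamlet?", a: "William Shakespeare." },
  { q: "What is 12 * 8?", a: "96." },
];

function buildPrompt(technique: string, question: string): PromptSpec {
  switch (technique) {
    case "zero-shot":
      // Just the question.
      return { user: question };
    case "few-shot":
      // Three example Q&A pairs, then the question.
      return {
        user:
          FEW_SHOT_EXAMPLES.map((e) => `Q: ${e.q}\nA: ${e.a}`).join("\n\n") +
          `\n\nQ: ${question}\nA:`,
      };
    case "chain-of-thought":
      // Question plus the step-by-step nudge.
      return { user: `${question}\n\nThink step by step.` };
    case "system-prompt":
      // Full system prompt with role, format, and constraints (wording assumed).
      return {
        system:
          "You are a concise expert assistant. Answer in at most three " +
          "sentences, state assumptions explicitly, and do not speculate.",
        user: question,
      };
    default:
      throw new Error(`Unknown technique: ${technique}`);
  }
}
```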
Temperature and model controls
Temperature slider (0.0–1.5) re-runs all four techniques simultaneously. Model selector toggles between Haiku and Sonnet.
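A sketch of how the controls might feed into each request body. The model identifiers, field names, and max_tokens value are assumptions; the clamp range matches the slider described above.

```typescript
// Sketch: build one request body from the shared controls. Field names and
// model IDs are assumed for illustration.
type Model = "claude-haiku" | "claude-sonnet";

function buildRequestBody(model: Model, temperature: number, question: string) {
  // Clamp to the slider's 0.0–1.5 range so stray values never reach the API.
  const t = Math.min(1.5, Math.max(0.0, temperature));
  return {
    model,
    temperature: t,
    max_tokens: 1024, // assumed cap
    messages: [{ role: "user", content: question }],
  };
}
```

Because all four panels share one controls object, moving the slider simply rebuilds four bodies and re-fires the four streams.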
04 · Key Insight
The side-by-side comparison consistently reveals: zero-shot is fast and competent for simple queries; few-shot produces the best tone consistency; chain-of-thought shines on multi-step reasoning but is verbose on simple lookups; system-prompt tuning produces the most predictable output at the highest token cost. Match the technique to the task.
05 · Why It Matters
It makes prompt-engineering intuitions concrete rather than theoretical. The most common reaction: 'I didn't realise chain-of-thought was that verbose for simple questions.'