Streaming Debugger
Inspect every byte of a live LLM stream.
When an LLM streams a response, it does not send plain text. It sends a series of Server-Sent Events (SSE) containing JSON deltas. The Streaming Debugger lets you watch these deltas in real time, decode metadata, and diagnose latency or truncation issues.
Why Streaming Matters
Without streaming, users stare at a blank screen for seconds — or minutes — while the model generates the full response. Streaming delivers tokens as they are generated, making the interface feel alive and responsive.
What the Debugger Shows
1. Raw SSE Chunks
Each chunk is a line starting with data:. The debugger shows the full JSON payload before any client-side parsing:
data: {"id":"msg_01AbC...","type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}
2. Parsed Metadata
For every provider, the debugger extracts the following fields (a normalization sketch follows the table):
| Field | Provider | Meaning |
|---|---|---|
| input_tokens | Anthropic | Tokens in your prompt |
| output_tokens | Anthropic | Tokens generated so far |
| prompt_tokens | OpenAI | Tokens in your prompt |
| completion_tokens | OpenAI | Tokens generated so far |
| finish_reason | OpenAI | Why generation stopped |
| usageMetadata | Gemini | Token counts and model version |
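One way to fold these provider-specific fields into a single shape is sketched below. The ParsedUsage interface and normalizeUsage helper are hypothetical names, and the Gemini promptTokenCount/candidatesTokenCount keys are assumed from its usageMetadata object.

```typescript
// Hypothetical normalized shape; not part of any provider SDK.
interface ParsedUsage {
  promptTokens?: number;
  outputTokens?: number;
  finishReason?: string;
}

// Map one parsed SSE event onto the normalized shape, trying each
// provider's field names from the table above in turn.
function normalizeUsage(event: any): ParsedUsage {
  const u = event.usage;
  // Anthropic: usage.input_tokens / usage.output_tokens
  if (u && (u.input_tokens !== undefined || u.output_tokens !== undefined)) {
    return { promptTokens: u.input_tokens, outputTokens: u.output_tokens };
  }
  // OpenAI: usage.prompt_tokens / usage.completion_tokens, with
  // finish_reason carried on the first choice
  if (u && u.prompt_tokens !== undefined) {
    return {
      promptTokens: u.prompt_tokens,
      outputTokens: u.completion_tokens,
      finishReason: event.choices?.[0]?.finish_reason,
    };
  }
  // Gemini: usageMetadata.promptTokenCount / candidatesTokenCount
  if (event.usageMetadata) {
    return {
      promptTokens: event.usageMetadata.promptTokenCount,
      outputTokens: event.usageMetadata.candidatesTokenCount,
    };
  }
  return {};
}
```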
3. Timing Analysis
- TTFT (Time to First Token): Delay from request to first SSE chunk
- Inter-token Latency: Average milliseconds between chunks
- Total Duration: Wall-clock time from request to final chunk
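All three metrics reduce to simple arithmetic over chunk arrival times. A sketch, assuming the debugger records an epoch-millisecond timestamp for the request start and for each SSE chunk; the analyzeTiming helper is illustrative:

```typescript
// Sketch: derive the three timing metrics from recorded arrival
// times. `requestStart` and `timestamps` are assumed to be epoch
// milliseconds captured as the stream runs.
function analyzeTiming(requestStart: number, timestamps: number[]) {
  if (timestamps.length === 0) return null; // stream never produced a chunk

  const ttft = timestamps[0] - requestStart;
  const totalDuration = timestamps[timestamps.length - 1] - requestStart;

  // Average gap between consecutive chunks (0 if only one chunk).
  let gapSum = 0;
  for (let i = 1; i < timestamps.length; i++) {
    gapSum += timestamps[i] - timestamps[i - 1];
  }
  const interTokenLatency =
    timestamps.length > 1 ? gapSum / (timestamps.length - 1) : 0;

  return { ttft, interTokenLatency, totalDuration };
}
```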
Common Streaming Issues
Truncation
If the stream ends abruptly with a finish reason of max_tokens (Anthropic's stop_reason) or length (OpenAI's finish_reason), the model hit its output limit and the response is incomplete. Increase max_tokens or break the task into smaller prompts.
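Flagging this in client code is a small check on the final event. A sketch, where the isTruncated helper and the exact paths to the stop fields are assumptions based on the event shapes above:

```typescript
// Sketch: inspect the last parsed SSE event for a truncation signal.
// Field paths are assumed from the OpenAI and Anthropic formats
// described above.
function isTruncated(finalEvent: any): boolean {
  const reason =
    finalEvent.choices?.[0]?.finish_reason ?? // OpenAI
    finalEvent.delta?.stop_reason;            // Anthropic message_delta
  return reason === "length" || reason === "max_tokens";
}
```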
Rate Limiting
A 429 Too Many Requests error appears as a terminal SSE event from some providers, or as an HTTP error from others. The debugger highlights these in red and suggests backoff intervals.
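A common recovery pattern is exponential backoff with jitter, honoring a Retry-After header when the provider sends one. A sketch, where the retry cap and delay values are arbitrary starting points, not the debugger's actual suggestions:

```typescript
// Sketch: retry a request on 429 with exponential backoff plus jitter.
// Delays and the retry cap are illustrative defaults.
async function fetchWithBackoff(
  url: string,
  init: RequestInit,
  maxRetries = 5,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429 || attempt >= maxRetries) return res;

    // Prefer the server's Retry-After (seconds) when present;
    // otherwise double the wait each attempt, capped at 30s.
    const retryAfter = Number(res.headers.get("retry-after"));
    const base =
      Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : Math.min(1000 * 2 ** attempt, 30_000);
    const delay = base + Math.random() * 250; // jitter avoids thundering herd
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```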
Empty Deltas
Some providers send "heartbeat" chunks with no new text. The debugger filters these by default but can show them if you toggle "Show Empty Events."
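The default filter amounts to a small predicate over parsed events. A sketch, assuming simplified Anthropic and OpenAI delta shapes; hasText is a hypothetical helper:

```typescript
// Returns true when a parsed SSE event carries new text. The event
// shapes (Anthropic text_delta, OpenAI chat delta) are simplified
// assumptions based on the formats shown earlier.
function hasText(event: any): boolean {
  const text =
    event.delta?.text ??                   // Anthropic text_delta
    event.choices?.[0]?.delta?.content ??  // OpenAI chat delta
    "";
  return text.length > 0;
}

// events.filter(hasText) reproduces the debugger's default view;
// toggling "Show Empty Events" corresponds to skipping this filter.
```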
How to Use It in AIWorkbench.dev
In the workbench, click the Debug toggle next to the output panel. The stream will split into two panes: the rendered Markdown on the left, and the raw SSE inspector on the right. You can pause the stream, copy individual chunks, or export the full trace as JSON.
Key Takeaway
Streaming is not magic — it is a structured data protocol. The debugger turns invisible network events into visible, actionable diagnostics. Use it to understand why a response feels slow, why it cut off, or why a provider rejected your request.