Streaming Debugger
Inspect every byte of a live LLM stream.
When an LLM streams a response, it does not send plain text. It sends a series of Server-Sent Events (SSE) containing JSON deltas. The Streaming Debugger lets you watch these deltas in real time, decode metadata, and diagnose latency or truncation issues.
Why Streaming Matters
Without streaming, users stare at a blank screen for seconds — or minutes — while the model generates the full response. Streaming delivers tokens as they are generated, making the interface feel alive and responsive.
What the Debugger Shows
1. Raw SSE Chunks
Each chunk is a line starting with data:. The debugger shows the full JSON payload before any client-side parsing:
data: {"id":"msg_01AbC...","type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}
2. Parsed Metadata
For every provider, the debugger extracts the following fields (a normalization sketch follows the table):
| Field | Provider | Meaning |
|---|---|---|
| input_tokens | Anthropic | Tokens in your prompt |
| output_tokens | Anthropic | Tokens generated so far |
| prompt_tokens | OpenAI | Tokens in your prompt |
| completion_tokens | OpenAI | Tokens generated so far |
| finish_reason | OpenAI | Why generation stopped |
| usageMetadata | Gemini | Token counts and model version |
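One way to fold these provider-specific fields into a single shape is sketched below. The ParsedUsage interface and normalizeUsage helper are hypothetical names, and the Gemini promptTokenCount/candidatesTokenCount keys are assumed from its usageMetadata object.

```typescript
// Hypothetical normalized shape; not part of any provider SDK.
interface ParsedUsage {
  promptTokens?: number;
  outputTokens?: number;
  finishReason?: string;
}

// Map one parsed SSE event onto the normalized shape, trying each
// provider's field names from the table above in turn.
function normalizeUsage(event: any): ParsedUsage {
  const u = event.usage;
  // Anthropic: usage.input_tokens / usage.output_tokens
  if (u && (u.input_tokens !== undefined || u.output_tokens !== undefined)) {
    return { promptTokens: u.input_tokens, outputTokens: u.output_tokens };
  }
  // OpenAI: usage.prompt_tokens / usage.completion_tokens, with
  // finish_reason carried on the first choice
  if (u && u.prompt_tokens !== undefined) {
    return {
      promptTokens: u.prompt_tokens,
      outputTokens: u.completion_tokens,
      finishReason: event.choices?.[0]?.finish_reason,
    };
  }
  // Gemini: usageMetadata.promptTokenCount / candidatesTokenCount
  if (event.usageMetadata) {
    return {
      promptTokens: event.usageMetadata.promptTokenCount,
      outputTokens: event.usageMetadata.candidatesTokenCount,
    };
  }
  return {};
}
```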
3. Timing Analysis
- TTFT (Time to First Token): Delay from request to first SSE chunk
- Inter-token Latency: Average milliseconds between chunks
- Total Duration: Wall-clock time from request to final chunk
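All three metrics reduce to simple arithmetic over chunk arrival times. A sketch, assuming the debugger records an epoch-millisecond timestamp for the request start and for each SSE chunk; the analyzeTiming helper is illustrative:

```typescript
// Sketch: derive the three timing metrics from recorded arrival
// times. `requestStart` and `timestamps` are assumed to be epoch
// milliseconds captured as the stream runs.
function analyzeTiming(requestStart: number, timestamps: number[]) {
  if (timestamps.length === 0) return null; // stream never produced a chunk

  const ttft = timestamps[0] - requestStart;
  const totalDuration = timestamps[timestamps.length - 1] - requestStart;

  // Average gap between consecutive chunks (0 if only one chunk).
  let gapSum = 0;
  for (let i = 1; i < timestamps.length; i++) {
    gapSum += timestamps[i] - timestamps[i - 1];
  }
  const interTokenLatency =
    timestamps.length > 1 ? gapSum / (timestamps.length - 1) : 0;

  return { ttft, interTokenLatency, totalDuration };
}
```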
Common Streaming Issues
Truncation
If the stream ends abruptly with a finish reason of max_tokens (Anthropic's stop_reason) or length (OpenAI's finish_reason), the model hit its output limit and the response is incomplete. Increase max_tokens or break the task into smaller prompts.
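Flagging this in client code is a small check on the final event. A sketch, where the isTruncated helper and the exact paths to the stop fields are assumptions based on the event shapes above:

```typescript
// Sketch: inspect the last parsed SSE event for a truncation signal.
// Field paths are assumed from the OpenAI and Anthropic formats
// described above.
function isTruncated(finalEvent: any): boolean {
  const reason =
    finalEvent.choices?.[0]?.finish_reason ?? // OpenAI
    finalEvent.delta?.stop_reason;            // Anthropic message_delta
  return reason === "length" || reason === "max_tokens";
}
```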
Rate Limiting
A 429 Too Many Requests error appears as a terminal SSE event from some providers, or as an HTTP error from others. The debugger highlights these in red and suggests backoff intervals.
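A common recovery pattern is exponential backoff with jitter, honoring a Retry-After header when the provider sends one. A sketch, where the retry cap and delay values are arbitrary starting points, not the debugger's actual suggestions:

```typescript
// Sketch: retry a request on 429 with exponential backoff plus jitter.
// Delays and the retry cap are illustrative defaults.
async function fetchWithBackoff(
  url: string,
  init: RequestInit,
  maxRetries = 5,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429 || attempt >= maxRetries) return res;

    // Prefer the server's Retry-After (seconds) when present;
    // otherwise double the wait each attempt, capped at 30s.
    const retryAfter = Number(res.headers.get("retry-after"));
    const base =
      Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : Math.min(1000 * 2 ** attempt, 30_000);
    const delay = base + Math.random() * 250; // jitter avoids thundering herd
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```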
Empty Deltas
Some providers send "heartbeat" chunks with no new text. The debugger filters these by default but can show them if you toggle "Show Empty Events."
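The default filter amounts to a small predicate over parsed events. A sketch, assuming simplified Anthropic and OpenAI delta shapes; hasText is a hypothetical helper:

```typescript
// Returns true when a parsed SSE event carries new text. The event
// shapes (Anthropic text_delta, OpenAI chat delta) are simplified
// assumptions based on the formats shown earlier.
function hasText(event: any): boolean {
  const text =
    event.delta?.text ??                   // Anthropic text_delta
    event.choices?.[0]?.delta?.content ??  // OpenAI chat delta
    "";
  return text.length > 0;
}

// events.filter(hasText) reproduces the debugger's default view;
// toggling "Show Empty Events" corresponds to skipping this filter.
```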
How to Use It in AIWorkbench.dev
In the workbench, click the Debug toggle next to the output panel. The stream will split into two panes: the rendered Markdown on the left, and the raw SSE inspector on the right. You can pause the stream, copy individual chunks, or export the full trace as JSON.
Key Takeaway
Streaming is not magic — it is a structured data protocol. The debugger turns invisible network events into visible, actionable diagnostics. Use it to understand why a response feels slow, why it cut off, or why a provider rejected your request.