Streaming responses (SSE)

Stream tokens incrementally over Server-Sent Events for OpenAI, Anthropic and Gemini protocols.

Turn streaming on with the protocol's own switch (see the table below), then consume events as they arrive instead of waiting for the complete response.

How to enable

Protocol	How to enable
OpenAI Chat / Responses	`"stream": true` in the JSON body
Anthropic	`"stream": true` in the JSON body
Gemini	call the `:streamGenerateContent` method with `?alt=sse` in the query string
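As a rough sketch of how the table translates into raw HTTP requests, the helper below builds the URL and JSON body for each protocol. `build_stream_request` is a hypothetical name, not part of any SDK. The OpenAI and Anthropic paths are the ones the SDK examples on this page resolve to; the Gemini path prefix (`/v1beta/models/...`) is an assumption based on Google's public API layout and may differ on this gateway.

```python
BASE_URL = "https://portal.torouter.ai"  # host taken from the SDK examples on this page


def build_stream_request(protocol: str, model: str, prompt: str) -> tuple[str, dict]:
    """Return (url, json_body) for a streaming call in the given protocol."""
    if protocol == "openai":
        # Streaming is switched on inside the JSON body.
        return (
            f"{BASE_URL}/v1/chat/completions",
            {
                "model": model,
                "stream": True,
                "messages": [{"role": "user", "content": prompt}],
            },
        )
    if protocol == "anthropic":
        # Same body flag; Anthropic also requires max_tokens.
        return (
            f"{BASE_URL}/v1/messages",
            {
                "model": model,
                "max_tokens": 256,
                "stream": True,
                "messages": [{"role": "user", "content": prompt}],
            },
        )
    if protocol == "gemini":
        # No body flag: streaming is selected by the method name and ?alt=sse.
        # ASSUMPTION: the /v1beta/models prefix mirrors Google's public API.
        return (
            f"{BASE_URL}/v1beta/models/{model}:streamGenerateContent?alt=sse",
            {"contents": [{"role": "user", "parts": [{"text": prompt}]}]},
        )
    raise ValueError(f"unknown protocol: {protocol!r}")


url, body = build_stream_request("openai", "gpt-5", "Count to 5")
print(url)
print(body["stream"])
```

Keeping the per-protocol differences in one function makes it easier to see that only Gemini moves the streaming switch from the body to the URL.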

SSE format

Each event arrives as a `data:` line, and events are separated by blank lines. The OpenAI stream terminates with `data: [DONE]`. Anthropic and Gemini use protocol-specific terminator events; Anthropic's final event is `message_stop`.

data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"He"}}]}

data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"llo"}}]}

data: [DONE]
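If you read the stream without an SDK, the framing above can be parsed in a few lines. The following is a minimal sketch for the OpenAI-style stream only; `iter_sse_data` is a hypothetical helper, it ignores `event:` and `id:` fields, and it assumes every payload other than `[DONE]` is JSON.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each SSE event until data: [DONE]."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
            continue
        if line:
            continue  # other fields (event:, id:) and comments are skipped here
        # A blank line closes the current event.
        if not data_parts:
            continue
        payload = "\n".join(data_parts)
        data_parts = []
        if payload == "[DONE]":
            return  # OpenAI-style end-of-stream sentinel
        yield json.loads(payload)


# The sample stream shown above, trimmed to the fields used here.
sample = [
    'data: {"choices":[{"delta":{"content":"He"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"llo"}}]}',
    "",
    "data: [DONE]",
    "",
]
text = "".join(e["choices"][0]["delta"]["content"] for e in iter_sse_data(sample))
print(text)  # Hello
```

In practice `lines` would be the decoded line iterator of an HTTP response; a list is used here so the sketch runs without a network call.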

Python

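# OpenAI protocol: pass stream=True and iterate over the chunks.
from openai import OpenAI

client = OpenAI(api_key="sk-***", base_url="https://portal.torouter.ai/v1")

stream = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True,
)
for chunk in stream:
    # Some chunks (for example a trailing usage chunk) carry no choices.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content
    if delta:  # content is None on role-only and final chunks
        print(delta, end="", flush=True)

# Anthropic protocol (separate script): messages.stream() exposes the text deltas.
from anthropic import Anthropic

client = Anthropic(api_key="sk-***", base_url="https://portal.torouter.ai")

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=256,
    messages=[{"role": "user", "content": "Count to 5"}],
) as stream:
    # text_stream yields only the text fragments, not the raw events.
    for text in stream.text_stream:
        print(text, end="", flush=True)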

Node.js

stream.ts

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-***",
  baseURL: "https://portal.torouter.ai/v1",
});

const stream = await client.chat.completions.create({
  model: "gpt-5",
  messages: [{ role: "user", content: "Count to 5" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

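Reasoning models (e.g. *-thinking, o1, o3) emit reasoning content before the answer. Where supported, it arrives in `delta.reasoning_content`. This field is not part of the OpenAI SDK's typed delta, so read it defensively and expect it to be absent on models that do not expose reasoning.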
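One way to keep reasoning text apart from the answer is to read both fields with `getattr`, so the same loop works whether or not the field is present. This is a sketch under assumptions: `split_delta` is a hypothetical helper, and the `SimpleNamespace` objects stand in for the `chunk.choices[0].delta` values a real stream would produce.

```python
from types import SimpleNamespace


def split_delta(delta) -> tuple[str, str]:
    """Return (reasoning_text, answer_text) for one streamed delta."""
    # getattr with a default avoids AttributeError when a field is missing;
    # "or ''" turns an explicit None into an empty string.
    reasoning = getattr(delta, "reasoning_content", None) or ""
    answer = getattr(delta, "content", None) or ""
    return reasoning, answer


# Stand-in deltas: reasoning first, then the answer, as described above.
fake_deltas = [
    SimpleNamespace(reasoning_content="1 + 1 is 2. ", content=None),
    SimpleNamespace(content="The answer is 2."),
]

reasoning_parts: list[str] = []
answer_parts: list[str] = []
for delta in fake_deltas:
    reasoning, answer = split_delta(delta)
    reasoning_parts.append(reasoning)
    answer_parts.append(answer)

print("reasoning:", "".join(reasoning_parts))
print("answer:", "".join(answer_parts))
```

With a live stream, the loop body would call `split_delta(chunk.choices[0].delta)` after checking that `chunk.choices` is not empty.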

Next steps

OpenAI API

Full endpoint reference.

Anthropic API

/v1/messages details.

Error handling

Stream disconnects and retries.
