Streaming responses (SSE)
All three protocols (OpenAI, Anthropic, and Gemini) stream tokens incrementally over Server-Sent Events. Turn on the streaming switch for the protocol you use, then consume the events one at a time.
How to enable
| Protocol | Switch |
|---|---|
| OpenAI Chat / Responses | Add "stream": true to the JSON body |
| Anthropic | Add "stream": true to the JSON body |
| Gemini | Use the :streamGenerateContent method with ?alt=sse |
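
As a rough sketch of how the switches in the table map onto raw HTTP requests, the helper below builds the URL and JSON body for each protocol. `build_stream_request` is a hypothetical helper, not part of any SDK, and the endpoint paths mirror the upstream vendor APIs; whether this gateway exposes them under exactly these paths is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://portal.torouter.ai"  # gateway host used elsewhere on this page


def build_stream_request(protocol: str, model: str, body: dict) -> tuple[str, dict]:
    """Return (url, json_body) with streaming enabled for the given protocol.

    Hypothetical helper: the paths mirror the upstream vendor APIs and are
    assumed, not confirmed, for this gateway.
    """
    if protocol == "openai":
        # OpenAI Chat Completions: streaming is a flag in the JSON body.
        return f"{BASE_URL}/v1/chat/completions", {**body, "model": model, "stream": True}
    if protocol == "anthropic":
        # Anthropic Messages: same idea, "stream": true in the JSON body.
        return f"{BASE_URL}/v1/messages", {**body, "model": model, "stream": True}
    if protocol == "gemini":
        # Gemini: streaming is selected by the method name plus ?alt=sse,
        # so the JSON body is passed through unchanged.
        query = urlencode({"alt": "sse"})
        return f"{BASE_URL}/v1beta/models/{model}:streamGenerateContent?{query}", dict(body)
    raise ValueError(f"unknown protocol: {protocol!r}")
```

The returned pair can be handed to any HTTP client that supports reading the response body incrementally.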
SSE format
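Each event consists of a data: line followed by a blank line. An OpenAI stream ends with data: [DONE]; Anthropic and Gemini streams end with the terminating event defined by their respective protocols.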
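
For clients that read the stream without an SDK, the framing described above can be parsed with a few lines of code. This is a minimal sketch, assuming the response has already been split into text lines; `iter_sse_data` and `collect_openai_text` are hypothetical helper names, and only the `data:` field is handled.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One optional space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and comments are ignored in this sketch.
    if buffer:
        # Lenient choice: emit a trailing event even without a final blank line.
        # The SSE specification would discard it instead.
        yield "\n".join(buffer)


def collect_openai_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content from OpenAI-style chunks until [DONE]."""
    parts = []
    for data in iter_sse_data(lines):
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


sample = [
    'data: {"choices":[{"delta":{"content":"He"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"llo"}}]}',
    "",
    "data: [DONE]",
    "",
]
print(collect_openai_text(sample))  # Hello
```

The [DONE] check applies only to OpenAI-style streams; for Anthropic or Gemini the loop would stop on that protocol's own terminating event or when the connection closes.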

    data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"He"}}]}

    data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"llo"}}]}

    data: [DONE]

Python (OpenAI SDK)

    from openai import OpenAI

    client = OpenAI(api_key="sk-***", base_url="https://portal.torouter.ai/v1")

    # stream=True returns an iterator of chunks instead of a single response.
    stream = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": "Count to 5"}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices (for example, usage-only chunks); skip them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)

Python (Anthropic SDK)

    from anthropic import Anthropic

    client = Anthropic(api_key="sk-***", base_url="https://portal.torouter.ai")

    # messages.stream() is a context manager; text_stream yields text deltas only.
    with client.messages.stream(
        model="claude-opus-4-7",
        max_tokens=256,
        messages=[{"role": "user", "content": "Count to 5"}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)

Node.js
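
    import OpenAI from "openai";

    const client = new OpenAI({
      apiKey: "sk-***",
      baseURL: "https://portal.torouter.ai/v1",
    });

    // stream: true returns an async iterable of chunks.
    const stream = await client.chat.completions.create({
      model: "gpt-5",
      messages: [{ role: "user", content: "Count to 5" }],
      stream: true,
    });

    for await (const chunk of stream) {
      // Chunks without text content (or without choices) write an empty string.
      process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
    }

Reasoning models (such as *-thinking, o1, o3) emit reasoning fragments before the answer. On endpoints that support it, the OpenAI SDK exposes this content as delta.reasoning_content (when available).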
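
To keep reasoning text apart from the final answer, a client can read both fields from each chunk. The sketch below assumes chunks shaped like the OpenAI SDK's streaming objects; `reasoning_content` is a non-standard field that may be missing, so it is read defensively with `getattr`. `split_stream` and `fake_chunk` are hypothetical names, and the demo chunks are stand-ins rather than real gateway output.

```python
from types import SimpleNamespace


def split_stream(chunks):
    """Return (reasoning, answer) text accumulated from streamed chat chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        delta = chunk.choices[0].delta
        # reasoning_content is not guaranteed to exist; default to None.
        thought = getattr(delta, "reasoning_content", None)
        if thought:
            reasoning.append(thought)
        text = getattr(delta, "content", None)
        if text:
            answer.append(text)
    return "".join(reasoning), "".join(answer)


def fake_chunk(**fields):
    """Build a stand-in object with the chunk.choices[0].delta shape."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(**fields))])


demo = [
    fake_chunk(reasoning_content="1 + 1 is 2. ", content=None),
    fake_chunk(content="2"),
]
reasoning, answer = split_stream(demo)
print("reasoning:", reasoning)
print("answer:", answer)
```

With the real SDK, the `stream` object returned by `client.chat.completions.create(..., stream=True)` could be passed in place of `demo`.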