零度API Documentation
Chat (Streaming)

POST /v1/chat/completions — Streaming chat completion (SSE)

Uses Server-Sent Events (SSE) to stream chat results back, producing a typewriter effect. Suited to real-time interactive scenarios.

POST https://api000.com/v1/chat/completions

The request is identical to the regular Chat endpoint; just add the "stream": true parameter.


Request Examples

cURL

curl https://api000.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxxxxxxxx" \
  -d '{
    "model": "gpt-4o",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Tell me a joke about programmers"}
    ]
  }'

Python (recommended)

from openai import OpenAI

client = OpenAI(
    base_url="https://api000.com/v1",
    api_key="sk-xxxxxxxxxxxxxxxx"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    stream=True,
    messages=[{"role": "user", "content": "Tell me a joke about programmers"}]
)

# Print each chunk as it arrives for a typewriter effect
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content is not None:
        print(delta.content, end="", flush=True)

print()  # trailing newline
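If you also need the complete reply afterwards (e.g. to append to the conversation history), collect the deltas while printing. A minimal self-contained sketch of the accumulation pattern, using a mock deltas list in place of the real stream's delta.content values:

```python
# Mock stand-in for the content values yielded by the real stream;
# None mimics chunks whose delta carries no content (e.g. the final one).
deltas = ["Once", " upon", " a", " time", None]

pieces = []
for content in deltas:
    if content is not None:
        print(content, end="", flush=True)  # typewriter effect
        pieces.append(content)              # keep for the full reply

full_reply = "".join(pieces)
print()
```

In the real loop, `content` would be `chunk.choices[0].delta.content` from the stream above.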

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api000.com/v1",
  apiKey: "sk-xxxxxxxxxxxxxxxx",
});

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  stream: true,
  messages: [{ role: "user", content: "Tell me a joke" }],
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}

Streaming Response Format (SSE)

The server continuously pushes events prefixed with data:, one JSON chunk per line:

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
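If you consume the stream without the official SDK (e.g. reading lines off a raw HTTP response), each line can be handled with a small parser. A minimal sketch; a production client would also need to handle partial lines, keep-alive comments, and reconnection:

```python
import json

def parse_sse_line(line: str):
    """Parse one SSE line; return the chunk dict, or None for blanks/[DONE]."""
    line = line.strip()
    if not line.startswith("data: "):
        return None  # blank separator line or non-data field
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None  # end-of-stream marker
    return json.loads(payload)

sample = ('data: {"id":"chatcmpl-abc","object":"chat.completion.chunk",'
          '"created":1677652288,"model":"gpt-4o","choices":[{"index":0,'
          '"delta":{"content":"Once"},"finish_reason":null}]}')
chunk = parse_sse_line(sample)
print(chunk["choices"][0]["delta"].get("content"))  # → Once
```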

Streaming Chunk Fields

Field                      Description
choices[].delta.content    Text content of the current chunk (may be an empty string)
choices[].delta.role       Present only in the first chunk, with the value "assistant"
choices[].finish_reason    null while streaming; "stop" (or another value) on the final chunk
data: [DONE]               End-of-stream marker; stop reading once it arrives
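Putting the fields above together: a sketch that folds an already-parsed chunk sequence (dicts shaped like the SSE examples) back into one complete message. The sample chunks here are illustrative:

```python
def fold_chunks(chunks):
    """Combine streamed chunk dicts into the final assembled message."""
    role, parts, finish = None, [], None
    for c in chunks:
        choice = c["choices"][0]
        delta = choice["delta"]
        role = delta.get("role", role)            # only the first chunk carries role
        parts.append(delta.get("content") or "")  # content may be missing or ""
        finish = choice["finish_reason"] or finish
    return {"role": role, "content": "".join(parts), "finish_reason": finish}

# Illustrative chunk sequence matching the format above
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": "Once"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": " upon a time"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
]
message = fold_chunks(chunks)
print(message)  # → {'role': 'assistant', 'content': 'Once upon a time', 'finish_reason': 'stop'}
```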

stream_options (Getting Usage Statistics)

Streaming mode does not return token usage by default. Add stream_options to receive it in the final chunk:

stream = client.chat.completions.create(
    model="gpt-4o",
    stream=True,
    stream_options={"include_usage": True},  # enable usage statistics
    messages=[{"role": "user", "content": "Hello"}]
)

for chunk in stream:
    # With include_usage, an extra final chunk carries usage and an empty choices list
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    if chunk.usage:
        print(f"\nTotal tokens: {chunk.usage.total_tokens}")