POST /v1/chat/completions — Streaming chat completion (SSE)

Chat (streaming)

Streams chat results back via Server-Sent Events (SSE) to produce a typewriter effect, suited to real-time interactive scenarios.

POST https://api000.com/v1/chat/completions

Identical to the regular Chat endpoint; just add the "stream": true parameter.
Request example

cURL

```shell
curl https://api000.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxxxxxxxx" \
  -d '{
    "model": "gpt-4o",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Tell me a joke about programmers"}
    ]
  }'
```
Python (recommended)

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api000.com/v1",
    api_key="sk-xxxxxxxxxxxxxxxx"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    stream=True,
    messages=[{"role": "user", "content": "Tell me a joke about programmers"}]
)

# Print chunk by chunk for a typewriter effect
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content is not None:
        print(delta.content, end="", flush=True)
print()  # trailing newline
```
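If you also need the complete reply once streaming finishes, collect the deltas as they arrive. A minimal sketch; it uses stand-in chunk objects (`SimpleNamespace`) instead of a live stream so it runs offline, on the assumption that they mirror the attribute shape of the SDK's streamed chunks — a real `stream` from `client.chat.completions.create(...)` iterates the same way:

```python
from types import SimpleNamespace

def make_chunk(content):
    # Stand-in for one streamed chunk (assumed same attribute shape as the SDK's)
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=content))])

# Simulated stream; in practice, iterate the real `stream` object instead
stream = [make_chunk("Once"), make_chunk(" upon"), make_chunk(None), make_chunk(" a time")]

parts = []
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:  # skip None / empty deltas
        parts.append(delta.content)

full_reply = "".join(parts)
print(full_reply)  # → Once upon a time
```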
Node.js

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api000.com/v1",
  apiKey: "sk-xxxxxxxxxxxxxxxx",
});

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  stream: true,
  messages: [{ role: "user", content: "Tell me a joke" }],
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```
Streaming response format (SSE)

The server keeps pushing event data prefixed with data:, one JSON chunk per line:

```
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4o","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```
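If you consume the raw SSE stream without an SDK, you have to parse these lines yourself. A hedged sketch of a minimal parser (the helper name `collect_stream_text` and the trimmed sample payloads are illustrative, not part of the API); it demonstrates the three rules above: skip non-`data:` lines, stop at `[DONE]`, and ignore empty `delta.content`:

```python
import json

def collect_stream_text(lines):
    """Join the delta.content pieces from 'data:'-prefixed SSE lines."""
    pieces = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):  # first chunk carries "" — skip it
            pieces.append(delta["content"])
    return "".join(pieces)

# Trimmed versions of the sample events shown above
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: [DONE]',
]
print(collect_stream_text(sample))  # → Once upon
```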
Streaming chunk fields

| Field | Description |
|---|---|
| `choices[].delta.content` | Text content of the current chunk (may be an empty string) |
| `choices[].delta.role` | Present only in the first chunk; value is `"assistant"` |
| `choices[].finish_reason` | `null` while streaming; `"stop"` (or another reason) on the final chunk |
| `data: [DONE]` | End-of-stream marker; stop reading once received |
stream_options (getting usage statistics)

Streaming mode does not return token usage by default. Add stream_options to receive it in one final extra chunk:

```python
stream = client.chat.completions.create(
    model="gpt-4o",
    stream=True,
    stream_options={"include_usage": True},  # enable usage reporting
    messages=[{"role": "user", "content": "Hello"}]
)

for chunk in stream:
    if chunk.usage:  # only the final chunk carries usage
        print(f"Total tokens: {chunk.usage.total_tokens}")
```
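One pitfall worth noting: with include_usage enabled, the final usage-bearing chunk has an empty choices list, so an unguarded chunk.choices[0] raises IndexError. A hedged sketch of a loop that guards that access, again using stand-in objects (assumed to mirror the SDK's chunk shape) so it runs offline:

```python
from types import SimpleNamespace

# Stand-in chunks (assumption: with include_usage, the last chunk
# has usage set and an empty choices list)
stream = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="Hi"))], usage=None),
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=None))], usage=None),
    SimpleNamespace(choices=[], usage=SimpleNamespace(total_tokens=12)),
]

parts = []
total_tokens = None
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:  # guard the empty final chunk
        parts.append(chunk.choices[0].delta.content)
    if chunk.usage:
        total_tokens = chunk.usage.total_tokens

print("".join(parts), total_tokens)  # → Hi 12
```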