
Chat Completions API streaming

Set "stream": true on POST /v1/chat/completions to receive a standard OpenAI Server-Sent Events stream. Your existing parser works unchanged.

Request

curl -N https://api.kindo.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KINDO_API_KEY" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write a haiku about observability." }
    ]
  }'

Kindo returns Content-Type: text/event-stream and emits OpenAI’s chunk format. Each event is a JSON payload prefixed with data::

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""}}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Logs flow through"}}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":" the gate;"}}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]

The terminal data: [DONE] line is the canonical end-of-stream marker.
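
If you consume the stream without an SDK, a minimal parser only needs to pick out data: lines and stop at [DONE]. The sketch below uses the httpx library (an assumption; any streaming HTTP client works) and ignores named events and multi-line data fields for brevity:

import json
import os

import httpx

# Minimal sketch: stream the same request as the curl example and print deltas.
payload = {
    "model": "claude-sonnet-4-5-20250929",
    "stream": True,
    "messages": [{"role": "user", "content": "Write a haiku about observability."}],
}
headers = {"Authorization": f"Bearer {os.environ['KINDO_API_KEY']}"}

with httpx.stream("POST", "https://api.kindo.ai/v1/chat/completions",
                  json=payload, headers=headers, timeout=None) as response:
    for line in response.iter_lines():
        if not line.startswith("data: "):
            continue  # skip blank separators and named-event lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # canonical end-of-stream marker
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)

A production consumer should follow the full SSE framing rules; this loop is only meant to show the data:/[DONE] contract.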

Event types

The stream emits unnamed data: lines. Each line is either an incremental chunk or the terminal marker:

Event      Meaning
data:      Incremental chat.completion.chunk JSON payload.
[DONE]     End-of-stream marker.

Example stream

Each data: JSON object follows OpenAI’s chat.completion.chunk schema:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1710000000,
  "model": "claude-sonnet-4-5-20250929",
  "choices": [
    {
      "index": 0,
      "delta": { "content": " the gate;" },
      "finish_reason": null
    }
  ]
}
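
Because each chunk carries only a delta, clients typically concatenate the delta.content values and watch finish_reason to know when the message is complete. A minimal accumulator, assuming the chunks have already been parsed into dictionaries shaped like the example above:

# Sketch: fold a sequence of parsed chunk dicts into the final message text.
def accumulate(chunks):
    parts = []
    finish_reason = None
    for chunk in chunks:
        choice = chunk["choices"][0]
        delta = choice.get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
        if choice.get("finish_reason"):
            finish_reason = choice["finish_reason"]  # e.g. "stop"
    return "".join(parts), finish_reason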

usage is not included in intermediate chunks. In the OpenAI spec, stream_options.include_usage requests usage data in the final chunk; Kindo currently does not populate this field. If you need usage stats, make a non-streaming request or rely on Kindo’s metering.
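
If you do need token counts for a single request, one option is a plain non-streaming call, which returns a usage object on the completion. A sketch with the OpenAI Python SDK (the client setup repeats the SDK consumption example below for completeness):

import os
from openai import OpenAI

client = OpenAI(base_url="https://api.kindo.ai/v1", api_key=os.environ["KINDO_API_KEY"])

# Sketch: same prompt, but non-streaming, so the response carries usage.
completion = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=False,
)
print(completion.usage)  # prompt_tokens, completion_tokens, total_tokens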

SDK consumption

Python

import os
from openai import OpenAI

client = OpenAI(base_url="https://api.kindo.ai/v1", api_key=os.environ["KINDO_API_KEY"])

stream = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.kindo.ai/v1',
  apiKey: process.env.KINDO_API_KEY
});

const stream = await client.chat.completions.create({
  model: 'claude-sonnet-4-5-20250929',
  messages: [{ role: 'user', content: 'Write a haiku.' }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0].delta.content ?? '');
}

Pre-stream errors

If the upstream request fails before streaming begins, Kindo returns a normal HTTP error response with the OpenAI-compatible JSON envelope — no SSE switch. See Errors.
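
With the OpenAI Python SDK, a pre-stream failure surfaces as an exception raised by the create call itself, before any chunk is yielded. A sketch, assuming the client from the SDK consumption example (openai.APIStatusError is the SDK's exception for non-2xx responses):

import openai

# Sketch: a 4xx/5xx before streaming begins raises from the create() call.
try:
    stream = client.chat.completions.create(
        model="claude-sonnet-4-5-20250929",
        messages=[{"role": "user", "content": "Write a haiku."}],
        stream=True,
    )
except openai.APIStatusError as exc:
    # Normal HTTP error with the OpenAI-compatible JSON envelope; no SSE was started.
    print(exc.status_code, exc.response.json())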

Mid-stream errors

Once SSE has started, the outer HTTP status is already committed at 200. Errors that arise after that point are delivered in-band as an event: error SSE frame followed by data: [DONE]:

event: error
data: {"error":{"message":"upstream stream interrupted","type":"server_error","code":"stream_error"}}
data: [DONE]

Treat any event: error payload with code: "stream_error" as terminal — the response will not resume.
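
A hand-rolled parser therefore has to track the optional event: field so an error frame is not parsed as a content chunk. A sketch of a line handler, assuming an iterator of decoded SSE lines such as response.iter_lines() from the earlier httpx example:

import json

# Sketch: yield parsed chunks, stop at [DONE], and raise on event: error frames.
def iter_chunks(lines):
    event_name = None
    for line in lines:
        if line.startswith("event: "):
            event_name = line[len("event: "):].strip()
            continue
        if not line.startswith("data: "):
            event_name = None  # a blank line terminates the current SSE event
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            return  # end-of-stream marker
        payload = json.loads(data)
        if event_name == "error":
            err = payload["error"]
            raise RuntimeError(f"stream error ({err.get('code')}): {err.get('message')}")
        yield payload

Used as for chunk in iter_chunks(response.iter_lines()): ..., the loop ends cleanly on [DONE] and raises as soon as an error frame arrives, since the stream will not resume.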

See also