Chat Completions API streaming
Set "stream": true on POST /v1/chat/completions to receive a
standard OpenAI Server-Sent Events stream. Your existing parser
works unchanged.
Request
```bash
curl -N https://api.kindo.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KINDO_API_KEY" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write a haiku about observability." }
    ]
  }'
```

Kindo returns `Content-Type: text/event-stream` and emits OpenAI's
chunk format. Each event is a JSON payload prefixed with `data: `:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""}}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Logs flow through"}}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":" the gate;"}}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]The terminal data: [DONE] line is the canonical end-of-stream
marker.
Event types
The stream emits unnamed data: lines. Each line is either an incremental chunk or the terminal marker:
| Event | Meaning |
|---|---|
| `data:` | Incremental `chat.completion.chunk` JSON payload. |
| `data: [DONE]` | End-of-stream marker. |
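If you consume the stream without an SDK, a thin line reader is enough. The sketch below is illustrative rather than an official client: it uses httpx (an assumption about your HTTP stack) and only handles the happy-path lines listed in the table; error frames are covered under Mid-stream errors.

```python
# Minimal raw SSE reader for the stream above -- a sketch, not an official client.
import json
import os

import httpx


def stream_completion(prompt: str) -> str:
    """Collect streamed content by reading `data:` lines until [DONE]."""
    payload = {
        "model": "claude-sonnet-4-5-20250929",
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"Authorization": f"Bearer {os.environ['KINDO_API_KEY']}"}
    parts = []
    with httpx.stream(
        "POST",
        "https://api.kindo.ai/v1/chat/completions",
        json=payload,
        headers=headers,
        timeout=None,
    ) as response:
        response.raise_for_status()
        for line in response.iter_lines():
            if not line.startswith("data: "):
                continue  # skip blank separators and any non-data fields
            data = line[len("data: "):]
            if data == "[DONE]":
                break  # canonical end-of-stream marker
            chunk = json.loads(data)
            delta = chunk["choices"][0]["delta"]
            parts.append(delta.get("content") or "")
    return "".join(parts)
```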
Example stream
Each data: JSON object follows OpenAI’s chat.completion.chunk
schema:
{ "id": "chatcmpl-abc123", "object": "chat.completion.chunk", "created": 1710000000, "model": "claude-sonnet-4-5-20250929", "choices": [ { "index": 0, "delta": { "content": " the gate;" }, "finish_reason": null } ]}usage is not included in intermediate chunks. In the OpenAI spec,
stream_options.include_usage requests usage data in the final
chunk; Kindo currently does not populate this field. If you need
usage stats, make a non-streaming request or rely on Kindo’s metering.
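A sketch of that non-streaming fallback, using the same OpenAI Python SDK shown in the next section and assuming, per the note above, that non-streaming responses carry a `usage` object:

```python
# Non-streaming request when you need token usage -- a sketch.
import os

from openai import OpenAI

client = OpenAI(base_url="https://api.kindo.ai/v1", api_key=os.environ["KINDO_API_KEY"])

response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Write a haiku."}],
    # no stream=True: the complete response is returned in one payload
)
print(response.choices[0].message.content)
print(response.usage)  # prompt_tokens, completion_tokens, total_tokens
```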
SDK consumption
```python
import os

from openai import OpenAI

client = OpenAI(base_url="https://api.kindo.ai/v1", api_key=os.environ["KINDO_API_KEY"])

stream = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.kindo.ai/v1',
  apiKey: process.env.KINDO_API_KEY
});

const stream = await client.chat.completions.create({
  model: 'claude-sonnet-4-5-20250929',
  messages: [{ role: 'user', content: 'Write a haiku.' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0].delta.content ?? '');
}
```

Pre-stream errors
If the upstream request fails before streaming begins, Kindo returns a normal HTTP error response with the OpenAI-compatible JSON envelope — no SSE switch. See Errors.
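With the OpenAI Python SDK from the previous section, a pre-stream failure surfaces as an exception before any chunks arrive. A sketch, assuming the SDK's standard APIStatusError is the class you want to catch and that the body follows the envelope described in Errors:

```python
# Catching a pre-stream failure with the OpenAI Python SDK -- a sketch.
# APIStatusError covers non-2xx responses returned before any SSE data.
import os

import openai

client = openai.OpenAI(base_url="https://api.kindo.ai/v1", api_key=os.environ["KINDO_API_KEY"])

try:
    stream = client.chat.completions.create(
        model="claude-sonnet-4-5-20250929",
        messages=[{"role": "user", "content": "Write a haiku."}],
        stream=True,
    )
    for chunk in stream:
        ...
except openai.APIStatusError as err:
    # OpenAI-compatible JSON error envelope; no SSE stream was opened.
    print(err.status_code, err.response.json()["error"]["message"])
```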
Mid-stream errors
Once SSE has started, the outer HTTP status is committed at 200.
Errors that arise after that point arrive as an inline `event: error`
SSE frame followed by `data: [DONE]`:

```text
event: error
data: {"error":{"message":"upstream stream interrupted","type":"server_error","code":"stream_error"}}

data: [DONE]
```

Treat any `event: error` payload with `code: "stream_error"` as
terminal — the response will not resume.
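If you use the raw reader sketched earlier, one way to honor this (a sketch, not a Kindo requirement) is to remember the most recent `event:` field and raise when an error frame's data arrives:

```python
# Handling a mid-stream `event: error` frame -- extends the raw reader sketch.
# Initialize state as {"event": None, "done": False, "text": ""}.
import json


def handle_sse_line(line: str, state: dict) -> None:
    """Fold one SSE line into `state`, raising on terminal error frames."""
    if line.startswith("event: "):
        state["event"] = line[len("event: "):]
    elif line.startswith("data: "):
        data = line[len("data: "):]
        if data == "[DONE]":
            state["done"] = True
        elif state.get("event") == "error":
            err = json.loads(data)["error"]
            # stream_error is terminal: the response will not resume.
            raise RuntimeError(f"{err['code']}: {err['message']}")
        else:
            chunk = json.loads(data)
            state["text"] += chunk["choices"][0]["delta"].get("content") or ""
    elif line == "":
        # Blank separator ends the frame; clear the event name.
        state["event"] = None
```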
See also
- Quickstart — non-streaming round trip.
- Request shape — every field Kindo honors.
- Tool use — streaming tool-call events.
- Errors — pre-stream error envelopes.