Messages API request shape
POST /v1/messages accepts the Anthropic Messages request body.
Kindo validates the core fields and forwards the rest verbatim,
which keeps the endpoint forward-compatible with new Anthropic
fields.
Required fields
| Field | Type | Notes |
|---|---|---|
| model | string | A model ID from GET /v1/models. |
| messages | array | Roles must be user or assistant, must alternate, and the first message must be user. Use the top-level system field for system prompts. |
| max_tokens | integer | Positive integer. |
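
The rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch of such a pre-flight check, not Kindo's validator; the function name check_required_fields and the wording of the messages are hypothetical, and the server may enforce additional rules.

```python
def check_required_fields(body):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []

    if not isinstance(body.get("model"), str):
        problems.append("model must be a string")

    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if (not isinstance(max_tokens, int) or isinstance(max_tokens, bool)
            or max_tokens <= 0):
        problems.append("max_tokens must be a positive integer")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty array")
        return problems

    expected = "user"  # the first message must come from the user
    for i, msg in enumerate(messages):
        role = msg.get("role") if isinstance(msg, dict) else None
        if role not in ("user", "assistant"):
            problems.append(f"messages[{i}].role must be user or assistant")
            break
        if role != expected:
            problems.append(f"messages[{i}].role should be {expected}")
            break
        # Roles must alternate, so flip the expectation for the next message.
        expected = "assistant" if role == "user" else "user"
    return problems
```

Returning a list rather than raising lets a caller report every top-level problem at once; the role scan stops at the first violation because later positions are ambiguous once alternation is broken.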
Standard fields
| Field | Type | Forwarded? | Notes |
|---|---|---|---|
| stream | boolean | N/A | Consumed by the route handler to switch into SSE mode. |
| system | string or array | Yes | Optional system prompt. Use an array of content blocks to attach cache_control (see Prompt caching). |
| tools | array | Yes | Anthropic tool definitions (see Tool use). |
| tool_choice | object | Yes | Anthropic’s standard tool-choice format. |
| temperature | number | Yes | Sampling control. |
| top_p | number | Yes | Nucleus sampling. |
| top_k | integer | Yes | Top-K sampling. |
| stop_sequences | array | Yes | Stop sequences. |
Anything not listed above passes through Kindo’s schema verbatim
because the validator uses passthrough(). Refer to Anthropic’s
Messages spec for full
field semantics.
Anthropic-specific headers
Kindo forwards these headers to the upstream provider when present:
- anthropic-version
- anthropic-beta
- x-claude-code-session-id
Anthropic SDKs and Claude Code set the required protocol headers
for you. Raw HTTP clients should include anthropic-version.
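
For raw HTTP clients, the headers can be assembled in one place. This is a sketch under stated assumptions: the helper name build_headers is hypothetical, and the Bearer authorization scheme and the 2023-06-01 version value are taken from the curl example in the count_tokens section rather than from a header specification.

```python
def build_headers(api_key, anthropic_version="2023-06-01", beta=None):
    """Build request headers for a raw HTTP client (hypothetical helper)."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        # Raw HTTP clients should always send anthropic-version.
        "anthropic-version": anthropic_version,
    }
    if beta:
        # Optional; forwarded to the upstream provider when present.
        headers["anthropic-beta"] = beta
    return headers
```

The resulting dictionary can be passed to any HTTP library that accepts a header mapping.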
Stripped-on-ingress fields
Kindo strips these fields from the outgoing upstream request:
- metadata
- litellm_metadata
- proxy_server_request
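
Because these fields never reach the upstream provider, it can help to preview which parts of a request body survive. The sketch below only illustrates the documented behavior and is not Kindo's implementation; the names STRIPPED_FIELDS and without_stripped_fields are hypothetical, and it assumes the fields are removed at the top level of the body only.

```python
# Field names listed in this section as stripped before the upstream call.
STRIPPED_FIELDS = ("metadata", "litellm_metadata", "proxy_server_request")


def without_stripped_fields(body):
    """Return a shallow copy of body without the stripped top-level fields."""
    return {key: value for key, value in body.items()
            if key not in STRIPPED_FIELDS}
```

A client that depends on metadata reaching the provider can use a check like this in tests to catch the mismatch early.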
Messages array
| Field | Required | Notes |
|---|---|---|
| role | Yes | user or assistant. Must alternate, must start with user. |
| content | Yes | String or array of content blocks. Blocks include text, tool_use, tool_result, image, etc. Forwarded verbatim. |
System prompts go in the top-level system field, not in the
messages array.
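
Since content may be either a string or an array of blocks, client code that wants to attach block-level options often normalizes it first. In the Anthropic Messages format a string is shorthand for a single text block, which the sketch below relies on; the helper name as_content_blocks is hypothetical.

```python
def as_content_blocks(content):
    """Normalize message content to a list of content blocks."""
    if isinstance(content, str):
        # A plain string is shorthand for one text block.
        return [{"type": "text", "text": content}]
    # Already an array of blocks: copy the list so callers can append safely.
    return list(content)
```

After normalization, a field such as cache_control can be set on an individual block without special-casing string content.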
Prompt caching
Block-level cache_control works through Kindo. Add
cache_control to Anthropic content blocks and verify cache hits
from the response usage object.
```json
{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 512,
  "system": [
    {
      "type": "text",
      "text": "You are a release-notes assistant. Always answer in bullet points.",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Summarize the last three deploys for service api." }
      ]
    }
  ]
}
```

Look for these fields in usage:

- cache_creation_input_tokens: tokens written to cache on this request.
- cache_read_input_tokens: tokens read from cache on subsequent requests.
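
Verifying a cache hit amounts to reading those two counters from the response. The following sketch assumes the response has already been parsed into a dictionary; the helper name cache_usage and the shape of its return value are hypothetical, and missing or null counters are treated as zero.

```python
def cache_usage(response_body):
    """Summarize prompt-cache activity from a parsed Messages response."""
    usage = response_body.get("usage") or {}
    # Treat absent or null counters as zero.
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    return {"written": written, "read": read, "hit": read > 0}
```

On a first request with a cacheable block one would expect written to be non-zero, and on a repeat of the same prefix one would expect read to be non-zero.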
Known limitation
Top-level cache_control is not currently forwarded by LiteLLM’s
Anthropic TypedDict layer. Use block-level cache_control
instead.
If your client requires an Anthropic beta header for prompt caching,
use a header value supported by your target model. Internal
end-to-end verification confirmed that
anthropic-beta: prompt-caching-2024-07-31 works with Kindo’s
proxy.
count_tokens
POST /v1/messages/count_tokens counts tokens for a Messages
request without generating a completion.
```bash
curl https://api.kindo.ai/v1/messages/count_tokens \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KINDO_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "system": "You are a concise assistant.",
    "messages": [
      { "role": "user", "content": "Summarize the principle of least privilege." }
    ]
  }'
```

| Field | Required for count_tokens | Notes |
|---|---|---|
| model | Yes | |
| messages | Yes | |
| max_tokens | Optional | |
| stream | Not supported | Omit it or set it to false. |
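
A body written for /v1/messages can be adapted for count_tokens by applying the table above. This is a minimal sketch, assuming all other fields may be sent unchanged; the helper name to_count_tokens_body is hypothetical.

```python
def to_count_tokens_body(messages_body):
    """Derive a count_tokens body from a /v1/messages request body."""
    for field in ("model", "messages"):
        if field not in messages_body:
            raise ValueError(f"{field} is required for count_tokens")
    body = dict(messages_body)  # shallow copy; the caller's dict is untouched
    # stream is not supported here, so omit it entirely.
    body.pop("stream", None)
    # max_tokens is optional for count_tokens, so it is kept if present.
    return body
```

Reusing the same body for both endpoints keeps the counted prompt identical to the one that is eventually sent for completion.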
Response:
```json
{ "input_tokens": 20 }
```

Kindo extension fields
/v1/messages is stock Anthropic — Kindo does not accept a kindo
request block on this surface. Kindo’s opt-in extensions (curated
system prompt, hosted tools, stateful conversations) are available
on /v1/responses; see the
Chat Actions guide.
See also
- Quickstart: minimal client setup.
- Streaming: SSE event types.
- Tool use: tool_use/tool_result round trip.
- Errors: Anthropic error envelope.