Messages API request shape

POST /v1/messages accepts the Anthropic Messages request body. Kindo validates the core fields and forwards the rest verbatim, which keeps the endpoint forward-compatible with new Anthropic fields.

Required fields

| Field | Type | Notes |
| --- | --- | --- |
| model | string | A model ID from GET /v1/models. |
| messages | array | Roles must be user or assistant, must alternate, and the first message must be user. Use the top-level system field for system prompts. |
| max_tokens | integer | Positive integer. |
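
The rules in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not Kindo's validator: the function name `validate_required_fields` and the error strings are hypothetical, and treating an empty messages array as invalid is an assumption drawn from the rule that the first message must be user.

```python
from typing import Any


def validate_required_fields(body: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    errors: list[str] = []

    model = body.get("model")
    if not isinstance(model, str) or not model:
        errors.append("model must be a non-empty string")

    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        errors.append("max_tokens must be a positive integer")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        # Assumption: an empty array cannot satisfy "first message must be user".
        errors.append("messages must be a non-empty array")
        return errors

    expected = "user"  # the first message must be user, then roles alternate
    for index, message in enumerate(messages):
        role = message.get("role") if isinstance(message, dict) else None
        if role not in ("user", "assistant"):
            errors.append(f"messages[{index}].role must be user or assistant")
            break
        if role != expected:
            errors.append(f"messages[{index}].role should be {expected}")
            break
        expected = "assistant" if role == "user" else "user"
    return errors
```

A body that puts a system prompt in the messages array fails the role check, which matches the guidance to use the top-level system field instead.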

Standard fields

| Field | Type | Forwarded? | Notes |
| --- | --- | --- | --- |
| stream | boolean | | Consumed by the route handler to switch into SSE mode. |
| system | string or array | Yes | Optional system prompt. Use an array of content blocks to attach cache_control (see Prompt caching). |
| tools | array | Yes | Anthropic tool definitions (see Tool use). |
| tool_choice | object | Yes | Anthropic’s standard tool-choice format. |
| temperature | number | Yes | Sampling control. |
| top_p | number | Yes | Nucleus sampling. |
| top_k | integer | Yes | Top-K sampling. |
| stop_sequences | array | Yes | Stop sequences. |

Anything not listed above passes through Kindo’s schema verbatim because the validator uses passthrough(). Refer to Anthropic’s Messages spec for full field semantics.

Anthropic-specific headers

Kindo forwards these headers to the upstream provider when present:

  • anthropic-version
  • anthropic-beta
  • x-claude-code-session-id

Anthropic SDKs and Claude Code set the required protocol headers for you. Raw HTTP clients should include anthropic-version.
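
For raw HTTP clients, the headers can be assembled in one place. This is a minimal sketch under stated assumptions: the helper name `build_headers` is hypothetical, the Bearer authorization scheme and the 2023-06-01 version string are taken from the count_tokens example on this page, and the beta value is passed through unchanged without being checked.

```python
from typing import Optional


def build_headers(
    api_key: str,
    anthropic_version: str = "2023-06-01",
    anthropic_beta: Optional[str] = None,
) -> dict[str, str]:
    """Build request headers for a raw HTTP call to /v1/messages."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        # Raw HTTP clients should always send anthropic-version.
        "anthropic-version": anthropic_version,
    }
    if anthropic_beta:
        # Only sent when the caller opts in to a beta feature.
        headers["anthropic-beta"] = anthropic_beta
    return headers
```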

Stripped-on-ingress fields

Kindo strips these fields from the outgoing upstream request:

  • metadata
  • litellm_metadata
  • proxy_server_request
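
To see what the upstream provider would receive, the stripping and passthrough behavior can be approximated on the client. This is an illustration only and not Kindo's implementation: the function name `preview_upstream_body` is hypothetical, and it assumes the three fields are removed at the top level of the body while every other field is kept as sent.

```python
from typing import Any

# Field names taken from the list above.
STRIPPED_FIELDS = ("metadata", "litellm_metadata", "proxy_server_request")


def preview_upstream_body(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the body without the top-level fields Kindo strips."""
    return {key: value for key, value in body.items() if key not in STRIPPED_FIELDS}
```

Fields that are not in the list, including ones this page does not document, survive the preview unchanged, which mirrors the passthrough behavior described under Standard fields.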

Messages array

| Field | Required | Notes |
| --- | --- | --- |
| role | Yes | user or assistant. Must alternate, must start with user. |
| content | Yes | String or array of content blocks. Blocks include text, tool_use, tool_result, image, etc. Forwarded verbatim. |

System prompts go in the top-level system field, not in the messages array.

Prompt caching

Block-level cache_control works through Kindo. Add cache_control to Anthropic content blocks and verify cache hits from the response usage object.

{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 512,
  "system": [
    {
      "type": "text",
      "text": "You are a release-notes assistant. Always answer in bullet points.",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Summarize the last three deploys for service api."
        }
      ]
    }
  ]
}
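
A request that carries its system prompt as a plain string can be converted to the block form shown above. The sketch below is a hypothetical helper (`with_cached_system` is not part of any SDK) and assumes that marking the last system block is what you want, so that the whole system prompt falls inside the cached prefix.

```python
from typing import Any


def with_cached_system(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the body whose system prompt carries block-level cache_control."""
    system = body.get("system")
    if system is None:
        return dict(body)  # nothing to cache

    if isinstance(system, str):
        # Promote the string form to a single text content block.
        blocks = [{"type": "text", "text": system}]
    else:
        # Copy each block so the caller's list is not mutated.
        blocks = [dict(block) for block in system]

    if blocks:
        blocks[-1]["cache_control"] = {"type": "ephemeral"}
    return {**body, "system": blocks}
```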

Look for these fields in usage:

  • cache_creation_input_tokens: tokens written to the cache by this request.
  • cache_read_input_tokens: tokens served from the cache for this request. Expect a nonzero value on later requests that reuse the cached prefix.
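
Checking for a cache hit comes down to reading those two counters from the parsed response. A minimal sketch, assuming the response has already been decoded into a dictionary; the function name `summarize_cache_usage` and the keys of the returned summary are hypothetical.

```python
from typing import Any


def summarize_cache_usage(response: dict[str, Any]) -> dict[str, Any]:
    """Extract the cache counters from a Messages response body."""
    usage = response.get("usage") or {}
    # Treat missing or null counters as zero.
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    return {"cache_written": written, "cache_read": read, "cache_hit": read > 0}
```

On a first request you would typically expect cache_written to be nonzero and cache_hit to be false; a repeat of the same request within the cache lifetime should flip cache_hit to true.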

Known limitation

Top-level cache_control is not currently forwarded by LiteLLM’s Anthropic TypedDict layer. Use block-level cache_control instead.

If your client requires an Anthropic beta header for prompt caching, use a header value supported by your target model. Internal end-to-end verification confirmed anthropic-beta: prompt-caching-2024-07-31 works with Kindo’s proxy.

count_tokens

POST /v1/messages/count_tokens counts tokens for a Messages request without generating a completion.

curl https://api.kindo.ai/v1/messages/count_tokens \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KINDO_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "system": "You are a concise assistant.",
    "messages": [
      { "role": "user", "content": "Summarize the principle of least privilege." }
    ]
  }'

| Field | Required for count_tokens | Notes |
| --- | --- | --- |
| model | Yes | |
| messages | Yes | |
| max_tokens | Optional | |
| stream | Not supported | Omit it or set it to false. |

Response:

{ "input_tokens": 20 }
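
An existing Messages request body can be reused for token counting once stream is removed. The following sketch uses only the Python standard library; the helper names `to_count_tokens_body` and `count_tokens` are hypothetical, and the URL and headers are copied from the curl example above.

```python
import json
import urllib.request
from typing import Any

COUNT_TOKENS_URL = "https://api.kindo.ai/v1/messages/count_tokens"


def to_count_tokens_body(messages_body: dict[str, Any]) -> dict[str, Any]:
    """Copy a Messages body and drop stream, which count_tokens does not support."""
    body = dict(messages_body)
    body.pop("stream", None)
    return body


def count_tokens(messages_body: dict[str, Any], api_key: str) -> int:
    """POST the body to count_tokens and return input_tokens from the response."""
    request = urllib.request.Request(
        COUNT_TOKENS_URL,
        data=json.dumps(to_count_tokens_body(messages_body)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["input_tokens"]
```

Calling `count_tokens(body, api_key)` before the real request is one way to check a prompt against a token budget; max_tokens can stay in the body because it is optional for this endpoint.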

Kindo extension fields

/v1/messages is stock Anthropic — Kindo does not accept a kindo request block on this surface. Kindo’s opt-in extensions (curated system prompt, hosted tools, stateful conversations) are available on /v1/responses; see the Chat Actions guide.

See also