Messages API request shape

POST /v1/messages accepts the Anthropic Messages request body. Kindo validates the core fields and forwards the rest verbatim, which keeps the endpoint forward-compatible with new Anthropic fields.

Required fields

Field	Type	Notes
`model`	string	A model ID from `GET /v1/models`.
`messages`	array	Roles must be `user` or `assistant`, must alternate, and the first message must be `user`. Use the top-level `system` field for system prompts.
`max_tokens`	integer	Positive integer.
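
The rules in this table can be checked on the client before a request is sent. The following is a minimal sketch, not Kindo's validator: the function name `check_required_fields` is hypothetical, it covers only the three required fields, and it assumes `messages` must contain at least one entry (implied by the rule that the first message is `user`).

```python
def check_required_fields(body: dict) -> list[str]:
    """Return a list of problems in a Messages request body (empty if none)."""
    problems = []

    if not isinstance(body.get("model"), str):
        problems.append("model must be a string")

    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (
        not isinstance(max_tokens, int)
        or isinstance(max_tokens, bool)
        or max_tokens <= 0
    ):
        problems.append("max_tokens must be a positive integer")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        # Assumption: an empty array is rejected because there is no first message.
        problems.append("messages must be a non-empty array")
        return problems

    expected = "user"  # the first message must be from the user
    for i, msg in enumerate(messages):
        role = msg.get("role") if isinstance(msg, dict) else None
        if role not in ("user", "assistant"):
            problems.append(f"messages[{i}].role must be 'user' or 'assistant'")
            break
        if role != expected:
            problems.append(f"messages[{i}].role should be '{expected}'")
            break
        expected = "assistant" if role == "user" else "user"

    return problems
```

An empty list means the body satisfies the required-field rules listed here; the server remains the source of truth for validation.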

Standard fields

Field	Type	Forwarded?	Notes
`stream`	boolean	—	Consumed by the route handler to switch into SSE mode.
`system`	string | array	Yes	Optional system prompt. Use an array of content blocks to attach `cache_control` (see Prompt caching).
`tools`	array	Yes	Anthropic tool definitions (see Tool use).
`tool_choice`	object	Yes	Anthropic’s standard tool-choice format.
`temperature`	number	Yes	Sampling control.
`top_p`	number	Yes	Nucleus sampling.
`top_k`	integer	Yes	Top-K sampling.
`stop_sequences`	array	Yes	Stop sequences.

Fields not listed above pass through Kindo’s schema unchanged because the validator uses passthrough(). The exception is the set of fields listed under Stripped-on-ingress fields, which Kindo removes. Refer to Anthropic’s Messages spec for full field semantics.

Anthropic-specific headers

Kindo forwards these headers to the upstream provider when present:

anthropic-version
anthropic-beta
x-claude-code-session-id

Anthropic SDKs and Claude Code set the required protocol headers for you. Raw HTTP clients should include anthropic-version.
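
For a raw HTTP client, the headers might be assembled as in the sketch below. The helper name `build_headers` is hypothetical. The `Authorization: Bearer` scheme and the `2023-06-01` version value are taken from the curl example in the `count_tokens` section and may differ for your deployment.

```python
from typing import Optional


def build_headers(api_key: str, anthropic_beta: Optional[str] = None) -> dict[str, str]:
    """Build request headers for a raw HTTP call to /v1/messages."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        # Raw HTTP clients should always send anthropic-version.
        "anthropic-version": "2023-06-01",
    }
    if anthropic_beta:
        # Optional; forwarded upstream when present.
        headers["anthropic-beta"] = anthropic_beta
    return headers
```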

Stripped-on-ingress fields

Kindo removes these fields from the request body before forwarding it upstream:

metadata
litellm_metadata
proxy_server_request
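
To predict what the upstream provider receives, the effect of this stripping can be modeled as below. This is an illustration only, not Kindo's implementation: the names `STRIPPED_FIELDS` and `strip_ingress_fields` are hypothetical, and the sketch assumes the fields are removed only at the top level of the body.

```python
STRIPPED_FIELDS = ("metadata", "litellm_metadata", "proxy_server_request")


def strip_ingress_fields(body: dict) -> dict:
    """Return a shallow copy of the body without the stripped fields."""
    return {key: value for key, value in body.items() if key not in STRIPPED_FIELDS}
```

Because these fields never reach the upstream provider, clients should not rely on them to carry information through the proxy.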

Messages array

Field	Required	Notes
`role`	Yes	`user` or `assistant`. Must alternate, must start with `user`.
`content`	Yes	String or array of content blocks. Blocks include `text`, `tool_use`, `tool_result`, `image`, etc. Forwarded verbatim.

System prompts go in the top-level system field, not in the messages array.
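
Clients that hold chat history with `system` entries inline (a common shape in other chat APIs) need to move them out before calling this endpoint. A minimal sketch follows. The helper name `hoist_system` is hypothetical, it assumes system content is a plain string, and joining multiple system entries with a blank line is an arbitrary choice.

```python
from typing import Optional


def hoist_system(messages: list[dict]) -> tuple[Optional[str], list[dict]]:
    """Split inline system messages out of a message list.

    Returns (system_text_or_None, remaining_messages).
    """
    system_parts = []
    remaining = []
    for msg in messages:
        if msg.get("role") == "system":
            content = msg.get("content")
            if not isinstance(content, str):
                # Assumption: only string system prompts are handled here.
                raise TypeError("system content must be a string in this sketch")
            system_parts.append(content)
        else:
            remaining.append(msg)
    system = "\n\n".join(system_parts) if system_parts else None
    return system, remaining
```

The returned text goes in the top-level `system` field and the remaining list goes in `messages`; the remaining list still has to alternate roles and start with `user`.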

Prompt caching

Block-level cache_control works through Kindo. Add cache_control to Anthropic content blocks and verify cache hits from the response usage object.

{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 512,
  "system": [
    {
      "type": "text",
      "text": "You are a release-notes assistant. Always answer in bullet points.",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Summarize the last three deploys for service api."
        }
      ]
    }
  ]
}

Look for these fields in usage:

cache_creation_input_tokens: tokens written to the cache by this request.
cache_read_input_tokens: tokens served from the cache on this request. A nonzero value means an earlier request populated the cache.
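
A small helper can turn these two counters into a cache verdict for logging or tests. This is a sketch under assumptions: the function name `cache_status` and the labels `"hit"`, `"write"`, and `"none"` are hypothetical, and missing or null counters are treated as zero.

```python
def cache_status(usage: dict) -> str:
    """Classify a response's usage object by prompt-cache activity."""
    created = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    if read > 0:
        # Some tokens were served from the cache. A request can also write
        # new cache entries at the same time; this sketch still reports "hit".
        return "hit"
    if created > 0:
        return "write"
    return "none"
```

Sending the same cached prefix twice in quick succession would typically be expected to yield `"write"` on the first call and `"hit"` on the second.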

Known limitation

Top-level cache_control is not currently forwarded by LiteLLM’s Anthropic TypedDict layer. Use block-level cache_control instead.

If your client requires an Anthropic beta header for prompt caching, use a header value supported by your target model. Internal end-to-end verification confirmed anthropic-beta: prompt-caching-2024-07-31 works with Kindo’s proxy.

`count_tokens`

POST /v1/messages/count_tokens counts tokens for a Messages request without generating a completion.

curl https://api.kindo.ai/v1/messages/count_tokens \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KINDO_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "system": "You are a concise assistant.",
    "messages": [
      { "role": "user", "content": "Summarize the principle of least privilege." }
    ]
  }'

Field	Required for `count_tokens`	Notes
`model`	Yes
`messages`	Yes
`max_tokens`	Optional
`stream`	Not supported	Omit it or set it to `false`.

Response:

{ "input_tokens": 20 }
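
The same call can be made from Python with only the standard library. The sketch below mirrors the curl example above; the function names are hypothetical, and the URL, `Authorization: Bearer` scheme, and version header are copied from that example.

```python
import json
import urllib.request

KINDO_COUNT_TOKENS_URL = "https://api.kindo.ai/v1/messages/count_tokens"


def build_count_tokens_request(api_key, model, messages, system=None):
    """Build (but do not send) a count_tokens request."""
    body = {"model": model, "messages": messages}
    if system is not None:
        body["system"] = system
    # stream is not supported on this endpoint, so it is never included.
    return urllib.request.Request(
        KINDO_COUNT_TOKENS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def count_tokens(api_key, model, messages, system=None) -> int:
    """Send the request and return the input_tokens value from the response."""
    request = build_count_tokens_request(api_key, model, messages, system)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["input_tokens"]
```

Separating request construction from sending keeps the body and headers easy to inspect in tests without network access.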

Kindo extension fields

/v1/messages is stock Anthropic — Kindo does not accept a kindo request block on this surface. Kindo’s opt-in extensions (curated system prompt, hosted tools, stateful conversations) are available on /v1/responses; see the Chat Actions guide.