Messages API
The Kindo Messages API provides Anthropic-compatible /v1/messages and /v1/messages/count_tokens endpoints on api.kindo.ai. Use it when you want Claude Code or Anthropic SDKs to route through Kindo’s governance pipeline instead of calling Anthropic directly.
With the Messages API, requests still use Anthropic’s native request and response shapes, while Kindo adds API-key auth, model access control, audit logging, DLP enforcement, and credit metering.
Why use the Messages API
- Anthropic-compatible — Works with Claude Code, the Anthropic Python SDK, the Anthropic TypeScript SDK, and raw HTTP clients.
- Governed by Kindo — Requests pass through Kindo auth, model access checks, DLP, and usage metering.
- Supports Anthropic features — Streaming, tool use, and block-level prompt caching work with Kindo’s proxy.
- Uses your Kindo model registry — Use the same model IDs returned by GET /v1/models.
Base URL and authentication
Use the api.kindo.ai domain for the Messages API:
| Endpoint | Method | Purpose |
|---|---|---|
| https://api.kindo.ai/v1/messages | POST | Create a message |
| https://api.kindo.ai/v1/messages/count_tokens | POST | Count tokens for a request |
For self-hosted installations, replace api.kindo.ai with your deployment’s API base URL.
Authentication
Both auth formats work:
| Header | Example | Notes |
|---|---|---|
| Authorization: Bearer | Authorization: Bearer YOUR_API_KEY | Preferred for raw HTTP clients |
| x-api-key | x-api-key: YOUR_API_KEY | Common for Anthropic-compatible clients |
When both headers are present, Authorization: Bearer takes precedence. If the Authorization header is present but malformed, the request is rejected instead of falling back to x-api-key.
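The precedence rule above can be modeled in a few lines of Python. This is a sketch for reasoning about client behavior, not Kindo's server code: `resolve_api_key` is a hypothetical helper, and treating "malformed" as "not a `Bearer <token>` pair" is an assumption, since the source does not define it.

```python
from typing import Mapping, Optional


def resolve_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Model the documented precedence between the two auth headers.

    Hypothetical helper for illustration only.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    authorization = normalized.get("authorization")
    if authorization is not None:
        scheme, _, token = authorization.partition(" ")
        # Assumption: "malformed" means anything other than "Bearer <token>".
        if scheme.lower() != "bearer" or not token.strip():
            # A malformed Authorization header is rejected, never bypassed.
            raise ValueError("malformed Authorization header")
        return token.strip()

    # x-api-key is only consulted when no Authorization header was sent.
    return normalized.get("x-api-key")


print(resolve_api_key({"Authorization": "Bearer key-a", "x-api-key": "key-b"}))
```

In practice this means a client that sends both headers should make sure the `Authorization` value is well formed, because a broken one will not be rescued by a valid `x-api-key`.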
Anthropic headers
Kindo forwards these Anthropic-specific headers to the upstream provider when present:
- anthropic-version
- anthropic-beta
- x-claude-code-session-id
For raw HTTP requests, include anthropic-version. Anthropic SDKs and Claude Code set the required protocol headers for you.
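For raw HTTP clients without an SDK, the required headers can be assembled with the Python standard library. The sketch below only builds the request object and does not send it; `build_messages_request` is a hypothetical helper name, and the `2023-06-01` version value is taken from the curl examples on this page.

```python
import json
import urllib.request


def build_messages_request(
    api_key: str,
    body: dict,
    base_url: str = "https://api.kindo.ai",
) -> urllib.request.Request:
    """Build (but do not send) a raw POST /v1/messages request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            # Raw HTTP clients set this themselves; SDKs add it for you.
            "anthropic-version": "2023-06-01",
        },
    )


request = build_messages_request(
    "YOUR_API_KEY",
    {
        "model": "claude-sonnet-4-5-20250929",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. For self-hosted installations, pass your deployment's API base URL as `base_url`.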
POST /v1/messages
Create a message using Anthropic’s native Messages API format.
```bash
curl https://api.kindo.ai/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 512,
    "messages": [
      {
        "role": "user",
        "content": "Explain Kubernetes pod security policies in plain English."
      }
    ]
  }'
```

Common request fields
Kindo validates the core Anthropic fields and passes through the rest, which keeps the endpoint forward-compatible with Anthropic request bodies.
| Field | Required | Notes |
|---|---|---|
| model | Yes | Must match a model ID available through GET /v1/models |
| messages | Yes | Array of Anthropic messages. Roles must be user or assistant, must alternate, and the first message must be user. Use the top-level system field for system prompts |
| max_tokens | Yes | Positive integer |
| stream | No | Set true for Server-Sent Events |
| system | No | Optional system prompt. Accepts a string or an array of content blocks (use blocks to attach cache_control) |
| tools | No | Anthropic tool definitions are passed through |
| tool_choice | No | Use Anthropic’s standard tool-choice format |
| metadata | No | Accepted, but stripped before proxying upstream |
| Other Anthropic fields | No | Passed through unless Kindo reserves the field for internal routing |
Kindo also strips internal routing fields such as litellm_metadata and proxy_server_request before proxying upstream so clients cannot spoof governance metadata.
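The required-field rules in the table can be checked on the client before a request is sent. The following is a minimal sketch of such a pre-flight check, assuming the rules are exactly as listed above; `validate_messages_request` is a hypothetical helper, and Kindo's own validation may differ in detail and in error wording.

```python
def validate_messages_request(body: dict) -> list:
    """Return a list of problems found in a Messages API request body."""
    problems = []

    if not isinstance(body.get("model"), str) or not body["model"]:
        problems.append("model is required")

    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (
        not isinstance(max_tokens, int)
        or isinstance(max_tokens, bool)
        or max_tokens <= 0
    ):
        problems.append("max_tokens must be a positive integer")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty array")
        return problems

    expected = "user"  # the first message must come from the user
    for index, message in enumerate(messages):
        role = message.get("role") if isinstance(message, dict) else None
        if role not in ("user", "assistant"):
            problems.append(f"messages[{index}].role must be user or assistant")
            break
        if role != expected:
            problems.append(f"messages[{index}].role should be {expected}")
            break
        # Roles must alternate between user and assistant.
        expected = "assistant" if role == "user" else "user"

    return problems


print(validate_messages_request({
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 512,
    "messages": [{"role": "system", "content": "Be brief."}],
}))
```

The sample call reports a role problem because system prompts belong in the top-level `system` field rather than in `messages`.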
Example response
```json
{
  "id": "msg_01Aq9w938a90dw8q",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-5-20250929",
  "content": [
    {
      "type": "text",
      "text": "Pod Security Policies were Kubernetes rules that controlled what a pod was allowed to do..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 24,
    "output_tokens": 134
  }
}
```

POST /v1/messages/count_tokens
Count tokens for a Messages API request without generating a completion.
```bash
curl https://api.kindo.ai/v1/messages/count_tokens \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "system": "You are a concise assistant.",
    "messages": [
      {
        "role": "user",
        "content": "Summarize the principle of least privilege."
      }
    ]
  }'
```

count_tokens behavior
- model and messages are required.
- max_tokens is optional for count_tokens.
- stream is not supported; omit it or set it to false.
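If you already have a /v1/messages request body, the count_tokens rules above suggest a simple way to reuse it. The sketch below assumes that dropping `stream` and leaving everything else in place is enough; `to_count_tokens_body` is a hypothetical helper, not part of any SDK.

```python
def to_count_tokens_body(message_body: dict) -> dict:
    """Derive a count_tokens body from a /v1/messages body."""
    for field in ("model", "messages"):
        if field not in message_body:
            raise ValueError(f"{field} is required")
    # stream is not supported by count_tokens, so leave it out entirely.
    # max_tokens is optional there, so it is kept if it was present.
    return {key: value for key, value in message_body.items() if key != "stream"}


body = to_count_tokens_body({
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 512,
    "stream": True,
    "messages": [{"role": "user", "content": "Summarize least privilege."}],
})
print(sorted(body))
```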
Example response:
```json
{
  "input_tokens": 20
}
```

Streaming
Set "stream": true on POST /v1/messages to receive a standard Anthropic Server-Sent Events stream.
```bash
curl -N https://api.kindo.ai/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 512,
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Write a haiku about observability."
      }
    ]
  }'
```

Kindo returns Content-Type: text/event-stream and preserves Anthropic’s event structure:
| Event | Meaning |
|---|---|
| message_start | Initial message metadata |
| content_block_start | Start of a streamed content block |
| content_block_delta | Incremental content chunk |
| content_block_stop | End of the content block |
| message_delta | Updated usage or stop-reason data |
| message_stop | End of the stream |
| error | Stream interruption after the stream has started |
| ping | Keepalive heartbeat; can be safely ignored |
If the upstream request fails before streaming begins, Kindo returns a normal HTTP error response with Anthropic’s JSON error envelope instead of switching to SSE.
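To make the event table concrete, here is a minimal sketch of a consumer that reads already-decoded SSE lines, ignores `ping`, raises on `error`, and concatenates text deltas. The helper names and the `SAMPLE` lines are illustrative (the sample is hand-written, not captured from Kindo), and the payload layout assumes Anthropic's standard `text_delta` shape. The Anthropic SDKs handle this parsing for you.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, data) pairs from decoded SSE lines."""
    event_name, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            # A blank line terminates one event.
            yield event_name or "message", json.loads("\n".join(data_lines))
            event_name, data_lines = None, []


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate streamed text, following the event table above."""
    parts = []
    for name, data in iter_sse_events(lines):
        if name == "ping":
            continue  # keepalive heartbeat, safe to ignore
        if name == "error":
            # The stream was interrupted after it had already started.
            raise RuntimeError(data.get("error", {}).get("message", "stream error"))
        delta = data.get("delta", {})
        if name == "content_block_delta" and delta.get("type") == "text_delta":
            parts.append(delta["text"])
        if name == "message_stop":
            break
    return "".join(parts)


# Hand-written sample stream for illustration only.
SAMPLE = [
    "event: message_start",
    'data: {"type": "message_start", "message": {"id": "msg_example"}}',
    "",
    "event: ping",
    'data: {"type": "ping"}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Logs "}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "whisper."}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
    "",
]

print(collect_text(SAMPLE))
```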
Prompt caching
Block-level prompt caching works through Kindo. Add cache_control to Anthropic content blocks, then verify caching behavior from the usage object in the response.
```json
{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 512,
  "system": [
    {
      "type": "text",
      "text": "You are a release-notes assistant. Always answer in bullet points.",
      "cache_control": { "type": "ephemeral" }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Summarize the last three deploys for service api."
        }
      ]
    }
  ]
}
```

Look for these fields in usage:
- cache_creation_input_tokens: tokens written to cache on this request
- cache_read_input_tokens: tokens read from cache on subsequent requests
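A small helper can turn those two usage fields into a readable verdict when you are verifying caching during development. This is a sketch: `describe_cache_usage` is a hypothetical name, the sample numbers are made up, and treating a missing or null field as zero is an assumption.

```python
def describe_cache_usage(usage: dict) -> str:
    """Summarize prompt-cache activity from a response's usage object."""
    # Assumption: absent or null fields mean no cache activity.
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    if read > 0:
        return f"cache hit: {read} tokens read"
    if written > 0:
        return f"cache write: {written} tokens written"
    return "no caching reported"


# Illustrative values only, not real measurements.
first_call = {"input_tokens": 12, "cache_creation_input_tokens": 1500}
second_call = {"input_tokens": 12, "cache_read_input_tokens": 1500}
print(describe_cache_usage(first_call))
print(describe_cache_usage(second_call))
```

Typically the first request with a given cached block reports a write and later requests with the same block report a read.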
Known limitation
Top-level cache_control is not currently forwarded by LiteLLM’s Anthropic TypedDict layer. Use block-level cache_control instead.
If your client requires an Anthropic beta header for prompt caching, use a header value supported by your target model. Internal end-to-end verification confirmed that anthropic-beta: prompt-caching-2024-07-31 works with Kindo’s proxy.
Tool use
Anthropic-format tools pass through to the upstream provider. Standard tool_use and tool_result content blocks work without any Kindo-specific wrapping.
```json
{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 512,
  "tools": [
    {
      "name": "get_weather",
      "description": "Get the current weather for a city.",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "What is the weather in San Francisco?"
    }
  ]
}
```

Typical flow:
- Send a request with tools.
- Receive an assistant message containing a tool_use block.
- Execute the tool client-side.
- Send the tool result back in a follow-up user message containing a tool_result block.
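The last two steps of that flow can be sketched as a function that runs local handlers for each tool_use block and builds the follow-up messages. `append_tool_results`, the stubbed `get_weather` handler, and the `toolu_example` ID are all hypothetical; the block fields (`id`, `name`, `input`, `tool_use_id`) follow Anthropic's standard tool-use format.

```python
from typing import Callable, Dict, List


def append_tool_results(
    messages: List[dict],
    assistant_message: dict,
    handlers: Dict[str, Callable[..., str]],
) -> List[dict]:
    """Run requested tools locally and return the extended conversation."""
    tool_results = []
    for block in assistant_message["content"]:
        if block.get("type") != "tool_use":
            continue
        # Execute the tool client-side with the model-supplied arguments.
        output = handlers[block["name"]](**block["input"])
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # ties the result to its request
            "content": output,
        })
    return messages + [
        {"role": "assistant", "content": assistant_message["content"]},
        {"role": "user", "content": tool_results},
    ]


def get_weather(location: str) -> str:
    # Stub: a real handler would call a weather service.
    return f"Sunny in {location}"


history = [{"role": "user", "content": "What is the weather in San Francisco?"}]
assistant = {
    "role": "assistant",
    "content": [
        {"type": "text", "text": "Let me check."},
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "get_weather",
            "input": {"location": "San Francisco"},
        },
    ],
}
next_messages = append_tool_results(history, assistant, {"get_weather": get_weather})
print(next_messages[-1])
```

The returned list becomes the `messages` field of the next POST /v1/messages request, sent with the same `tools` definitions.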
Claude Code setup
Point Claude Code at Kindo by setting two environment variables before launching the CLI.
- Export your Kindo base URL:

  ```bash
  export ANTHROPIC_BASE_URL=https://api.kindo.ai
  ```

- Export your Kindo API key:

  ```bash
  export ANTHROPIC_API_KEY=YOUR_KINDO_API_KEY
  ```

- Start Claude Code normally:

  ```bash
  claude
  ```
Claude Code will send Anthropic-compatible requests to https://api.kindo.ai/v1/messages using your Kindo API key. Kindo also forwards the x-claude-code-session-id header, which preserves Claude Code session context for upstream compatibility.
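Before launching the CLI, a quick pre-flight check can catch a missing or mistyped variable. The sketch below is a hypothetical helper, not part of Claude Code or Kindo. The `/v1` check is an inference from the paragraph above: since requests go to `/v1/messages` under the base URL, the base URL itself presumably should not end in `/v1`.

```python
import os
from typing import List, Mapping


def check_claude_code_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return problems with the two variables Claude Code reads."""
    problems = []
    base_url = env.get("ANTHROPIC_BASE_URL", "")
    if not base_url:
        problems.append("ANTHROPIC_BASE_URL is not set")
    elif base_url.rstrip("/").endswith("/v1"):
        # Assumption: the client appends /v1/messages itself.
        problems.append("ANTHROPIC_BASE_URL should not include /v1")
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


print(check_claude_code_env({
    "ANTHROPIC_BASE_URL": "https://api.kindo.ai/v1",
    "ANTHROPIC_API_KEY": "YOUR_KINDO_API_KEY",
}))
```

Calling `check_claude_code_env()` with no arguments inspects the current process environment.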
Why route Claude Code through Kindo
- Centralized governance — Requests inherit the same audit logging, DLP, and access controls as other Kindo APIs.
- One API key — Use the same Kindo API key you already use for other endpoints.
- Model consistency — Claude Code uses the model IDs your organization exposes through Kindo.
If Claude Code returns a model-not-found error, call GET /v1/models first and use one of the IDs available to your organization.
Client examples
Use whichever Anthropic client fits your workflow.
```bash
curl https://api.kindo.ai/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 512,
    "messages": [
      {
        "role": "user",
        "content": "Give three practical code review tips."
      }
    ]
  }'
```

```python
import os

from anthropic import Anthropic

# Point the SDK at Kindo instead of api.anthropic.com.
client = Anthropic(
    api_key=os.environ["KINDO_API_KEY"],
    base_url="https://api.kindo.ai",
)

message = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=512,
    messages=[
        {
            "role": "user",
            "content": "Explain RBAC in one paragraph.",
        }
    ],
)

print(message.content[0].text)
```

```typescript
import Anthropic from '@anthropic-ai/sdk';

// Point the SDK at Kindo instead of api.anthropic.com.
const client = new Anthropic({
  apiKey: process.env.KINDO_API_KEY,
  baseURL: 'https://api.kindo.ai'
});

const message = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 512,
  messages: [
    {
      role: 'user',
      content: 'Summarize defense in depth in two sentences.'
    }
  ]
});

// Content blocks are a union type, so narrow to a text block before reading .text.
const first = message.content[0];
if (first?.type === 'text') {
  console.log(first.text);
}
```

Differences from Anthropic’s direct API
| Topic | Kindo Messages API | Direct Anthropic API |
|---|---|---|
| Base URL | https://api.kindo.ai | https://api.anthropic.com |
| API key | Kindo API key | Anthropic API key |
| Model names | Must match Kindo model IDs from GET /v1/models | Must match Anthropic-enabled models on your Anthropic account |
| Governance | Includes Kindo auth, access control, DLP, audit logging, and metering | Anthropic-only controls |
| Prompt caching | Block-level cache_control works; top-level cache_control currently does not | Depends on Anthropic’s native support |
Error format
Errors use Anthropic’s standard envelope:
```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens is required"
  }
}
```

Kindo maps common HTTP statuses to Anthropic-style error types:
| Status | Error type |
|---|---|
| 400 | invalid_request_error |
| 401 | authentication_error |
| 403 | permission_error |
| 404 | not_found_error |
| 429 | rate_limit_error |
| 5xx | api_error |
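On the client side, the envelope and the status table can be handled with a couple of small functions. This is a sketch with hypothetical helper names; the mapping is copied from the table above. Treating 429 and 5xx as retryable is a common client convention and an assumption here, not something this page specifies.

```python
import json
from typing import Optional, Tuple

# Copied from the status table above.
STATUS_TO_ERROR_TYPE = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_error",
    404: "not_found_error",
    429: "rate_limit_error",
}


def expected_error_type(status: int) -> Optional[str]:
    """Return the documented error type for an HTTP status, if any."""
    if 500 <= status <= 599:
        return "api_error"
    return STATUS_TO_ERROR_TYPE.get(status)


def parse_error(body_text: str) -> Tuple[str, str]:
    """Extract (error_type, message) from an Anthropic-style envelope."""
    payload = json.loads(body_text)
    if payload.get("type") != "error":
        raise ValueError("not an Anthropic-style error envelope")
    error = payload["error"]
    return error["type"], error["message"]


def is_retryable(status: int) -> bool:
    # Assumption: retry only rate limits and server-side failures.
    return status == 429 or 500 <= status <= 599


sample = '{"type": "error", "error": {"type": "invalid_request_error", "message": "max_tokens is required"}}'
print(parse_error(sample), expected_error_type(400), is_retryable(400))
```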