Infrastructure Requirements

This guide covers all prerequisites and planning considerations before deploying Kindo in a self-managed environment.

Overview

Kindo can be deployed on any Kubernetes cluster (on-premises, AWS, GCP, Azure, or other cloud providers) using Helm charts.

Minimum Infrastructure

| Component | Purpose | Minimum Specification |
| --- | --- | --- |
| Kubernetes Cluster | Application runtime | 3+ nodes, 8 vCPU, 16 GB RAM per node, v1.32+ |
| PostgreSQL | Primary database | PostgreSQL 17+ (17.4 recommended) |
| Redis | Caching and sessions | Redis 7.0+ |
| RabbitMQ | Message queue | RabbitMQ 3.13+ |
| S3-Compatible Storage | File storage | AWS S3, MinIO, or compatible |
| Vector Database | Semantic search | Pinecone (pod-based) or Qdrant (self-hosted) |
| Ingress Controller | Traffic routing | NGINX, Traefik, or similar |
| Certificate Manager | SSL/TLS | cert-manager or manual certs |
| DNS Service | Domain management | Any DNS provider |
| GPU Nodes (optional) | Self-hosted AI models | NVIDIA GPUs with CUDA support |

Deployment Sizing

| Size | Use Case | K8s Nodes | Database |
| --- | --- | --- | --- |
| Dev | Development/Testing | 3 nodes (8 vCPU, 16 GB) | 2 vCPU, 4 GB RAM |
| Small | Teams up to 50 users | 5 nodes (8 vCPU, 16 GB) | 4 vCPU, 8 GB RAM |
| Medium | 50–200 users | 8 nodes (16 vCPU, 32 GB) | 8 vCPU, 16 GB RAM |
| Large | 200–1000 users | 15 nodes (32 vCPU, 64 GB) | 16 vCPU, 32 GB RAM |

Kubernetes Cluster Requirements

Cluster Specifications

Minimum:

  • Kubernetes version 1.32 or higher
  • 3 nodes minimum (for HA)
  • 8 vCPU and 16 GB RAM per node minimum
  • 100 GB SSD per node
  • Network plugin: Calico, Cilium, or Flannel
  • Dynamic storage provisioning enabled
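
Dynamic provisioning depends on a default storage class being present. A quick sanity check, assuming kubectl is pointed at the target cluster:

```shell
# Exactly one storage class should be annotated "(default)"; if none is,
# PVC creation will fail for charts that do not name a class explicitly.
kubectl get storageclass
```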

GPU Node Requirements (for self-hosted models):

  • NVIDIA drivers installed
  • nvidia-container-runtime configured
  • NVIDIA device plugin or GPU Operator installed
  • Nodes labeled for workload targeting:
```shell
kubectl label nodes <gpu-node-name> nvidia.com/gpu=true
kubectl label nodes <gpu-node-name> accelerator=nvidia-h100
```
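
To verify that the labels were applied and that the device plugin is advertising GPUs, something along these lines can be used (the node name is a placeholder):

```shell
# List nodes carrying the GPU label
kubectl get nodes -l nvidia.com/gpu=true

# Confirm the node reports an allocatable nvidia.com/gpu resource
kubectl describe node <gpu-node-name> | grep 'nvidia.com/gpu'
```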

Required Kubernetes Features

  • Storage: Dynamic volume provisioning, default storage class, ReadWriteOnce volumes
  • Networking: LoadBalancer or NodePort support, network policies (recommended), ingress controller
  • RBAC: Enabled (required)

Pre-installed Components

| Component | Purpose |
| --- | --- |
| Ingress Controller | HTTP(S) routing (NGINX, Traefik, or cloud provider) |
| cert-manager | SSL certificate management |
| metrics-server | Resource metrics |

Optional but recommended: External Secrets Operator, Prometheus/Grafana, Loki

Database Requirements

PostgreSQL

Version: 17.0+ (17.4 recommended)

Specifications: 4 vCPU, 8 GB RAM, 100 GB SSD, 200 concurrent connections

Required databases:

```sql
CREATE DATABASE kindo;    -- Main application
CREATE DATABASE unleash;  -- Feature flags
CREATE DATABASE litellm;  -- AI model proxy
CREATE DATABASE ssoready; -- SSO authentication
```

Create dedicated users for each service with appropriate grants. Connection strings follow the format: postgresql://username:password@hostname:5432/database_name
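
As a sketch, a connection string can be assembled from its parts like so (the user, password, and hostname below are placeholders):

```shell
# Build a PostgreSQL connection string from its components.
pg_url() {
  local user="$1" pass="$2" host="$3" port="$4" db="$5"
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$user" "$pass" "$host" "$port" "$db"
}

pg_url kindo_app s3cret db.internal 5432 kindo
# postgresql://kindo_app:s3cret@db.internal:5432/kindo
```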

Production HA: Use streaming replication, managed PostgreSQL (AWS RDS, Cloud SQL, Azure Database), automatic failover, and daily backups.

Redis

Version: 7.0+ (7.2 recommended). Minimum 2 GB memory, 100 concurrent connections.

Kindo uses Redis Streams for real-time conversation streaming between backend workers and the API layer. All stream operations for a given conversation must resolve to the same Redis node. Deploy Redis in standalone mode — a single primary instance with no sharding.

Cloud provider compatibility

AWS ElastiCache — Use Cluster Mode Disabled with a single node (no replicas). Recommended: num_node_groups = 1, replicas_per_node_group = 0, engine version 7.0+. Cluster Mode Enabled with 2+ shards is not supported.

Azure Cache for Redis — Use Basic or Standard tier (single-node, non-clustered). Premium/Enterprise clustered tiers with 2+ shards are not supported.

Google Cloud Memorystore — Use Basic tier (standalone instance). Redis Cluster mode is not supported.

Environment variables

| Variable | Required | Description |
| --- | --- | --- |
| REDIS_URL | Yes | Connection string (e.g., redis://host:6379 or rediss://... for TLS) |
| REDIS_CA_CERT_PEM | No | PEM-encoded CA certificate for TLS with a private CA |
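
For example, a TLS-enabled deployment behind a private CA might set (hostname, password, and certificate path are illustrative):

```shell
export REDIS_URL="rediss://:your-password@redis.internal:6380"
export REDIS_CA_CERT_PEM="$(cat /etc/kindo/certs/redis-ca.pem)"
```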

Streaming tuning (optional — defaults are suitable for most deployments):

| Variable | Default | Description |
| --- | --- | --- |
| STREAM_TTL_SECONDS | 900 (15 min) | How long streams persist before expiring |
| CHAT_STREAM_XREAD_BLOCK_TIME_MS | 10 | Blocking wait time for new events (ms) |
| CHAT_STREAM_MAX_RETRIES | 20 | Retries when waiting for a stream to appear |
| CHAT_STREAM_RETRY_DELAY_MS | 500 | Delay between retries (ms) |

Troubleshooting Redis

Conversations hang with no streaming output — Most commonly caused by a sharded Redis Cluster or unsupported Sentinel/replica deployment. The task worker writes to a stream on one shard or node, but the API reads from a different one where the stream doesn’t exist. Reconfigure to a standalone single-node deployment.

How to check your Redis mode:

```shell
redis-cli INFO server | grep redis_mode
```

  • redis_mode:standalone — compatible
  • redis_mode:sentinel or redis_mode:cluster — not supported; reconfigure to standalone

For AWS ElastiCache, check the replication group’s Cluster Mode setting in the console or via:

```shell
aws elasticache describe-replication-groups \
  --replication-group-id your-group-id \
  --query 'ReplicationGroups[0].ClusterMode'
```

RabbitMQ

Version: 3.13+. 2 vCPU, 2 GB RAM, 20 GB disk. Management plugin enabled.

Production HA: 3+ node cluster with quorum queues, or managed service.

Storage Requirements

S3-Compatible Object Storage

Options: AWS S3, MinIO, GCS (S3 compatibility), Azure Blob (S3 compatibility), Ceph

Required buckets:

| Bucket | Purpose | Access |
| --- | --- | --- |
| kindo-uploads | User file uploads | Private |
| kindo-audit-logs | Compliance audit logs | Private (strict) |
| kindo-backups | Database backups | Private |
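
With the AWS CLI (or against a MinIO endpoint via --endpoint-url), the buckets could be created roughly as follows; S3 bucket names are globally unique on AWS, so treat these names and the region as placeholders:

```shell
for bucket in kindo-uploads kindo-audit-logs kindo-backups; do
  # Create the bucket, then block all forms of public access
  aws s3api create-bucket --bucket "$bucket" --region us-east-1
  aws s3api put-public-access-block --bucket "$bucket" \
    --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
done
```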

External Service Requirements

Vector Database (choose one)

Pinecone (managed): Create a pod-based (not serverless) index with cosine metric and 1536 dimensions.

Qdrant (self-hosted): Deploy in Kubernetes with cosine distance, 1536 vector size. 3+ replicas for HA.
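
For self-hosted Qdrant, a collection matching these parameters can be created through its HTTP API, roughly as follows (the collection name and host are placeholders):

```shell
curl -X PUT "http://qdrant.internal:6333/collections/kindo" \
  -H "Content-Type: application/json" \
  -d '{"vectors": {"size": 1536, "distance": "Cosine"}}'
```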

AI/LLM Services (at least one)

| Provider | Best For |
| --- | --- |
| OpenAI | Most versatile, GPT-4o, o1 |
| Anthropic Claude | Complex reasoning, long context |
| Azure OpenAI | Enterprise compliance |
| Groq | Fast inference, low latency |

For self-hosted models, see GPU requirements:

| Use Case | GPU Requirements |
| --- | --- |
| Embedding models only | 1x GPU, 8 GB+ VRAM |
| Small LLMs (7B–13B) | 1x GPU, 16 GB+ VRAM |
| Medium LLMs (30B–70B) | 2–4x GPUs, 24 GB+ each |
| Large LLMs (70B+) | 4–8x GPUs, 80 GB+ each |

Audit Logging

Syslog server supporting RFC3164, accessible from the cluster on TCP/UDP 514. 1+ year log retention recommended.
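
Reachability of the syslog endpoint can be spot-checked from inside the cluster, for example with the util-linux logger utility (the hostname is a placeholder):

```shell
# Send a test message over TCP 514; use -d instead of -T for UDP
logger -n syslog.internal -P 514 -T "kindo syslog connectivity test"
```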

Email Service

SMTP server, Amazon SES, SendGrid, or Mailgun.

Network and DNS

DNS

Control over a domain or subdomain with the ability to create A/CNAME records.

| Component | Subdomain Example |
| --- | --- |
| Frontend | app.kindo.company.com |
| API | api.kindo.company.com |
| SSOReady | sso.kindo.company.com |
| LiteLLM | litellm.kindo.company.com |
| Unleash | unleash.kindo.company.com |

Firewall Rules

Inbound: 443 (HTTPS), 80 (HTTP redirect)

Outbound: 443 (external APIs), 5432 (PostgreSQL), 6379 (Redis), 5672 (RabbitMQ), 514 (Syslog)

Security

Encryption

  • At rest: PostgreSQL, Redis, S3, and Kubernetes secrets encryption
  • In transit: HTTPS for all web traffic, TLS for database and cache connections

Secret Management

Use External Secrets Operator to sync from AWS Secrets Manager, HashiCorp Vault, Google Secret Manager, or Azure Key Vault.

Required Tools

| Tool | Version |
| --- | --- |
| Helm | 3.8.0+ |
| kubectl | 1.32+ |
| jq | Latest |
| yq | Latest |
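
Installed versions can be confirmed before proceeding:

```shell
helm version --short       # expect v3.8.0 or later
kubectl version --client   # expect v1.32 or later
jq --version
yq --version
```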

Pre-Deployment Checklist

  • Kubernetes cluster provisioned (v1.32+, 3+ nodes)
  • Ingress controller and cert-manager installed
  • PostgreSQL provisioned with all four databases
  • Redis and RabbitMQ provisioned
  • S3-compatible storage with required buckets
  • Vector database configured (Pinecone or Qdrant)
  • At least one AI provider configured
  • Email service credentials obtained
  • Syslog server endpoint accessible
  • DNS records planned
  • SSL certificate strategy decided
  • Helm 3.8+ and kubectl installed
  • Kindo registry credentials received
  • All API keys and passwords stored securely

Next Steps

Proceed to the Installation Guide for step-by-step deployment instructions, or see the AI Model Deployment Guide for detailed guidance on deploying and configuring the AI models that power your Kindo installation.