Kindo Self-Managed Infrastructure Requirements
This document outlines the minimum infrastructure requirements for deploying and running the Kindo application stack in a self-managed environment. These prerequisites are designed to be vendor-agnostic, allowing deployment on various cloud providers, on-premises data centers, or bare metal servers.
1. Kubernetes Cluster Requirements
A Kubernetes cluster compliant with the following specifications is required:
Version: 1.29 or higher.
Architecture:
High Availability (HA) Control Plane recommended for production environments.
Container Runtime: `containerd` 1.6+ or a compatible CRI implementation.
CNI Plugin: A CNI plugin supporting Network Policies is required (e.g., Calico 3.24+, Cilium 1.12+).
Node Requirements:
Minimum Cluster Size (Production Recommendation): 3 Control Plane nodes, 3 Worker nodes. Smaller setups may work for non-production use.
Node Pools/Groups (Recommended): Define node groups or use taints/tolerations and node selectors/affinity to schedule workloads appropriately. Example node purposes and suggested minimum resources:
| Group Purpose | Labels/Taints (Example) | Instance Requirements (Minimum) | Notes |
|---|---|---|---|
| General Workloads | `workload=standard` | 4 vCPU, 16GB RAM | For core services, API, UI, etc. |
| Compute Intensive | `workload=compute` | 8 vCPU, 32GB RAM | For processing-heavy tasks. |
| Data Processing | `workload=llama-indexer` | 8 vCPU, 64GB RAM | Specifically for the `llama-indexer` service. |
| GPU Accelerated (Optional) | `accelerator=nvidia` | 4 vCPU, 16GB RAM + GPU | If GPU-based features are needed. |
Node Configuration: Ensure nodes can pull images from the specified container registry (see Section 9).
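As a sketch, a workload can be pinned to one of these node groups with a `nodeSelector` and a matching toleration. The label and taint key `workload` mirrors the example table above and is not mandated by Kindo:

```yaml
# Example pod-spec fragment pinning a workload to the "compute" node group.
# Assumes nodes are labeled workload=compute and tainted workload=compute:NoSchedule.
spec:
  nodeSelector:
    workload: compute
  tolerations:
    - key: workload
      operator: Equal
      value: compute
      effect: NoSchedule
```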
Essential Kubernetes Components:
Metrics Server: 0.6+ (for HPA and resource metrics).
Cluster Autoscaler: 1.24+ (or equivalent scaling mechanism, highly recommended).
CSI Storage Driver: Compatible with your underlying storage infrastructure to provide PersistentVolumes.
Ingress Controller: Nginx Ingress, Traefik, or similar, to manage external access.
Certificate Management: cert-manager or equivalent, for automated TLS certificate provisioning for ingress resources.
GPU Support (Optional):
If using GPU-accelerated features:
NVIDIA device plugin installed.
Compatible NVIDIA driver (e.g., 525.60.13+).
CUDA Toolkit compatible with application requirements (e.g., 12.0+).
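With the NVIDIA device plugin installed, pods request GPUs through the plugin's extended resource. A minimal container fragment might look like:

```yaml
# Example container resource block requesting one NVIDIA GPU.
# Requires the NVIDIA device plugin from Section 1 to be running.
resources:
  limits:
    nvidia.com/gpu: 1
```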
Bare Metal / Self-Managed Considerations:
Robust `etcd` backup and restore strategy.
Control Plane monitoring and alerting.
Automated Kubernetes certificate rotation.
Adherence to security best practices (e.g., CIS Benchmarks).
2. Database Requirements
Kindo requires access to MySQL and PostgreSQL compatible databases.
MySQL Compatible Database
Version: 8.0 or higher.
Configuration:
A dedicated database (e.g., `kindo_main`) and a user with full privileges on that database.
SSL/TLS encryption for connections is mandatory.
High Availability (HA) setup (e.g., Primary/Replica) recommended for production.
Regular automated backups with point-in-time recovery (PITR) capability.
Performance (Minimum Recommendation):
4 vCPUs, 16GB RAM (adjust based on load).
SSD-backed storage (minimum 50GB, monitor usage).
Sufficient connection limit (e.g., 1000+ concurrent, monitor usage).
PostgreSQL Compatible Database
Version: 14.x or higher.
Configuration:
A dedicated database and user with full privileges on that database.
SSL/TLS encryption for connections is recommended.
High Availability (HA) setup recommended for production.
Regular automated backups.
Performance (Minimum Recommendation):
2 vCPUs, 4GB RAM (adjust based on load).
SSD-backed storage (minimum 10GB, monitor usage).
Note: Some components deployed via Helm (like Unleash) might require their own database instances (typically PostgreSQL). Refer to the specific component documentation and Helm chart values.
3. Message Queue Requirements
A RabbitMQ-compatible message queue service is required.
Version: 3.11 or higher.
Configuration:
High Availability (HA) mode (e.g., Mirrored Queues, Quorum Queues) recommended for production.
Persistent message storage, preferably on SSDs.
TLS encryption for client connections is highly recommended.
A dedicated virtual host (vhost) and user credentials.
Resource Requirements (Example):
| Cluster Size | vCPU (per node) | Memory (per node) | Storage (Total) | Notes |
|---|---|---|---|---|
| 3 Nodes (HA) | 2 | 8GB | 50GB+ SSD | Suitable for moderate load. |
| 5 Nodes (HA) | 4 | 16GB | 200GB+ SSD | For higher throughput/resilience. |
Monitoring: Monitor queue depths, connection counts, consumer acknowledgements, and resource utilization.
4. Cache & Session Storage Requirements
A Redis-compatible caching service is required.
Version: 7.0 or higher.
Configuration:
Cluster Mode enabled for scalability and HA is recommended for production.
Persistence (e.g., AOF with `fsync everysec`) configured according to data durability needs.
TLS encryption for client connections is highly recommended.
Access credentials (password/ACL).
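The persistence and access settings above can be expressed as a `redis.conf` fragment; the directives are standard upstream Redis options, and the values are illustrative:

```
# Illustrative redis.conf fragment — adjust to your durability and security needs.
appendonly yes          # enable AOF persistence
appendfsync everysec    # fsync the AOF roughly once per second
requirepass CHANGE_ME   # or define finer-grained ACL users instead
```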
Performance (Example):
| Node Size | vCPU | Memory | Replicas (if Clustered) | Max Connections |
|---|---|---|---|---|
| Medium | 2 | 8GB | 1-2 per primary | 10,000+ |
| Large | 4 | 16GB | 2-3 per primary | 30,000+ |
Monitoring: Monitor memory usage, CPU utilization, connection count, and cache hit rate.
5. Vector Database Requirements
A Pinecone index or compatible vector database service is required for semantic search capabilities.
Pinecone:
Index Configuration:
Metric: `cosine`
Dimensions: `1536` (confirm this matches the embedding model used).
Pod Type/Size: Select based on expected data volume and query load (e.g., `p1.x1` or higher).
Credentials: Obtain API Key and Environment (Region).
6. Embeddings Generation Service (Optional / Alternative)
By default, Kindo may include services for generating embeddings. However, you might configure it to use an external service.
Self-Hosted via Kindo: Requires GPU-accelerated nodes as specified in Section 1.
Managed Service Alternative: If configuring Kindo to use an external provider:
Obtain API endpoint and credentials.
Ensure network connectivity from the Kubernetes cluster.
Supported Providers Examples: OpenAI Embeddings API, Cohere Embed, Google Vertex AI Embeddings API.
7. Object Storage Requirements
An S3-compatible object storage service is required.
Required Buckets:
`kindo-uploads` (example name):
Purpose: Storage for user-uploaded files.
Access: Read/Write access required for Kindo services. Consider private access controls.
Features: Versioning recommended. Lifecycle policies for cleanup optional.
Configuration:
Must provide an HTTPS endpoint.
Server-side encryption (SSE-S3, SSE-KMS, or equivalent) is highly recommended.
Credentials: Provide S3-compatible access key, secret key, region (if applicable), and bucket names during application configuration.
Supported Providers: AWS S3, Google Cloud Storage, MinIO, etc.
8. Secrets Management Requirements
A secure method for managing and injecting secrets (API keys, database passwords, etc.) into Kindo application pods is required. Using the External Secrets Operator (ESO) is the recommended approach.
ESO Installation: ESO needs to be installed in the Kubernetes cluster (see Runbook).
Secret Backend Provider: You need a secrets management system that ESO supports. Examples:
HashiCorp Vault
AWS Secrets Manager
Azure Key Vault
Google Secret Manager
1Password Secrets Automation
Doppler
etc. (Refer to ESO documentation for the full list and configuration details).
Access Configuration:
Configure ESO to securely authenticate with your chosen backend provider (e.g., using IAM roles, Kubernetes Service Account credentials, Vault AppRole, etc.).
Grant ESO's service account(s) read-only access to the specific secrets required by Kindo applications within your backend provider.
Ensure audit logging is enabled on your secret backend provider to track secret access.
Secrets Structure: Secrets should be stored in the backend provider, organized logically (e.g., one secret entry per Kindo application containing all its required environment variables). Kindo provides environment variable templates (`env-templates`) to guide this.
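Under the one-entry-per-application layout described above, an ESO `ExternalSecret` maps a backend entry to a Kubernetes Secret. This is a sketch: the store name, namespace, and backend key are illustrative, not Kindo-mandated:

```yaml
# Hypothetical ExternalSecret: pulls all env vars for one Kindo app
# from a single backend entry into a Kubernetes Secret.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: kindo-api-env
  namespace: kindo
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-backend          # your configured (Cluster)SecretStore
    kind: ClusterSecretStore
  target:
    name: kindo-api-env       # resulting Kubernetes Secret
  dataFrom:
    - extract:
        key: kindo/api        # backend entry holding the app's env vars
```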
9. Application Distribution Access
Access is required to download Kindo application container images and Helm charts.
Container Images
Registry: Kindo applications are distributed as container images from a private OCI-compliant registry (e.g., `registry.kindo.ai`).
Credentials: You will be provided with credentials (username/password or access token) to access this registry.
Cluster Configuration: Each Kubernetes namespace running Kindo services will need a `kubernetes.io/dockerconfigjson` secret containing these credentials, allowing nodes to pull the images. Alternatively, configure a cluster-wide image pull secret solution.
Network Access: Ensure Kubernetes nodes have network connectivity to the Kindo container registry. Consider IP allowlisting if applicable.
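A namespace-scoped pull secret built from the supplied credentials might look like the following; the secret name, namespace, and placeholder credentials are illustrative:

```yaml
# Example image pull secret; reference its name in pods' imagePullSecrets.
apiVersion: v1
kind: Secret
metadata:
  name: kindo-registry
  namespace: kindo
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: |
    {"auths": {"registry.kindo.ai": {"username": "<user>", "password": "<token>"}}}
```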
Helm Charts
Registry: Kindo Helm charts are distributed via an OCI registry (e.g., `oci://registry.kindo.ai/kindo-helm`).
Credentials: Access to the Helm chart registry requires authentication using the same credentials provided for container images (`helm registry login`).
Tooling: Helm v3.8.0 or higher is required for OCI support.
10. Network Requirements
Connectivity:
Kubernetes cluster nodes must have outbound internet access (for pulling images, external APIs like Pinecone, etc.). Configure HTTP/HTTPS proxy if necessary.
Kubernetes pods must be able to resolve and connect to all provisioned backend services (Databases, RabbitMQ, Redis, Object Storage, Vector DB, Secrets Manager). This typically involves configuring VPC peering, firewall rules, security groups, or network policies.
Ensure proper DNS resolution within the cluster and for external services.
Ingress & Load Balancing:
An Ingress Controller must be installed and configured.
A Load Balancer (cloud provider LB, MetalLB, or similar) must be configured to expose the Ingress Controller to users/clients.
Configure DNS A/CNAME records pointing your desired hostnames (e.g., `kindo.yourcompany.com`, `unleash.yourcompany.com`) to the external IP address of the Load Balancer.
TLS Termination:
TLS termination should ideally happen at the Load Balancer or Ingress Controller level.
Automated certificate management (e.g., using `cert-manager` with Let's Encrypt or your internal CA) is highly recommended.
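A minimal cert-manager issuer for Let's Encrypt could be sketched as follows; the email address and ingress class are placeholders to replace with your own:

```yaml
# Example ClusterIssuer using Let's Encrypt's production ACME endpoint.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@yourcompany.com        # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-key      # ACME account key storage
    solvers:
      - http01:
          ingress:
            class: nginx              # match your Ingress Controller
```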
11. Monitoring & Observability Requirements
Integration with monitoring and logging systems is crucial for operational visibility.
Metrics
Protocol: Kindo applications are instrumented to push metrics using the OpenTelemetry Protocol (OTLP).
Collector: You need to deploy an OpenTelemetry Collector (or an observability platform agent with OTLP ingest capabilities) within or accessible to your Kubernetes cluster. This collector will receive metrics from Kindo applications.
Configuration: Configure Kindo applications (likely via environment variables sourced from your secrets backend) with the endpoint address of your OTLP collector (e.g., `otel-collector.otel-namespace.svc.cluster.local:4317`).
Backend: Configure your OpenTelemetry Collector to export the received metrics to your chosen monitoring backend system (e.g., Prometheus, VictoriaMetrics, Datadog, Dynatrace, etc.).
Dashboards/Alerting: Set up dashboards (e.g., in Grafana) and alerts in your monitoring backend based on the collected metrics (resource utilization, error rates, queue depths, custom application metrics, etc.).
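A minimal Collector pipeline that receives OTLP over gRPC and exposes the metrics for Prometheus scraping could be sketched as (ports and exporter choice depend on your backend):

```yaml
# Illustrative OpenTelemetry Collector config: OTLP in, Prometheus scrape endpoint out.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317     # Kindo apps push OTLP metrics here
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889         # scraped by your Prometheus server
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```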
Logging
Two types of logging need to be handled:
General Application Logs (stdout/stderr):
Collection: Kindo applications primarily log operational information to standard output/error (stdout/stderr) within their containers. A cluster-level logging agent (e.g., Fluentd, Fluent Bit, Vector, Loki Promtail) is required to collect these logs.
Aggregation: Forward collected logs from the agent to your centralized logging system (e.g., ELK, Splunk, Loki, Datadog Logs).
Format: Logs are typically JSON structured. Ensure your logging system can parse this format.
Retention: Define a suitable retention policy (e.g., 30 days minimum) for operational monitoring and troubleshooting.
Audit Logs (via Syslog - Mandatory):
Source: The dedicated `audit-log-exporter` Kindo service streams critical security and audit events.
Requirement: You must provision a Syslog endpoint capable of receiving logs via the RFC 3164 protocol (typically UDP port 514).
Accessibility: This Syslog endpoint must be network-accessible from the Kubernetes cluster where the `audit-log-exporter` pod runs.
Configuration: The `audit-log-exporter` service must be configured (via environment variables sourced from your secrets backend) with the hostname/IP address, port, and protocol (UDP/TCP) of your Syslog server.
Security: Ensure appropriate firewall rules are in place. If your Syslog server supports encrypted Syslog (RFC 5425) and the exporter is configured for it, enable TLS; plain RFC 3164 remains the primary stated requirement.
Retention: Audit logs typically require longer retention periods based on compliance requirements (e.g., 1 year or more). Configure this in your Syslog server/log management system.
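Before wiring up the exporter, it can be useful to sanity-check the Syslog endpoint with a hand-built RFC 3164 packet. This is an illustrative standalone utility, not part of Kindo; the hostname and port in the commented call are placeholders:

```python
import socket
import time

def rfc3164_message(facility: int, severity: int, hostname: str, tag: str, msg: str) -> bytes:
    """Build a minimal RFC 3164 syslog packet: <PRI>TIMESTAMP HOST TAG: MSG."""
    pri = facility * 8 + severity  # e.g. local0.info -> 16*8 + 6 = 134
    t = time.localtime()
    # RFC 3164 timestamp is "Mmm dd hh:mm:ss" with a space-padded day.
    ts = f"{time.strftime('%b', t)} {t.tm_mday:2d} {time.strftime('%H:%M:%S', t)}"
    return f"<{pri}>{ts} {hostname} {tag}: {msg}".encode("ascii")

def send_test_event(host: str, port: int, payload: bytes) -> None:
    """Fire one UDP datagram at the Syslog endpoint."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(payload, (host, port))

packet = rfc3164_message(16, 6, "test-host", "kindo-audit-test", "connectivity check")
# send_test_event("syslog.internal", 514, packet)  # placeholder endpoint
```

If the message does not appear in your Syslog server, check firewall rules and network paths between the cluster and the endpoint; UDP delivery failures are silent.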
Tracing (Optional)
Protocol: Kindo applications may support OpenTelemetry Protocol (OTLP) for distributed tracing.
Collector: If tracing is desired, deploy an OpenTelemetry Collector within the cluster configured to receive OTLP traces.
Backend: Forward traces from the Collector to a compatible tracing backend (e.g., Jaeger, Tempo, Datadog APM, Dynatrace).
12. External Service Integrations
Ensure you have accounts and necessary credentials/configurations for any external services Kindo integrates with:
Email Service (SES/SMTP): For sending notifications (requires credentials, endpoint details).
WorkOS Auth: A WorkOS account is required to configure authentication (SSO, Google Auth, Microsoft Auth).