This document provides detailed instructions for deploying the Kindo platform in a self-managed environment. These instructions are intended for technical operators responsible for infrastructure deployment and application management.
Kindo’s architecture is modular by design, allowing for flexible deployment across different environments, including various cloud providers and on-premises infrastructure. This guide will walk you through the deployment process, starting with infrastructure provisioning and continuing through application deployment.
Architecture Overview
Kindo consists of four primary modules:
kindo-infra: Core infrastructure module (AWS-focused, but replaceable with your own infrastructure)
kindo-secrets: Configuration and secrets management (cloud-agnostic)
kindo-peripheries: Supporting Kubernetes components (Unleash, External Secrets Operator, etc.)
kindo-applications: Core Kindo application services (API, Next.js frontend, LiteLLM, etc.)
Prerequisites
Before beginning the deployment process, ensure you have:
Terraform (version 1.11.3 or later)
Kubernetes cluster (if not using kindo-infra)
kubectl configured with access to your Kubernetes cluster
Helm (version 3.x)
Secret management solution (AWS Secrets Manager, HashiCorp Vault, Doppler, etc.)
Access to Kindo container registry (credentials will be provided separately)
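Before moving on, a quick sanity check of the tooling (assuming kubectl is already pointed at your target cluster, where applicable) can save time later:
# Verify tool versions and access
terraform version
helm version --short
kubectl version --client
kubectl get nodes
aws sts get-caller-identity   # only relevant for AWS-based deployments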
Deployment Options
Kindo can be deployed through two main paths:
AWS-based Deployment: Using kindo-infra to provision all required AWS resources
Bring Your Own Infrastructure (BYOI): Using your existing infrastructure and Kubernetes cluster
Option 1: AWS-based Deployment
This is the fully managed approach using the kindo-infra module, which provides:
EKS Kubernetes cluster
RDS (PostgreSQL)
ElastiCache (Redis)
Amazon MQ (RabbitMQ)
S3 storage
Route53 DNS configuration
Optional Client VPN and Syslog collection
Option 2: Bring Your Own Infrastructure
If you’re using another cloud provider or an on-premises environment, you’ll need to provide:
A Kubernetes cluster
Database services (PostgreSQL and MySQL)
Redis cache
RabbitMQ message broker
Object storage solution
DNS configuration
Deployment Workflow
Regardless of your infrastructure choice, the deployment follows a specific workflow:
Infrastructure: Deploy or identify existing infrastructure resources
Secrets Management: Generate application configuration
Peripheries: Deploy supporting Kubernetes components
Applications: Deploy core Kindo services
Step-by-Step Deployment Guide
Step 1: Infrastructure Provisioning
Option 1A: Using kindo-infra (AWS)
Create a new directory for your deployment:
mkdir kindo-deployment && cd kindo-deployment
Create a main.tf file for kindo-infra:
module "kindo_infra" {
source = "path/to/kindo-infra"
# Required variables
project = "your-project-name"
environment = "prod" # or "dev", "staging", etc.
region = "us-west-2" # AWS region
# Core infrastructure configuration
vpc_cidr_base = "10.0.0.0/16"
availability_zone_names = ["us-west-2a", "us-west-2b", "us-west-2c"]
# EKS configuration
cluster_name = "your-cluster-name"
cluster_version = "1.29"
cluster_endpoint_public_access = true # Set to false for production
cluster_endpoint_private_access = true
# Database configuration
postgres_db_name = "kindo"
mysql_db_name = "kindo_unleash"
# S3 bucket names
s3_uploads_bucket_name = "your-project-uploads"
s3_audit_logs_bucket_name = "your-project-audit-logs"
# DNS configuration
create_public_zone = true
base_domain = "your-domain.com"
create_wildcard_certificate = true
# Additional options as needed
}
output "infrastructure_outputs" {
value = module.kindo_infra
sensitive = true
}
Initialize and apply the Terraform configuration:
terraform init
terraform apply
This process will take approximately 15-25 minutes to complete.
Option 1B: Using Your Own Infrastructure
If you’re bringing your own infrastructure, you’ll need to ensure the following resources are available:
Kubernetes Cluster:
Version 1.23 or higher
Node groups with sufficient resources (see resource planning section below)
Proper network configuration and security settings
Databases:
PostgreSQL 14+ instance
MySQL 8+ instance for Unleash
Ensure proper credentials and network accessibility
Cache and Messaging:
Redis instance
RabbitMQ server
Ensure proper credentials and network accessibility
Storage:
Object storage solution (e.g., S3, GCS, Azure Blob Storage)
Create buckets for uploads and audit logs
DNS:
Configure domain and subdomains as needed
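If you are bringing your own infrastructure, it is worth validating connectivity to each backing service from inside the cluster before continuing. A rough sketch using standard client images (hostnames, credentials, and database names are placeholders for your own values; the RabbitMQ check assumes its management API is enabled):
# PostgreSQL and MySQL reachability from throwaway pods
kubectl run -it --rm pg-check --image=postgres:14 -- psql -h <postgres-host> -U <user> -d kindo -c 'SELECT 1;'
kubectl run -it --rm mysql-check --image=mysql:8 -- mysql -h <mysql-host> -u <user> -p -e 'SELECT 1;'
# Redis and RabbitMQ reachability
kubectl run -it --rm redis-check --image=redis:7 -- redis-cli -h <redis-host> PING
kubectl run -it --rm rmq-check --image=curlimages/curl -- curl -u <user>:<password> http://<rabbitmq-host>:15672/api/overview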
Step 2: Secrets Management
The kindo-secrets module generates application configuration from templates. This step is required regardless of your infrastructure choice.
Create a secrets.tf file:
module "kindo_secrets" {
source = "path/to/kindo-secrets"
# Variables to populate in env templates
template_variables = {
# Core connections
"POSTGRES_URL" = "<Your PostgreSQL connection string>"
"RABBITMQ_URL" = "<Your RabbitMQ connection string>"
"REDIS_URL" = "<Your Redis connection string>"
"MYSQL_URL" = "<Your MySQL connection string>"
# S3/Object storage
"S3_BUCKET" = "<Your uploads bucket name>"
"AUDIT_S3_BUCKET" = "<Your audit logs bucket name>"
# API keys
"ANTHROPIC_API_KEY" = var.anthropic_api_key
"OPENAI_API_KEY" = var.openai_api_key
# Domain configuration
"APP_HOST" = "app.your-domain.com"
"API_HOST" = "api.your-domain.com"
# Additional variables as needed
}
# Additional overrides if necessary
override_values = {
# Uncomment and add overrides if needed
# "api" = {
# "SOME_KEY" = "some_value"
# }
}
}
# Store secrets in your secrets management system
# Example for AWS Secrets Manager:
resource "aws_secretsmanager_secret" "app_configs" {
for_each = module.kindo_secrets.configs
name = "${var.project}-${var.environment}-${each.key}"
}
resource "aws_secretsmanager_secret_version" "app_configs" {
for_each = module.kindo_secrets.configs
secret_id = aws_secretsmanager_secret.app_configs[each.key].id
secret_string = each.value
}
Apply the configuration:
terraform apply
The output will include JSON configurations for each Kindo application.
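To spot-check one of the generated configurations after the apply, you can read it back from your secrets store; for AWS Secrets Manager, a sketch using the naming from the example above (adjust the secret name to match your setup):
aws secretsmanager get-secret-value \
  --secret-id <project>-<environment>-api \
  --query SecretString --output text | jq .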
Step 3: Deploy Peripheries
The kindo-peripheries module deploys essential supporting components on your Kubernetes cluster:
Create a peripheries.tf file:
module "kindo_peripheries" {
source = "path/to/kindo-peripheries"
# Explicitly pass the providers configured in this root module
providers = {
kubernetes = kubernetes
helm = helm
}
# Pass registry credentials
registry_username = var.registry_username
registry_password = var.registry_password
# Pass Cluster Info (for AWS deployments)
aws_region = local.region
cluster_name = local.cluster_name
# OpenTelemetry Collector Configuration (if using EKS)
enable_otel_collector_cr = var.enable_otel_collector_cr
otel_collector_iam_role_arn = module.kindo_infra.otel_collector_iam_role_arn
otel_collector_config_region = local.region
# otel_collector_log_group_name can use default or be set via variable
# otel_collector_namespace can use default or be set via variable
# ExternalDNS Configuration (if using EKS)
enable_external_dns = var.enable_external_dns
external_dns_iam_role_arn = module.kindo_infra.external_dns_iam_role_arn
external_dns_domain_filter = var.base_domain
# Use cluster name as default TXT owner ID if specific one isn't provided
external_dns_txt_owner_id = coalesce(var.external_dns_txt_owner_id, local.cluster_name)
# Peripheries Configuration Map (Helm charts)
peripheries_config = {
# --- ALB Ingress Controller (for AWS) --- #
alb_ingress = {
install = true
helm_chart_version = "1.7.1"
namespace = "kube-system"
create_namespace = false
values_content = templatefile("${path.module}/values/alb-ingress.yaml", {
cluster_name = local.cluster_name
region = local.region
controller_role_arn = module.kindo_infra.alb_controller_role_arn
})
}
# --- Cert Manager --- #
cert_manager = {
install = true
helm_chart_version = "v1.14.5"
namespace = "cert-manager"
create_namespace = true
dynamic_helm_sets = {
"installCRDs" = "true"
}
}
# --- External Secrets Operator --- #
external_secrets_operator = {
install = true
helm_chart_version = "0.9.9"
namespace = "external-secrets"
create_namespace = true
values_content = templatefile("${path.module}/values/external-secrets-operator.yaml", {
role_arn = module.kindo_infra.external_secrets_role_arn != null ? module.kindo_infra.external_secrets_role_arn : ""
})
secret_stores = {
"aws-secrets-manager" = {
provider = "aws"
config = {
service = "SecretsManager"
region = local.region
service_account_name = "external-secrets"
service_account_namespace = "external-secrets"
}
}
}
}
# --- Unleash (Feature Flags) --- #
unleash = {
install = true
helm_chart_version = "5.4.3"
namespace = "unleash"
create_namespace = true
values_content = templatefile("${path.module}/values/unleash.yaml", {
admin_password = local.unleash_admin_password
admin_token = local.unleash_admin_token
client_token = local.unleash_client_token
frontend_token = local.unleash_frontend_token
domain_name = local.domain_name
# WARNING: The following import variables should typically only be set
# during the *initial* deployment to avoid re-importing on restarts
# json_content = file("${path.module}/feature_flags.json")
# import_project = "default"
# import_environment = "development"
})
dynamic_helm_sets = {
"ingress.hosts[0].host" = "unleash.${local.domain_name}"
}
}
# --- Unleash Edge --- #
unleash_edge = {
install = true
helm_chart_version = "3.0.0"
namespace = "unleash"
create_namespace = false # Installs in the same namespace as unleash
values_content = templatefile("${path.module}/values/unleash-edge.yaml", {
unleash_tokens = local.unleash_edge_tokens
domain_name = local.domain_name
})
}
# --- Presidio (PII Detection) --- #
presidio = {
install = true
helm_chart_version = "2.1.95"
namespace = "presidio"
create_namespace = true
values_content = file("${path.module}/values/presidio.yaml")
}
}
depends_on = [
module.kindo_infra,
module.kindo_secrets
]
}
Create values files for each component in a values/ directory. Examples are available in the examples/kindo-with-peripheries/values/ directory.
Apply the configuration:
terraform apply
This step deploys:
Core Components:
Unleash and Unleash Edge (feature flag services)
External Secrets Operator (syncs secrets to Kubernetes)
Cert Manager (certificate management)
ALB Ingress Controller (for AWS) or Ingress NGINX (for other environments)
Optional Components (based on your configuration):
Microsoft Presidio (PII detection and anonymization)
External DNS (automatic DNS record management for AWS Route53)
OpenTelemetry Collector Custom Resource (for use with the EKS ADOT add-on)
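Before moving on to the applications, a quick health check of the periphery workloads (namespaces follow the example configuration above) is worthwhile:
kubectl get pods -n unleash
kubectl get pods -n external-secrets
kubectl get pods -n cert-manager
kubectl get clustersecretstore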
Step 4: Deploy Kindo Applications
The final step deploys the core Kindo applications:
Create an applications.tf file:
module "kindo_applications" {
source = "path/to/kindo-applications"
# Basic configuration
project = var.project
environment = var.environment
domain = "your-domain.com"
# Kubernetes configuration
kubernetes_host = "<your Kubernetes API endpoint>"
kubernetes_token = "<your Kubernetes token>" # Optional
kubernetes_cluster_ca_certificate = var.cluster_ca_cert
# Helm registry access
registry_username = var.kindo_registry_username
registry_password = var.kindo_registry_password
# Application configuration
api_enabled = true
next_enabled = true
litellm_enabled = true
llama_indexer_enabled = true
credits_enabled = true
external_sync_enabled = true
external_poller_enabled = true
audit_log_exporter_enabled = true
cerbos_enabled = true
# Helm values - either files or direct content
api_values_content = file("${path.module}/values/api.yaml")
next_values_content = file("${path.module}/values/next.yaml")
# Add other application values as needed
# Reference to secrets created by External Secrets Operator
api_secret_ref_name = "api-config"
next_secret_ref_name = "next-config"
# Add other secret references as needed
}
Create values files for each application in a values/ directory.
Apply the configuration:
terraform apply
This step deploys all core Kindo services.
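A brief verification that all releases came up cleanly might look like:
helm list -A
kubectl get pods -A
kubectl get ingress -A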
Post-Deployment Configuration
After deployment, perform these additional configuration steps:
Verify Connectivity:
kubectl get pods -n kindo
Configure DNS: Ensure your domain points to the correct ingress endpoints.
Initialize Unleash: Configure feature flags for your environment.
Access the Application: Navigate to your configured domain (e.g., https://app.your-domain.com).
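To confirm DNS and TLS are wired up end to end, a rough sketch (replace the hostnames with your configured domains):
# Check that the app and API hostnames resolve to the ingress load balancer
dig +short app.your-domain.com
dig +short api.your-domain.com
# Check that the frontend answers over HTTPS
curl -sSIL https://app.your-domain.com | head -n 5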
Resource Planning
When planning your deployment, consider these resource requirements:
Minimal Production Environment
Kubernetes:
3-5 nodes (minimum)
16 CPU cores total
64GB RAM total
Databases:
PostgreSQL: db.t3.small (minimum), 20GB storage
MySQL: db.t3.small (minimum), 20GB storage
Cache/Messaging:
Redis: cache.t3.small (minimum)
RabbitMQ: mq.t3.micro (minimum)
Recommended Production Environment
Kubernetes:
5-8 nodes
32 CPU cores total
128GB RAM total
Databases:
PostgreSQL: db.m5.large, 100GB storage
MySQL: db.t3.medium, 50GB storage
Cache/Messaging:
Redis: cache.m5.large
RabbitMQ: mq.m5.large
Maintenance and Operations
Backups
Database Backups: Enable automatic backups for PostgreSQL and MySQL
Object Storage: Implement lifecycle policies for uploads bucket
Kubernetes State: Consider a backup solution for Kubernetes resources
Monitoring
Deploy your preferred monitoring solution (Prometheus/Grafana, Datadog, etc.)
Set up alerts for critical service failures
Logging
Implement centralized logging with your preferred solution
Consider enabling the syslog functionality if using kindo-infra
Troubleshooting
Common Issues
Application Connectivity Issues:
kubectl logs -n kindo deployment/api
kubectl describe pod -n kindo <pod-name>
Database Connection Problems:
Verify security groups/network policies allow access
Check connection strings in secrets
External Secrets Issues:
kubectl get externalsecret -A
kubectl describe externalsecret -n kindo <externalsecret-name>
Ingress Problems:
kubectl get ingress -A
kubectl describe ingress -n kindo <ingress-name>
Upgrading
When upgrading to a new version of Kindo:
Review the release notes for breaking changes
Update module references to new versions in your Terraform configuration
Run terraform plan to preview changes
Apply application updates first, then peripheries
Consider a phased approach for major upgrades
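In practice, a conservative upgrade run following these steps might look like this (plan file name is illustrative):
# Refresh providers and modules after bumping module references
terraform init -upgrade
# Preview and save the plan before applying anything
terraform plan -out=upgrade.tfplan
terraform apply upgrade.tfplan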
Multi-Environment Setup
For organizations requiring multiple environments (dev, staging, prod):
Create separate Terraform workspaces or directories for each environment
Use variable files to manage environment-specific configuration
Consider deploying shared infrastructure components once
Implement proper access controls between environments
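One lightweight way to realize this layout is Terraform workspaces combined with per-environment variable files (names below are illustrative):
terraform workspace new staging   # once per environment
terraform workspace select staging
terraform plan -var-file=staging.tfvars -out=staging.tfplan
terraform apply staging.tfplan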
Security Considerations
API Keys: Securely manage third-party API keys (OpenAI, Anthropic, etc.)
Network Security: Implement proper network segmentation and security groups
Secrets Management: Rotate secrets periodically
Access Control: Implement least privilege principles for all service accounts
Support and Resources
Submit support tickets via your Kindo account portal
Access documentation at docs.kindo.com
Join the Kindo community for questions and best practices
This guide provides a comprehensive overview of the deployment process. For detailed configuration options, consult the README files in each module directory.
Infrastructure Deployment (kindo-infra)
This guide provides detailed instructions for deploying the core AWS infrastructure required by the Kindo platform using the kindo-infra Terraform module.
Overview
The kindo-infra module provisions all necessary AWS resources to host the Kindo application stack, including:
Networking: VPC, subnets, NAT gateways, security groups
Compute: EKS cluster with multiple node groups
Databases: PostgreSQL (RDS)
Caching: Redis (ElastiCache)
Messaging: RabbitMQ (Amazon MQ)
Storage: S3 buckets for uploads and audit logs
DNS & Email: Route 53 (optional) and SES (optional)
Client VPN: Secure access to private resources (optional)
Syslog: Enhanced logging infrastructure (optional)
Prerequisites
AWS account with administrative permissions
Terraform v1.11.3 or later
AWS CLI v2 configured with appropriate credentials
Domain name to delegate (if using the DNS/SES features of the module)
Verifying AWS CLI and Terraform Setup
Before starting the deployment, verify that your tools are correctly configured:
# Verify AWS CLI installation and configuration
aws --version
aws sts get-caller-identity
# Verify Terraform installation
terraform --version
You should see your AWS account information displayed and a Terraform version of 1.11.3 or higher.
DNS Requirements
If you're using DNS features, you'll need DNS records (subdomains under your base domain) for each of the following services:
Frontend
API
Upload
LiteLLM
Unleash
Kindo Payload Contents
From the Kindo payload, you'll need the following:
Environment variable templates for application components (in env_templates/)
Kindo registry credentials (in kindo-registry.tfvars)
Terraform modules (in modules/)
Values files for supporting components (in peripheries-values/ and application-values/)
Unleash feature flags JSON (in feature_flags.json)
Deployment Steps
1. Initialize Your Project
Create a new directory for your Kindo deployment:
mkdir kindo-deployment && cd kindo-deployment
2. Set Up Module References
If using local module paths from the payload, organize them as follows:
# Create a modules directory or use symbolic links
mkdir -p modules
cp -r /path/to/payload/modules/kindo-infra modules/
# or
ln -s /path/to/payload/modules/kindo-infra modules/
3. Create Terraform Configuration
Create a main.tf file with the following content, adjusting the variables to match your requirements:
provider "aws" {
region = var.region
}
module "kindo_infra" {
source = "./modules/kindo-infra" # Use relative path to the module
# Required variables
project = var.project
environment = var.environment
region = var.region
availability_zone_names = var.availability_zone_names
vpc_cidr_base = var.vpc_cidr_base
cluster_name = var.cluster_name
s3_uploads_bucket_name = var.s3_uploads_bucket_name
s3_audit_logs_bucket_name = var.s3_audit_logs_bucket_name
# Optional features based on requirements
enable_client_vpn = var.enable_client_vpn
vpn_users = var.vpn_users
syslog_enabled = var.syslog_enabled
enable_ses = var.enable_ses
base_domain = var.base_domain
create_public_zone = var.create_public_zone
# Additional configuration as needed
}
# Output all infrastructure details for use in later modules
output "infrastructure_outputs" {
value = module.kindo_infra
sensitive = true
}
4. Create Variables Configuration
Create a variables.tf file to define all the variables:
variable "project" {
description = "Project name (e.g., kindo)"
type = string
}
variable "environment" {
description = "Deployment environment (e.g., prod, staging)"
type = string
}
variable "region" {
description = "AWS region to deploy resources"
type = string
}
variable "availability_zone_names" {
description = "List of availability zones to use"
type = list(string)
}
variable "vpc_cidr_base" {
description = "Base CIDR block for VPC"
type = string
default = "10.0.0.0/16"
}
variable "cluster_name" {
description = "Name for the EKS cluster"
type = string
}
variable "s3_uploads_bucket_name" {
description = "Name for the S3 uploads bucket (must be globally unique)"
type = string
}
variable "s3_audit_logs_bucket_name" {
description = "Name for the S3 audit logs bucket (must be globally unique)"
type = string
}
# Optional features
variable "enable_client_vpn" {
description = "Whether to create AWS Client VPN endpoint"
type = bool
default = false
}
variable "vpn_users" {
description = "List of user names for VPN certificate generation"
type = list(string)
default = []
}
variable "syslog_enabled" {
description = "Whether to deploy syslog infrastructure"
type = bool
default = false
}
variable "enable_ses" {
description = "Whether to configure SES for email sending"
type = bool
default = false
}
variable "base_domain" {
description = "Base domain for Route 53 and SES configuration"
type = string
}
variable "create_public_zone" {
description = "Whether to create a Route 53 public hosted zone"
type = bool
default = true
}
# Add other variables as needed
5. Create terraform.tfvars
Create a terraform.tfvars file with your specific values:
project = "kindo"
environment = "prod"
region = "us-west-2"
availability_zone_names = ["us-west-2a", "us-west-2b", "us-west-2c"]
vpc_cidr_base = "10.0.0.0/16"
cluster_name = "kindo-prod-cluster"
s3_uploads_bucket_name = "kindo-prod-uploads-xyz123" # Must be globally unique
s3_audit_logs_bucket_name = "kindo-prod-audit-logs-xyz123" # Must be globally unique
# Optional features
enable_client_vpn = false # Set to true if you need VPN access
vpn_users = [] # Add usernames if enable_client_vpn is true
syslog_enabled = false # Set to true for enhanced logging
enable_ses = true
base_domain = "example.com"
create_public_zone = true
IMPORTANT: S3 bucket names must be globally unique across all AWS accounts. Use a unique suffix like your organization name or a random string.
6. Initialize and Apply Terraform
Initialize Terraform and apply the configuration:
terraform init
terraform plan -out=infra.tfplan
terraform apply infra.tfplan
The infrastructure deployment may take 15-30 minutes to complete. You can monitor the progress in the terminal.
7. Verify Infrastructure Deployment
After the deployment completes, verify that key components are created successfully:
# Verify EKS cluster creation
aws eks describe-cluster --name <cluster-name> --region <region> --query 'cluster.status'
# Configure kubectl to interact with the cluster
aws eks update-kubeconfig --name <cluster-name> --region <region>
# Verify kubectl connection
kubectl get nodes
# Verify RDS instance status
aws rds describe-db-instances --query 'DBInstances[?DBInstanceIdentifier==`<db-id>`].DBInstanceStatus'
Customization Options
EKS Cluster Configuration
The EKS cluster can be customized with the following variables:
Variable | Description | Default |
---|---|---|
cluster_version | Kubernetes version | "1.29" |
cluster_endpoint_public_access | Enable public access to EKS API | false |
cluster_endpoint_private_access | Enable private access to EKS API | true |
Node Groups Configuration
The module creates three types of node groups that can be customized:
General Purpose Workers
general_purpose_workers_instance_types = ["t3.large", "t3a.large"]
general_purpose_workers_min_size = 1
general_purpose_workers_max_size = 5
general_purpose_workers_desired_size = 3
general_purpose_workers_capacity_type = "SPOT" # or "ON_DEMAND"
Memory Optimized Workers
memory_optimized_workers_instance_types = ["r6i.large", "r5.large"]
memory_optimized_workers_min_size = 1
memory_optimized_workers_max_size = 3
memory_optimized_workers_desired_size = 1
Compute Optimized Workers
compute_optimized_workers_instance_types = ["c6i.large", "c5.large"]
compute_optimized_workers_min_size = 1
compute_optimized_workers_max_size = 5
compute_optimized_workers_desired_size = 3
Database Configuration
PostgreSQL RDS instance can be customized with:
postgres_instance_class = "db.t3.medium"
postgres_allocated_storage = 50
postgres_multi_az = true
postgres_deletion_protection = true
postgres_backup_retention_period = 14
Cache and Messaging Configuration
Redis and RabbitMQ can be customized:
# Redis configuration
redis_node_type = "cache.t3.medium"
redis_multi_az_enabled = true
redis_num_node_groups = 2
redis_replicas_per_node_group = 1
# RabbitMQ configuration
rabbitmq_instance_type = "mq.t3.small"
rabbitmq_deployment_mode = "CLUSTER_MULTI_AZ"
Optional Components
Client VPN Configuration (Optional)
The Client VPN component can be enabled with:
enable_client_vpn = true
vpn_users = ["user1", "user2", "user3"]
client_vpn_cidr_block = "192.168.100.0/22"
client_vpn_split_tunnel = true
When enabled, this will generate client certificates for each user in the vpn_users list.
Syslog Infrastructure (Optional)
The enhanced syslog collection stack can be enabled with:
syslog_enabled = true
syslog_forwarder_instance_type = "t3.micro"
syslog_s3_transition_ia_days = 30
syslog_s3_transition_glacier_days = 90
syslog_log_retention_days = 365
This creates a complete logging infrastructure with Kinesis Firehose, S3 storage, and Glue Catalog for Athena queries.
SES for Email
SES configuration can be enabled with:
enable_ses = true
base_domain = "your-domain.com"
Note: After deployment, you will need to manually verify the domain in SES or request production access if you haven't already.
Key Outputs
The module provides numerous outputs that will be used by subsequent modules:
EKS Cluster Details: Endpoint, name, OIDC provider ARN
Database Connection Strings: PostgreSQL connection details
Redis & RabbitMQ Endpoints: Endpoints and credentials
S3 Bucket Information: Bucket names and access details
VPN Configurations: Client VPN endpoint and certificate details (if enabled)
DNS Information: Route 53 zone details (if created)
You can view all outputs after deployment with:
terraform output infrastructure_outputs
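Because the output is marked sensitive, individual values are easier to inspect in JSON form; for example (the specific output key names depend on the module version and are assumptions here):
terraform output -json infrastructure_outputs | jq 'keys'
terraform output -json infrastructure_outputs | jq -r '.eks_cluster_name'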
Next Steps
After successfully deploying the infrastructure:
Use the infrastructure_outputs in the next module
Save the Terraform state securely (consider using a remote backend)
Proceed to Secrets Management to configure application secrets and environment variables
Common Issues and Solutions
Insufficient IAM Permissions
Issue: Deployment fails with permission errors. Solution: Ensure your AWS credentials have sufficient permissions for creating all resources (EKS, RDS, ElastiCache, VPC, IAM roles, etc.)
S3 Bucket Name Conflicts
Issue: S3 bucket creation fails because the bucket name is already taken. Solution: Choose globally unique names for your S3 buckets by adding organization-specific prefixes or random suffixes.
EKS Cluster Access
Issue: Unable to access the EKS cluster API after deployment. Solution: If cluster_endpoint_public_access is set to false, you need to be inside the VPC or connected via VPN to access the cluster. Use the AWS CLI to update your kubeconfig:
aws eks update-kubeconfig --name <cluster-name> --region <region>
Resource Quotas
Issue: Deployment fails with quota exceeded errors. Solution: Check your AWS account quotas for services like VPC, EKS, RDS, ElastiCache, etc., and request increases if needed.
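You can inspect your current limits for the most commonly affected services with the Service Quotas CLI, for example:
# List service codes if you are unsure of the right one
aws service-quotas list-services --output table
aws service-quotas list-service-quotas --service-code vpc --output table
aws service-quotas list-service-quotas --service-code eks --output table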
Secrets Management (kindo-secrets)
This guide explains how to use the kindo-secrets module to generate and manage application configurations for the Kindo platform.
Overview
The kindo-secrets module is responsible for:
Parsing environment variable templates (.env files)
Substituting variables with actual values from infrastructure outputs or manual input
Generating structured JSON configurations
Preparing these configurations for storage in AWS Secrets Manager
This module is cloud-agnostic and does not directly interact with any secrets manager itself. It produces the configuration data, and your Terraform configuration is responsible for storing it in your chosen backend (AWS Secrets Manager in this guide).
Prerequisites
Completed infrastructure deployment with kindo-infra (or your own infrastructure)
Terraform 1.11.3 or later
Access to kindo-secrets module
Access to the .env templates provided in the Kindo payload (located in env_templates/ directory)
External API keys for LLM providers (Anthropic, OpenAI) for testing or production use
Required API Keys
Before proceeding, you'll need to obtain API keys for the following services:
Anthropic API Key: Required for Claude language models
Sign up at Anthropic API
Create an API key in your account dashboard
OpenAI API Key: Required for GPT language models
Sign up at OpenAI API
Create an API key in your account dashboard
Merge API Key (Optional): Required only if using Merge integrations
Sign up at Merge if needed
These API keys will be used in your configuration and securely stored in AWS Secrets Manager.
Deployment Steps
1. Set Up Your Project Structure
If continuing from the infrastructure deployment, use the existing project directory. Otherwise, create a new directory structure:
kindo-deployment/
├── main.tf # Already created for infrastructure
├── variables.tf # Already created for infrastructure
├── terraform.tfvars # Already created for infrastructure
└── env_templates/ # Directory for .env templates
├── api.env
├── next.env
├── litellm.env
├── credits.env
└── ... (other application templates)
2. Copy Environment Templates
Copy the environment templates from the Kindo payload to your project:
mkdir -p env_templates
cp /path/to/kindo/payload/env_templates/* env_templates/
Verify that all necessary templates are copied:
ls -la env_templates/
You should see template files for each application (api.env, next.env, litellm.env, etc.)
3. Add the Secrets Module to Terraform
Update your existing main.tf file to include the kindo-secrets module:
# Existing provider and kindo_infra module declarations...
module "kindo_secrets" {
source = "./modules/kindo-secrets" # Use relative path to the module
template_dir = "${path.module}/env_templates"
# Construct the template_variables map with infrastructure outputs
template_variables = {
# Core identifiers
"PROJECT" = var.project
"ENVIRONMENT" = var.environment
"REGION" = var.region
# Database connections
"POSTGRES_URL" = module.kindo_infra.postgres_connection_string
"REDIS_URL" = module.kindo_infra.redis_connection_string
"RABBITMQ_URL" = module.kindo_infra.rabbitmq_connection_string
# Storage configuration
"storage.access_key" = module.kindo_infra.storage_access_key
"storage.secret_key" = module.kindo_infra.storage_secret_key
"storage.bucket_name" = module.kindo_infra.storage_bucket_name
"storage.region" = module.kindo_infra.storage_region
# Email configuration (if SES is enabled)
"smtp.host" = module.kindo_infra.smtp_host
"smtp.user" = module.kindo_infra.smtp_user
"smtp.password" = module.kindo_infra.smtp_password
"smtp.fromemail" = module.kindo_infra.smtp_fromemail
# Domain configuration
"APP_HOST" = "app.${var.base_domain}"
"API_HOST" = "api.${var.base_domain}"
# Generated secrets
"secrets.nextauthsecret" = random_password.nextauth_secret.result
"secrets.kek" = base64encode(random_bytes.key_encryption_key.hex)
"secrets.litellmapikey" = random_password.litellm_api_key.result
"secrets.litellmadminapikey" = random_password.litellm_admin_api_key.result
"secrets.uminternalapikey" = random_password.um_internal_api_key.result
# External API keys (from variables)
"secrets.merge_api_key" = var.merge_api_key
"secrets.merge_webhook_security" = var.merge_webhook_security
"secrets.anthropic_api_key" = var.anthropic_api_key
"secrets.openai_api_key" = var.openai_api_key
# Unleash configuration (will be filled in by the peripheries module later)
"unleash.client_token" = "placeholder_will_be_updated"
"unleash.frontend_token" = "placeholder_will_be_updated"
"unleash.admin_token" = "placeholder_will_be_updated"
}
# Optional: specific overrides
override_values = {
# Example: Override specific variables for certain applications
# api = {
# "NODE_ENV" = "production"
# }
}
}
# Generate secrets
resource "random_password" "nextauth_secret" {
length = 32
special = true
}
resource "random_bytes" "key_encryption_key" {
length = 32
}
resource "random_password" "litellm_api_key" {
length = 32
special = false
}
resource "random_password" "litellm_admin_api_key" {
length = 32
special = false
}
resource "random_password" "um_internal_api_key" {
length = 32
special = false
}
# Store configurations in AWS Secrets Manager
resource "aws_secretsmanager_secret" "app_configs" {
for_each = module.kindo_secrets.application_configs_json
name = "${var.project}-${var.environment}/${each.key}-app-config"
tags = {
Project = var.project
Environment = var.environment
}
}
resource "aws_secretsmanager_secret_version" "app_configs" {
for_each = module.kindo_secrets.application_configs_json
secret_id = aws_secretsmanager_secret.app_configs[each.key].id
secret_string = each.value
}
4. Add External API Key Variables
Add variables for required external API keys to your variables.tf file:
# External API keys
variable "anthropic_api_key" {
description = "API key for Anthropic"
type = string
sensitive = true
}
variable "openai_api_key" {
description = "API key for OpenAI"
type = string
sensitive = true
}
variable "merge_api_key" {
description = "API key for Merge (if used)"
type = string
sensitive = true
default = ""
}
variable "merge_webhook_security" {
description = "Webhook security token for Merge (if used)"
type = string
sensitive = true
default = ""
}
5. Update Your terraform.tfvars File
Add the external API keys to your terraform.tfvars file:
# Existing infrastructure variables...
# API Keys
anthropic_api_key = "your-anthropic-api-key"
openai_api_key = "your-openai-api-key"
merge_api_key = "" # Optional, leave empty if not used
merge_webhook_security = "" # Optional, leave empty if not used
SECURITY NOTE: For production environments, consider using environment variables or a secure method to pass these sensitive values rather than storing them in your tfvars file.
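One option is environment variables: Terraform automatically reads any variable exported as TF_VAR_<name>, which keeps the keys out of tfvars files and out of version control:
export TF_VAR_anthropic_api_key="<your-anthropic-api-key>"
export TF_VAR_openai_api_key="<your-openai-api-key>"
terraform plan -out=secrets.tfplan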
6. Verify That All Templates Have Been Processed
Before applying the configuration, verify that all required templates are available:
# Check that template directory exists and contains expected files
ls -la env_templates/
# Check template content (optional, for specific template)
cat env_templates/api.env
7. Apply the Configuration
If you're adding this to an existing Terraform deployment, apply the changes:
terraform init
terraform plan -out=secrets.tfplan
terraform apply secrets.tfplan
This will:
Generate the application configurations by parsing templates
Create secrets in AWS Secrets Manager for each application
Store the configuration JSON in these secrets
8. Verify Secret Creation
Verify that the secrets were created successfully in AWS Secrets Manager:
# List secrets with the project prefix
aws secretsmanager list-secrets --query "SecretList[?contains(Name, '<project>-<environment>')].Name" --output table
Template Variables Reference
The template_variables map should include all variables needed by your .env templates. Here's a reference of common variables:
Variable | Description | Source | Required By |
---|---|---|---|
POSTGRES_URL | PostgreSQL connection string | kindo-infra output | API, LiteLLM |
REDIS_URL | Redis connection string | kindo-infra output | API, Next.js |
RABBITMQ_URL | RabbitMQ connection URL | kindo-infra output | API, External services |
storage.access_key | S3 storage access key | kindo-infra output | API, Llama Indexer |
storage.secret_key | S3 storage secret key | kindo-infra output | API, Llama Indexer |
storage.bucket_name | S3 bucket name | kindo-infra output | API, Llama Indexer |
storage.region | AWS region for storage | kindo-infra output | API, Llama Indexer |
APP_HOST | Hostname for the Kindo frontend | Custom value | API, Next.js |
API_HOST | Hostname for the Kindo API | Custom value | API, Next.js |
secrets.kek | Key Encryption Key | Generated | API |
secrets.nextauthsecret | NextAuth secret | Generated | Next.js |
secrets.litellmapikey | LiteLLM API key | Generated | API, LiteLLM |
secrets.anthropic_api_key | Anthropic API key | External provider | LiteLLM |
secrets.openai_api_key | OpenAI API key | External provider | LiteLLM |
unleash.client_token | Unleash client token | From peripheries module | API |
Generated Secrets Explanation
The module generates several secrets that are used by different applications:
Secret | Purpose | Notes on Rotation |
---|---|---|
nextauth_secret | Used by Next.js for session security | Can be rotated but invalidates existing sessions |
key_encryption_key | Used for encrypting sensitive data | Care needed when rotating; plan for data re-encryption |
litellm_api_key | Authentication for LiteLLM service | Can be rotated easily |
litellm_admin_api_key | Admin access to LiteLLM service | Can be rotated easily |
um_internal_api_key | Internal API authentication | Can be rotated easily |
Customizing Configurations
1. Custom Environment Variables
If you need to add custom environment variables to an application, you can:
Add the variables to the template files in env_templates/
Provide values in the template_variables map
2. Override Values
The override_values parameter allows you to forcefully override specific values after template substitution:
override_values = {
api = {
"LOG_LEVEL" = "debug"
"NODE_ENV" = "production"
},
next = {
"NEXT_PUBLIC_FEATURE_FLAG" = "true"
}
}
This is useful for environment-specific overrides or temporary changes.
Application Configuration JSON Format
The generated JSON configurations have a structure like:
{
"DATABASE_URL": "postgresql://username:password@hostname:5432/database",
"REDIS_URL": "redis://username:password@hostname:6379",
"AWS_ACCESS_KEY": "AKIAIOSFODNN7EXAMPLE",
"AWS_SECRET_KEY": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"AWS_BUCKET": "kindo-uploads-prod",
"AWS_REGION": "us-west-2",
"KEY_ENCRYPTION_KEY": "base64encodedkey",
"NEXTAUTH_SECRET": "nextauthsecretvalue",
"UNLEASH_API_KEY": "unleashclienttoken",
"NODE_ENV": "production"
}
Accessing Secret Values in Other Modules
To use these secrets in subsequent modules (like kindo-peripheries or kindo-applications):
locals {
secret_arns = {
for key, _ in module.kindo_secrets.application_configs_json :
key => aws_secretsmanager_secret.app_configs[key].arn
}
}
# Pass the secret ARNs to other modules
module "kindo_peripheries" {
# ...
secret_arns = local.secret_arns
# ...
}
Troubleshooting
Missing Template Variables
Issue: Configuration generation fails due to missing variables. Solution: Check the template files for all required variables and ensure they're provided in the template_variables map.
# Check which variables are used in templates
grep -r "\${" env_templates/
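To get a de-duplicated list of every placeholder the templates expect (assuming the ${VAR} placeholder style shown above), which you can then compare against your template_variables map:
grep -rhoE '\$\{[A-Za-z0-9_.]+\}' env_templates/ | sort -u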
API Key Issues
Issue: LLM-related features don't work even though configuration is correct. Solution: Verify API key validity and quotas with the provider. Test keys with a simple curl request:
# Test Anthropic API key
curl -X POST https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{"model": "claude-3-haiku-20240307", "max_tokens": 10, "messages": [{"role": "user", "content": "Hello"}]}'
# Test OpenAI API key
curl -X POST https://api.openai.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 10}'
Next Steps
Once you've created and stored your application configurations, proceed to Peripheries Deployment to deploy supporting Kubernetes components.
Peripheries Deployment (kindo-peripheries)
This guide explains how to deploy the essential supporting components for the Kindo platform using the kindo-peripheries module.
Overview
The kindo-peripheries module deploys infrastructure periphery components that support the Kindo platform on Kubernetes. These components provide platform-wide capabilities such as:
Unleash: Feature flag management (required)
External Secrets Operator: For syncing secrets from AWS Secrets Manager to Kubernetes
Ingress Controllers: Either AWS ALB Ingress Controller or Ingress NGINX
Cert Manager: For certificate management
OpenTelemetry Collector: For telemetry collection and export (optional)
Presidio: For PII detection and anonymization (optional)
Prerequisites
Completed infrastructure deployment with kindo-infra (or your own infrastructure)
Completed secrets management with kindo-secrets
Kubernetes cluster accessible via kubectl
Terraform 1.11.3 or later
Helm 3.x or later
Registry credentials for the Kindo container registry
Verify Prerequisites
Before proceeding, verify that your environment is properly configured:
# Verify AWS and Terraform setup
aws sts get-caller-identity
terraform --version
# Verify EKS cluster access
aws eks update-kubeconfig --name <cluster-name> --region <region>
kubectl get nodes
# Verify Helm installation
helm version
Registry Credentials
You'll need the Kindo registry credentials provided in the payload. These credentials are typically found in the kindo-registry.tfvars file:
registry_username = "robot$username"
registry_password = "password"
Add these credentials to your Terraform variables as shown in later steps.
Component Descriptions
Component | Purpose | Required? | Dependencies |
---|---|---|---|
Unleash | Feature flag management | Yes | PostgreSQL database |
External Secrets Operator | Syncs secrets from AWS Secrets Manager | Yes | AWS IAM role |
ALB Ingress Controller | AWS-native ingress controller | Yes for AWS | AWS IAM role |
Cert Manager | Certificate management | Yes | None |
OpenTelemetry Collector | Telemetry collection | No | AWS IAM role (optional) |
Presidio | PII detection and anonymization | No | None |
Deployment Steps
1. Create Values Directory
Create a directory to store the values files for the periphery components:
mkdir -p values
2. Copy Values Files
Copy the values files from the Kindo payload to your project:
cp /path/to/payload/peripheries-values/* values/
Verify that the files were copied correctly:
ls -la values/
You should see files like alb-ingress.yaml, external-secrets-operator.yaml, unleash.yaml, etc.
3. Copy Feature Flags JSON (Optional)
If you want to import feature flags to Unleash, copy the feature flags JSON file:
cp /path/to/payload/feature_flags.json .
4. Update Your Terraform Configuration
Update your existing main.tf file to include the kindo-peripheries module:
# Generate Unleash tokens and passwords for use in peripheries
resource "random_password" "unleash_admin_password" {
length = 16
special = true
}
resource "random_password" "unleash_admin_token" {
length = 32
special = false
}
resource "random_password" "unleash_client_token" {
length = 32
special = false
}
resource "random_password" "unleash_frontend_token" {
length = 32
special = false
}
# Kubernetes provider configuration
provider "kubernetes" {
host = module.kindo_infra.eks_cluster_endpoint
cluster_ca_certificate = base64decode(module.kindo_infra.eks_cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", module.kindo_infra.eks_cluster_name, "--region", var.region]
}
}
provider "helm" {
kubernetes {
host = module.kindo_infra.eks_cluster_endpoint
cluster_ca_certificate = base64decode(module.kindo_infra.eks_cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", module.kindo_infra.eks_cluster_name, "--region", var.region]
}
}
}
module "kindo_peripheries" {
source = "./modules/kindo-peripheries" # Use relative path to the module
# Pass providers to the module
providers = {
kubernetes = kubernetes
helm = helm
}
# Registry credentials
registry_username = var.registry_username
registry_password = var.registry_password
# AWS region and cluster name for various components
aws_region = var.region
cluster_name = module.kindo_infra.eks_cluster_name
# OpenTelemetry Collector Configuration (if enabled)
enable_otel_collector_cr = var.enable_otel_collector
otel_collector_iam_role_arn = module.kindo_infra.otel_collector_role_arn
otel_collector_config_region = var.region
# ExternalDNS Configuration (if enabled)
enable_external_dns = var.enable_external_dns
external_dns_iam_role_arn = module.kindo_infra.external_dns_role_arn
external_dns_domain_filter = var.base_domain
external_dns_txt_owner_id = module.kindo_infra.eks_cluster_name
# Configure which periphery components to deploy and how
peripheries_config = {
# --- ALB Ingress Controller (for AWS) --- #
alb_ingress = {
install = true
helm_chart_version = "1.7.1"
namespace = "kube-system"
create_namespace = false
values_content = templatefile("${path.module}/values/alb-ingress.yaml", {
cluster_name = module.kindo_infra.eks_cluster_name
region = var.region
controller_role_arn = module.kindo_infra.alb_controller_role_arn
})
}
# --- Cert Manager --- #
cert_manager = {
install = true
helm_chart_version = "v1.14.5"
namespace = "cert-manager"
create_namespace = true
dynamic_helm_sets = {
"installCRDs" = "true"
}
}
# --- External Secrets Operator --- #
external_secrets_operator = {
install = true
helm_chart_version = "0.9.9"
namespace = "external-secrets"
create_namespace = true
values_content = templatefile("${path.module}/values/external-secrets-operator.yaml", {
role_arn = module.kindo_infra.external_secrets_role_arn
})
secret_stores = {
"aws-secrets-manager" = {
provider = "aws"
config = {
service = "SecretsManager"
region = var.region
service_account_name = "external-secrets"
service_account_namespace = "external-secrets"
}
}
}
}
# --- Unleash (Feature Flags) --- #
unleash = {
install = true
helm_chart_version = "5.4.3"
namespace = "unleash"
create_namespace = true
values_content = templatefile("${path.module}/values/unleash.yaml", {
admin_password = random_password.unleash_admin_password.result
admin_token = random_password.unleash_admin_token.result
client_token = random_password.unleash_client_token.result
frontend_token = random_password.unleash_frontend_token.result
domain_name = var.base_domain
postgres_host = module.kindo_infra.postgres_endpoint
postgres_port = 5432
postgres_username = module.kindo_infra.postgres_username
postgres_password = module.kindo_infra.postgres_password
json_content = fileexists("${path.module}/feature_flags.json") ? file("${path.module}/feature_flags.json") : "{}"
import_project = "default"
import_environment = "development"
})
}
# --- Unleash Edge --- #
unleash_edge = {
install = true
helm_chart_version = "3.0.0"
namespace = "unleash"
create_namespace = false
values_content = templatefile("${path.module}/values/unleash-edge.yaml", {
unleash_tokens = jsonencode({
"default": random_password.unleash_client_token.result
})
domain_name = var.base_domain
})
}
# --- Presidio (PII Detection) --- #
presidio = {
install = var.enable_presidio
helm_chart_version = "2.1.95"
namespace = "presidio"
create_namespace = true
values_content = file("${path.module}/values/presidio.yaml")
}
}
depends_on = [
module.kindo_infra,
module.kindo_secrets
]
}
# Output important information for next steps
output "unleash_admin_url" {
value = "https://unleash.${var.base_domain}"
}
output "unleash_credentials" {
value = {
admin_username = "admin"
admin_password = random_password.unleash_admin_password.result
admin_token = random_password.unleash_admin_token.result
}
sensitive = true
}
output "unleash_tokens" {
value = {
client_token = random_password.unleash_client_token.result
frontend_token = random_password.unleash_frontend_token.result
}
sensitive = true
}
# Now update the secrets in AWS Secrets Manager with Unleash tokens
resource "aws_secretsmanager_secret_version" "api_config_updated" {
secret_id = aws_secretsmanager_secret.app_configs["api"].id
secret_string = replace(
module.kindo_secrets.application_configs_json["api"],
"\"UNLEASH_API_KEY\":\"placeholder_will_be_updated\"",
"\"UNLEASH_API_KEY\":\"${random_password.unleash_client_token.result}\""
)
depends_on = [
module.kindo_peripheries
]
}
resource "aws_secretsmanager_secret_version" "next_config_updated" {
secret_id = aws_secretsmanager_secret.app_configs["next"].id
secret_string = replace(
module.kindo_secrets.application_configs_json["next"],
"\"NEXT_PUBLIC_UNLEASH_FRONTEND_API_TOKEN\":\"placeholder_will_be_updated\"",
"\"NEXT_PUBLIC_UNLEASH_FRONTEND_API_TOKEN\":\"${random_password.unleash_frontend_token.result}\""
)
depends_on = [
module.kindo_peripheries
]
}
5. Add Required Variables
Add the following variables to your variables.tf file:
# Peripheries variables
variable "registry_username" {
description = "Username for Kindo container registry"
type = string
sensitive = true
}
variable "registry_password" {
description = "Password for Kindo container registry"
type = string
sensitive = true
}
variable "enable_otel_collector" {
description = "Whether to enable OpenTelemetry Collector"
type = bool
default = false
}
variable "enable_external_dns" {
description = "Whether to enable External DNS"
type = bool
default = true
}
variable "enable_presidio" {
description = "Whether to enable Presidio for PII detection"
type = bool
default = true
}
6. Update Your terraform.tfvars File
Add the Kindo registry credentials to your terraform.tfvars file:
# Registry credentials from kindo-registry.tfvars in the payload
registry_username = "robot$username" # Replace with actual value from payload
registry_password = "password" # Replace with actual value from payload
# Optional periphery components
enable_otel_collector = false # Set to true if you need telemetry collection
enable_external_dns = true
enable_presidio = true
7. Apply the Terraform Configuration in Stages
Because the Kubernetes and Helm providers cannot connect until the cluster exists, apply the configuration in stages:
# First, create the infrastructure so the Kubernetes and Helm providers have a cluster to talk to
terraform apply -target=module.kindo_infra
# Then apply the full configuration
terraform apply
This approach ensures that the Kubernetes providers are correctly initialized before attempting to deploy resources.
8. Understanding the Peripheries Configuration
The peripheries_config map defines which components to install and how to configure them:
install: Whether to deploy this component
helm_chart_version: The specific version of the Helm chart to use
namespace: The Kubernetes namespace to deploy into
create_namespace: Whether to create the namespace if it doesn't exist
values_content: The Helm values to use, typically from a template file
dynamic_helm_sets: Additional Helm parameters to set
9. Verify Deployment Success
After applying the Terraform configuration, verify that the components were deployed successfully:
# Check the status of each component
kubectl get pods -n unleash
kubectl get pods -n external-secrets
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-load-balancer-controller
kubectl get pods -n cert-manager
# Verify that ClusterSecretStore is created
kubectl get clustersecretstore
# Check Unleash and Unleash Edge services
kubectl get svc -n unleash
Component Details
Unleash Configuration
Unleash is a critical component that manages feature flags for Kindo. It consists of two parts:
Unleash Server: Main service with admin UI
Unleash Edge: Frontend-facing proxy for Unleash
Key configuration elements include:
Admin credentials (automatically generated)
PostgreSQL database connection
Feature flags import (if provided)
Ingress configuration
External Secrets Operator
The External Secrets Operator (ESO) syncs secrets from AWS Secrets Manager to Kubernetes. Key elements include:
IAM role configuration (using the role created by kindo-infra)
ClusterSecretStore configuration
IRSA (IAM Roles for Service Accounts) setup
ESO allows application pods to access secrets from AWS Secrets Manager as standard Kubernetes secrets.
ALB Ingress Controller
The AWS Load Balancer Controller creates Application Load Balancers based on Kubernetes Ingress resources. Key elements include:
IAM role configuration
Region and cluster name setting
Load balancer groups for internal and external traffic
Feature Flag Import
If you've included a feature_flags.json file, it will be imported into Unleash during deployment. This file contains:
Feature flag definitions
Flag strategies
Environment configurations
Default values
This ensures your Kindo deployment starts with a consistent feature flag configuration.
Connectivity Between Components
The peripheries components relate to each other as follows:
External Secrets Operator pulls secrets from AWS Secrets Manager
ALB Ingress Controller creates load balancers for internet access
Unleash provides feature flags to all applications
Cert Manager handles TLS certificates for secure communication
This architecture creates a secure foundation for deploying the Kindo applications.
Updating Unleash Tokens
After deployment, the Unleash tokens are:
Generated during Terraform execution
Stored in AWS Secrets Manager
Used to update the application configurations
This ensures that applications have the correct tokens for connecting to Unleash.
Troubleshooting
Kubernetes Provider Issues
Issue: terraform apply fails with Kubernetes provider errors. Solution: Apply in stages as described in step 7.
terraform apply -target=module.kindo_infra
Pod Startup Issues
Issue: Pods are stuck in Pending or CrashLoopBackOff state. Solution: Check events and logs for the specific pod:
kubectl describe pod -n <namespace> <pod-name>
kubectl logs -n <namespace> <pod-name>
Unleash Database Issues
Issue: Unleash fails to start due to database connection problems. Solution: Verify the PostgreSQL connection details in the Unleash configuration:
# Check Unleash pod logs
kubectl logs -n unleash deployment/unleash
# Verify database connectivity from a debug pod
kubectl run -it --rm postgres-client --image=postgres:14 --namespace=unleash -- /bin/bash
psql -h <postgres-host> -U <postgres-user> -d unleash
External Secrets Issues
Issue: Secrets are not syncing from AWS Secrets Manager. Solution: Check External Secrets Operator status and IAM role configuration:
# Check ExternalSecret status
kubectl describe externalsecret -n <namespace> <name>
# Check ClusterSecretStore
kubectl describe clustersecretstore aws-secrets-manager
# Check External Secrets Operator logs
kubectl logs -n external-secrets deployment/external-secrets-operator
Next Steps
After successfully deploying the periphery components:
Save Terraform state securely (consider using a remote backend)
Note the Unleash credentials for future reference
Proceed to Applications Deployment to deploy the core Kindo applications
Applications Deployment (kindo-applications)
This guide explains how to deploy the core Kindo applications using the kindo-applications module.
Overview
The kindo-applications module deploys the following core applications that make up the Kindo platform:
API: The main backend service
Next.js: The frontend application
LiteLLM: Proxy service for LLM providers
Llama Indexer: Document processing and indexing service
Credits: Credit management service
External Sync: External data synchronization service
External Poller: Polling service for external data sources
Audit Log Exporter: Service for exporting audit logs
Cerbos: Policy decision service
Each application is deployed as a separate Helm release within its own namespace (by default) and leverages External Secrets Operator to retrieve configuration from AWS Secrets Manager.
Prerequisites
Completed infrastructure deployment with kindo-infra
Completed secrets management with kindo-secrets
Completed peripheries deployment with kindo-peripheries
Functional EKS cluster with kubectl access
External Secrets Operator deployed and configured
Terraform 1.11.3 or later
Helm 3.x or later
Registry credentials for Kindo container registry
Verify Prerequisites
Before proceeding, verify that your environment is properly configured:
# Verify AWS and Terraform setup
aws sts get-caller-identity
terraform --version
# Verify EKS cluster access
kubectl get nodes
# Verify that periphery components are deployed
kubectl get pods -n unleash
kubectl get pods -n external-secrets
kubectl get clustersecretstore
Application Components
Application | Purpose | Required? | Dependencies |
---|---|---|---|
API | Main backend service | Yes | PostgreSQL, Redis, RabbitMQ |
Next.js | Web frontend | Yes | API service |
LiteLLM | LLM provider proxy | Yes | PostgreSQL |
Llama Indexer | Document processing | Yes | S3 storage |
Credits | Credit management | Yes | PostgreSQL |
External Sync | External data sync | Yes | RabbitMQ |
External Poller | External data polling | Yes | RabbitMQ |
Audit Log Exporter | Audit log exporting | No | S3 storage |
Cerbos | Policy decisions | Yes | None |
Deployment Steps
1. Copy Application Values Files
Copy the values files from the Kindo payload to your project:
# If using the same values directory as for peripheries
cp /path/to/payload/application-values/* values/
# Verify that the files were copied
ls -la values/
You should see files like api.yaml, next.yaml, litellm.yaml, etc., alongside the periphery values files.
2. Update Your Terraform Configuration
Update your existing main.tf file to include the kindo-applications module:
module "kindo_applications" {
source = "./modules/kindo-applications" # Use relative path to the module
# Pass providers to the module
providers = {
kubernetes = kubernetes
helm = helm
}
# Registry credentials
registry_username = var.registry_username
registry_password = var.registry_password
# Required kubernetes_config_context parameter
# For initial deployment, use a placeholder value that will be updated later
kubernetes_config_context = "placeholder-context" # IMPORTANT: Required even during planning phase
# Configure which applications to deploy and how
applications_config = {
# --- API Service --- #
api = {
install = true
helm_chart_version = "1.0.0"
namespace = "api"
create_namespace = true
values_content = templatefile("${path.module}/values/api.yaml", {
domain_name = var.base_domain
environment_name = var.environment
project_name = var.project
})
}
# ... other applications configuration ...
}
# Helm deployment settings
helm_wait = true
helm_atomic = true
helm_timeout = 600
depends_on = [
module.kindo_peripheries
]
}
Important Note on Kubernetes Config Context
The kubernetes_config_context parameter is required by the kindo-applications module for validation during the Terraform plan phase, even if the EKS cluster doesn't exist yet. During initial deployment:
Provide a placeholder value like "placeholder-context" for the initial terraform plan and terraform apply
After the EKS cluster is created, this value should be updated to the actual context name, which is typically in the format arn:aws:eks:<region>:<account-id>:cluster/<cluster-name>
You can obtain the actual context name after cluster creation with:
aws eks update-kubeconfig --name <cluster-name> --region <region>
kubectl config current-context
The applications module relies on this parameter to connect to the Kubernetes cluster, so make sure it is updated once the cluster exists.
3. Add Required Variables
Add the following variable to your variables.tf file if not already present:
# Applications variables
variable "enable_audit_log_exporter" {
description = "Whether to enable the Audit Log Exporter service"
type = bool
default = true
}
4. Update Your terraform.tfvars File
Add the application-specific variable to your terraform.tfvars file:
# Optional applications
enable_audit_log_exporter = true # Set to false if you don't need audit log exporting
5. Apply the Terraform Configuration
Apply the Terraform configuration to deploy the applications:
terraform apply
The deployment may take 10-15 minutes to complete. Each application will be deployed as a separate Helm release.
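While the apply is running, you can follow progress from a second terminal (a sketch; adjust namespaces to match your configuration):
# Watch pods come up as each Helm release is installed
kubectl get pods --all-namespaces --watch
# Review recent events if a release seems stuck
kubectl get events --all-namespaces --sort-by='.lastTimestamp' | tail -n 20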
6. Update the Kubernetes Config Context After Deployment
After the EKS cluster is successfully created and you've configured kubectl to access it, update the kubernetes_config_context parameter in your configuration:
# Get the current context
KUBE_CONTEXT=$(kubectl config current-context)
# Update the terraform.tfvars file or use environment variables
# Example: export TF_VAR_kubernetes_config_context="$KUBE_CONTEXT"
Then reapply the Terraform configuration:
terraform apply
This ensures that the applications module has the correct context for interacting with the Kubernetes cluster.
Understanding the Applications Configuration
The applications_config map defines which applications to install and how to configure them:
install: Whether to deploy this application
helm_chart_version: The specific version of the Helm chart to use
namespace: The Kubernetes namespace to deploy into
create_namespace: Whether to create the namespace if it doesn't exist
values_content: The Helm values to use, typically from a template file
Each application's values file specifies:
Image repository and tag
Resource requests and limits
Ingress configuration
External Secret reference
Service configuration
Node selector for proper placement
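To review the values that were actually applied to a release, you can query Helm directly (a sketch; this assumes the API release is named api and lives in the api namespace, as in the example above):
# Show the user-supplied values for the api release
helm get values api -n api
# Include chart defaults as well
helm get values api -n api --all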
Component Integration
Secret Management Integration
Each application uses the External Secrets Operator to retrieve its configuration from AWS Secrets Manager:
# Example from values/api.yaml
externalSecrets:
  enabled: true
  secretStoreName: "aws-secrets-manager"
  secretStoreKind: "ClusterSecretStore"
  refreshInterval: "30s"
  secretKey: "${project_name}-${environment_name}/api-app-config"
  name: "api-env"
  targetName: "api-env"
The External Secret pulls the configuration from AWS Secrets Manager and creates a Kubernetes Secret that the application pod mounts as environment variables.
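For example, you can confirm that the sync produced the expected Secret (names assume the api example above):
# The ExternalSecret should report a Ready condition
kubectl get externalsecret api-env -n api
# The target Secret should exist and list the expected keys (describe does not print the values)
kubectl describe secret api-env -n api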
Ingress Configuration
Applications are exposed externally through ALB ingresses managed by the AWS Load Balancer Controller:
# Example from values/api.yaml
ingress:
  defaults:
    tls: true
    tlsSecretName: ""
    annotations:
      kubernetes.io/ingress.class: "alb"
      alb.ingress.kubernetes.io/target-type: "ip"
      alb.ingress.kubernetes.io/healthcheck-path: /healthcheck
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS-1-2-2017-01
Different applications use different load balancer groups:
Frontend (Next.js): external-alb group for internet access
Backend (API, LiteLLM): shared-alb group for internal access
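To confirm which group each ingress joins, you can read the annotation from the deployed ingresses (a sketch; this assumes the charts use the AWS Load Balancer Controller's standard alb.ingress.kubernetes.io/group.name annotation):
# Show the load balancer group assigned to every ingress
kubectl get ingress --all-namespaces -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,GROUP:.metadata.annotations.alb\.ingress\.kubernetes\.io/group\.name'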
Networking Architecture
The applications follow this networking architecture:
External Traffic: Internet → ALB → Next.js
Internal Traffic: Next.js → ALB → API
Backend Services: API → Services (LiteLLM, Credits, etc.)
This design ensures that only necessary components are exposed to the internet.
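A simple way to validate this path end to end is to call the API from a frontend pod (a sketch; the deployment name, namespace, and the presence of curl in the image are assumptions):
# From a frontend pod, confirm the API is reachable through the internal ALB
kubectl exec -n next deployment/next -- curl -sI https://api.<your-domain>/healthcheck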
Application-Specific Notes
API Service
The API service is the central backend service for Kindo. It:
Handles all user requests from the frontend
Interacts with the database, cache, and message broker
Manages business logic and data access
Connects to other services like LiteLLM and Credits
Next.js Frontend
The Next.js frontend is the user interface for Kindo. It:
Provides the web UI for users
Communicates with the API service
Handles client-side rendering
Manages user authentication
LiteLLM Proxy
The LiteLLM service is a proxy for language model providers. It:
Routes requests to Anthropic, OpenAI, or other LLM providers
Handles API key management and rate limiting
Provides a unified interface for all LLM interactions
Maintains usage statistics
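To confirm the proxy is serving traffic before wiring up providers, you can probe it locally (a sketch; the service name, port, and health path are assumptions, so check values/litellm.yaml for the actual values):
# Forward the LiteLLM service to your workstation
kubectl port-forward -n litellm svc/litellm 4000:4000
# In another terminal, hit the liveness endpoint
curl -s http://localhost:4000/health/liveliness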
Verification Steps
1. Check Deployment Status
Verify that all applications are deployed successfully:
# Check overall deployment status
kubectl get pods --all-namespaces | grep -E 'api|next|litellm|llama|credits|external|cerbos'
# Check specific applications
kubectl get all -n api
kubectl get all -n next
kubectl get all -n litellm
2. Verify External Secrets
Check that External Secrets are correctly syncing with AWS Secrets Manager:
# List all ExternalSecret resources
kubectl get externalsecret --all-namespaces
# Check the status of a specific ExternalSecret
kubectl describe externalsecret -n api api-env
The status should show that the secret was synchronized successfully.
3. Check Ingress Resources
Verify that ingress resources are created correctly:
# List all ingress resources
kubectl get ingress --all-namespaces
# Describe a specific ingress
kubectl describe ingress -n api api
4. Verify Load Balancers
Check that the AWS Load Balancer Controller has created load balancers:
# List AWS load balancers
aws elbv2 describe-load-balancers --region <your-region> --query 'LoadBalancers[*].{Name:LoadBalancerName,DNSName:DNSName,State:State.Code}'
5. Test Application Access
After the DNS records have propagated, test accessing the applications:
# Test API health check
curl -I https://api.<your-domain>/healthcheck
# Access the frontend
open https://app.<your-domain>
Customization Options
1. Resource Requirements
You can adjust the CPU and memory requirements for each application by modifying its values file:
resources:
  requests:
    cpu: "1000m"
    memory: "2000Mi"
  limits:
    cpu: "2000m"
    memory: "4000Mi"
Higher resource requests generally improve performance and scheduling stability, but they consume more node capacity.
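To judge whether the requests are sized sensibly, compare them against actual usage and remaining node capacity (a sketch; kubectl top requires metrics-server):
# Current CPU and memory usage per pod
kubectl top pods -n api
# How much of each node's capacity is already requested
kubectl describe nodes | grep -A 8 'Allocated resources'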
2. Application Scaling
You can configure autoscaling for applications:
replicaCount: 3
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
This allows applications to handle varying load levels automatically.
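After enabling autoscaling, you can verify that the HorizontalPodAutoscaler was created and is tracking CPU (a sketch; the namespace assumes the api example):
# List HPAs and their current/target utilization
kubectl get hpa -n api
# Inspect scaling events and conditions
kubectl describe hpa -n api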
3. Node Placement
You can control which node group your applications run on:
nodeSelector:
  WorkloadType: compute-optimized
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - api
          topologyKey: kubernetes.io/hostname
The node selector keeps the workload on the intended node group, and the anti-affinity rule spreads replicas across nodes to improve availability.
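You can verify the placement after deployment (a sketch; the label key/value assume the example above):
# Confirm nodes with the expected label exist
kubectl get nodes -l WorkloadType=compute-optimized
# Check which nodes the api pods actually landed on
kubectl get pods -n api -o wide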
Troubleshooting
1. Pod Startup Issues
If pods are failing to start:
# Check pod status
kubectl get pods -n <namespace>
# Get detailed information about a failing pod
kubectl describe pod -n <namespace> <pod-name>
# Check pod logs
kubectl logs -n <namespace> <pod-name>
Common issues include:
Resource constraints
Image pull errors
Configuration errors
Volume mount issues
2. Secret Access Issues
If applications can't access their secrets:
# Check ExternalSecret status
kubectl describe externalsecret -n <namespace> <name>
# Verify that the Kubernetes secret exists
kubectl get secret -n <namespace> <name>
# Check the application's environment variables
kubectl exec -it -n <namespace> <pod-name> -- env | grep -i 'database\|redis\|api'
3. Ingress or Network Issues
If you can't access services through the ingress:
# Check the ingress configuration
kubectl describe ingress -n <namespace> <name>
# Check the AWS Load Balancer Controller logs
kubectl logs -n kube-system deployment/aws-load-balancer-controller
# Check the load balancer's target groups
aws elbv2 describe-target-groups --region <region>
aws elbv2 describe-target-health --target-group-arn <target-group-arn> --region <region>
4. Application-Specific Issues
For issues specific to certain applications:
# Check the API logs for database connection issues
kubectl logs -n api deployment/api | grep -i "database\|connection\|error"
# Check the LiteLLM logs for LLM provider issues
kubectl logs -n litellm deployment/litellm | grep -i "anthropic\|openai\|error"
5. Kubernetes Context Issues
If you encounter issues related to the Kubernetes context:
# Verify the current kubectl context
kubectl config current-context
# List available contexts
kubectl config get-contexts
# Update the Terraform configuration with the correct context
# Then run: terraform apply
Next Steps
After successfully deploying all applications:
Save Terraform state securely (consider using a remote backend)
Configure DNS records to point to the ALB endpoints (see the example after this list)
Set up a CI/CD pipeline for ongoing management
Implement monitoring and alerting
Develop a backup and recovery strategy
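If kindo-infra does not already manage the records for you, the ALB hostnames can be read from the ingress status and used as the targets for your DNS entries (a sketch; namespaces assume the defaults used in this guide):
# ALB hostname serving the frontend
kubectl get ingress -n next -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'
# ALB hostname serving the API
kubectl get ingress -n api -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'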