Applications Deployment Guide
This guide provides detailed instructions for deploying Kindo application services using the kindo-applications module as a separate Terraform stack.
Table of Contents
- Overview
- Pre-Deployment Requirements
- Directory Setup
- Configuration Setup
- Application Configuration
- Deployment Process
- Post-Deployment Verification
- Troubleshooting
- Best Practices
- Next Steps
Overview
The kindo-applications module deploys the core Kindo services as a separate Terraform stack:
API Service: Backend REST API (Node.js)
Next.js Frontend: Web application UI
LiteLLM: AI model proxy and router
Llama Indexer: Document indexing service
External Poller: Background job processor
External Sync: Data synchronization service
Credits Service: Usage tracking and billing
Audit Log Exporter: Compliance and logging
Cerbos: Authorization policy engine
Each application:
- Runs in its own Kubernetes namespace
- Uses External Secrets for configuration (see the sketch below)
- Includes health checks and monitoring
- Supports horizontal scaling
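For reference, the External Secrets Operator turns an ExternalSecret in each namespace into a plain Kubernetes Secret that the pods consume. A minimal sketch of the shape involved, assuming the aws-secrets-manager ClusterSecretStore from the base stack and the <project>-<environment>/<app>-app-config naming used later in this guide:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-env
  namespace: api
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager
  target:
    name: api-env                      # the Secret the chart references as secretName
  dataFrom:
    - extract:
        key: kindo-prod/api-app-config # AWS Secrets Manager secret name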
Pre-Deployment Requirements
Required Infrastructure
Before deploying applications, ensure you have:
✅ Base stack deployed (infrastructure + secrets + peripheries)
EKS cluster must be running
External Secrets Operator must be configured
ALB Ingress Controller must be deployed
Secrets must exist in AWS Secrets Manager
✅ DNS properly configured
Base domain delegation completed
Wildcard certificate created
✅ Access to infrastructure outputs
Note the cluster name, region, and other outputs from the base stack
Verify Prerequisites
# Update kubeconfig
aws eks update-kubeconfig --name <cluster-name> --region <region> --profile <profile>
# Check External Secrets Operator
kubectl get clustersecretstore
# Should show: aws-secrets-manager Ready
# Check secrets in AWS
aws secretsmanager list-secrets --query "SecretList[?contains(Name, 'kindo-prod')].[Name]" --output table --profile <profile>
# Check ingress controller
kubectl get ingressclass
# Should show: alb
# Test Unleash connectivity
curl -s https://unleash.yourdomain.com/api/client/features | jq .
Directory Setup
Create a separate directory for the applications deployment:
# Create applications deployment directory
mkdir -p my-kindo-deployment/kindo-applications
cd my-kindo-deployment/kindo-applications
# Copy example values files (adjust path as needed)
cp -r ../../application-values ./values
# Directory structure should look like:
# kindo-applications/
# ├── main.tf
# ├── provider.tf
# ├── variables.tf
# ├── outputs.tf
# ├── terraform.tfvars
# ├── registry_secrets.tf
# └── values/
# ├── api.yaml
# ├── next.yaml
# ├── litellm.yaml
# └── ...
Configuration Setup
1. Create main.tf
terraform {
required_version = ">= 1.11.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 5.0"
}
helm = {
source = "hashicorp/helm"
version = "2.17.0"
}
kubectl = {
source = "gavinbunney/kubectl"
version = ">= 1.14.0"
}
time = {
source = "hashicorp/time"
version = ">= 0.9.0"
}
}
# Configure your state backend - Optional
# For production, uncomment and configure the S3 backend:
# backend "s3" {
# bucket = "my-terraform-state-bucket"
# key = "kindo/applications/terraform.tfstate"
# region = "us-east-1"
#
# dynamodb_table = "terraform-state-lock"
# encrypt = true
# }
}
locals {
# Core application settings
project = var.project_name
environment = var.environment_name
domain_name = var.domain_name
# Secret naming pattern (for External Secrets Operator)
secret_pattern = "%s-%s/%s-app-config"
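  # Example: format(secret_pattern, "kindo", "prod", "api") => "kindo-prod/api-app-config"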
}
# Deploy Kindo Applications
module "kindo_applications" {
source = "../../modules/kindo-applications" # Adjust path as needed
# Don't wait for resources to be healthy
helm_wait = false
helm_atomic = false
# Helm registry credentials (from shared.tfvars)
registry_url = var.registry_url
registry_username = var.registry_username
registry_password = var.registry_password
# Application configurations
applications_config = {
# --- API Service --- #
api = {
install = var.enable_api
helm_chart_version = var.api_chart_version
namespace = "api" # Use dedicated namespace for each application
create_namespace = true # Create namespace automatically
values_content = var.api_values_content != "" ? var.api_values_content : templatefile("${path.module}/values/api.yaml", {
domain_name = local.domain_name
replica_count = var.api_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.api_replica_count)
}, var.api_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "api")
}, var.api_sensitive_helm_sets)
},
# --- Next.js Frontend --- #
next = {
install = var.enable_next
helm_chart_version = var.next_chart_version
namespace = "next" # Use dedicated namespace
create_namespace = true # Create namespace automatically
values_content = var.next_values_content != "" ? var.next_values_content : templatefile("${path.module}/values/next.yaml", {
domain_name = local.domain_name
replica_count = var.next_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.next_replica_count)
}, var.next_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "next")
}, var.next_sensitive_helm_sets)
},
# --- LiteLLM --- #
litellm = {
install = var.enable_litellm
helm_chart_version = var.litellm_chart_version
namespace = "litellm"
create_namespace = true # Create namespace automatically
values_content = var.litellm_values_content != "" ? var.litellm_values_content : templatefile("${path.module}/values/litellm.yaml", {
domain_name = local.domain_name
replica_count = var.litellm_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.litellm_replica_count)
}, var.litellm_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "litellm")
}, var.litellm_sensitive_helm_sets)
},
# --- Llama Indexer --- #
llama_indexer = {
install = var.enable_llama_indexer
helm_chart_version = var.llama_indexer_chart_version
namespace = "llama-indexer"
create_namespace = true # Create namespace automatically
values_content = var.llama_indexer_values_content != "" ? var.llama_indexer_values_content : templatefile("${path.module}/values/llama-indexer.yaml", {
domain_name = local.domain_name
replica_count = var.llama_indexer_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.llama_indexer_replica_count)
}, var.llama_indexer_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "llama-indexer")
}, var.llama_indexer_sensitive_helm_sets)
},
# --- Credits --- #
credits = {
install = var.enable_credits
helm_chart_version = var.credits_chart_version
namespace = "credits"
create_namespace = true # Create namespace automatically
values_content = var.credits_values_content != "" ? var.credits_values_content : templatefile("${path.module}/values/credits.yaml", {
domain_name = local.domain_name
replica_count = var.credits_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.credits_replica_count)
}, var.credits_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "credits")
}, var.credits_sensitive_helm_sets)
},
# --- External Sync --- #
external_sync = {
install = var.enable_external_sync
helm_chart_version = var.external_sync_chart_version
namespace = "external-sync"
create_namespace = true # Create namespace automatically
values_content = var.external_sync_values_content != "" ? var.external_sync_values_content : templatefile("${path.module}/values/external-sync.yaml", {
domain_name = local.domain_name
replica_count = var.external_sync_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.external_sync_replica_count)
}, var.external_sync_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "external-sync")
}, var.external_sync_sensitive_helm_sets)
},
# --- External Poller --- #
external_poller = {
install = var.enable_external_poller
helm_chart_version = var.external_poller_chart_version
namespace = "external-poller"
create_namespace = true # Create namespace automatically
values_content = var.external_poller_values_content != "" ? var.external_poller_values_content : templatefile("${path.module}/values/external-poller.yaml", {
domain_name = local.domain_name
replica_count = var.external_poller_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.external_poller_replica_count)
}, var.external_poller_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "external-poller")
}, var.external_poller_sensitive_helm_sets)
},
# --- Audit Log Exporter --- #
audit_log_exporter = {
install = var.enable_audit_log_exporter
helm_chart_version = var.audit_log_exporter_chart_version
namespace = "audit-log-exporter"
create_namespace = true # Create namespace automatically
values_content = var.audit_log_exporter_values_content != "" ? var.audit_log_exporter_values_content : templatefile("${path.module}/values/audit-log-exporter.yaml", {
domain_name = local.domain_name
replica_count = var.audit_log_exporter_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.audit_log_exporter_replica_count)
}, var.audit_log_exporter_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "audit-log-exporter")
}, var.audit_log_exporter_sensitive_helm_sets)
},
# --- Cerbos --- #
cerbos = {
install = var.enable_cerbos
helm_chart_version = var.cerbos_chart_version
namespace = "cerbos"
create_namespace = true # Create namespace automatically
values_content = var.cerbos_values_content != "" ? var.cerbos_values_content : templatefile("${path.module}/values/cerbos.yaml", {
domain_name = local.domain_name
replica_count = var.cerbos_replica_count
environment_name = local.environment
project_name = local.project
})
dynamic_helm_sets = merge({
"replicaCount" = tostring(var.cerbos_replica_count)
}, var.cerbos_helm_sets)
sensitive_helm_sets = merge({
"secretRef.name" = format(local.secret_pattern, local.project, local.environment, "cerbos")
}, var.cerbos_sensitive_helm_sets)
}
}
}
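All nine services follow the same pattern, so each can be tuned or disabled through the variables defined in variables.tf without touching main.tf. A hypothetical terraform.tfvars override:
# Disable one service and pass an extra chart value (the key is hypothetical; check your chart)
enable_credits = false
api_helm_sets = {
  "image.tag" = "1.2.3"
}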
2. Create provider.tf
# provider.tf - Provider configuration for applications deployment
provider "aws" {
region = var.region
profile = var.aws_profile
}
# Configure Helm provider with EKS authentication
provider "helm" {
kubernetes {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = [
"eks",
"get-token",
"--cluster-name",
var.cluster_name,
"--region",
var.region
]
env = {
AWS_PROFILE = var.aws_profile
}
}
}
}
# Configure kubectl provider with EKS authentication
provider "kubectl" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
load_config_file = false
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = [
"eks",
"get-token",
"--cluster-name",
var.cluster_name,
"--region",
var.region
]
env = {
AWS_PROFILE = var.aws_profile
}
}
}
# Data sources to get EKS cluster details
data "aws_eks_cluster" "cluster" {
name = var.cluster_name
}
data "aws_eks_cluster_auth" "cluster" {
name = var.cluster_name
}
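The aws_eks_cluster_auth data source above is not used by the exec-based configuration, but it enables a token-based alternative when the aws CLI is not available where Terraform runs. A sketch (note these tokens expire after roughly 15 minutes, so long applies should prefer exec auth):
# provider "helm" {
#   kubernetes {
#     host                   = data.aws_eks_cluster.cluster.endpoint
#     cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
#     token                  = data.aws_eks_cluster_auth.cluster.token
#   }
# }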
3. Create registry_secrets.tf
# Registry credentials secrets for each application namespace
# These secrets allow pulling images from the Kindo registry
locals {
# Get application namespaces that need to be created
application_namespaces = {
"api" = var.enable_api ? "api" : null
"next" = var.enable_next ? "next" : null
"litellm" = var.enable_litellm ? "litellm" : null
"llama-indexer" = var.enable_llama_indexer ? "llama-indexer" : null
"credits" = var.enable_credits ? "credits" : null
"external-sync" = var.enable_external_sync ? "external-sync" : null
"external-poller" = var.enable_external_poller ? "external-poller" : null
"audit-log-exporter" = var.enable_audit_log_exporter ? "audit-log-exporter" : null
"cerbos" = var.enable_cerbos ? "cerbos" : null
}
# Filter out null values
enabled_namespaces = {
for name, ns in local.application_namespaces : name => ns
if ns != null
}
# Create the docker config JSON once to reuse
# Use a fixed registry domain for Docker authentication
dockerconfig_json = jsonencode({
auths = {
"registry.kindo.ai" = {
username = var.registry_username
password = var.registry_password
auth = base64encode("${var.registry_username}:${var.registry_password}")
}
}
})
}
# Add a small delay to ensure namespaces are fully propagated in Kubernetes
resource "time_sleep" "wait_for_namespaces" {
depends_on = [module.kindo_applications]
create_duration = "5s"
}
# Create registry credential secrets in each namespace
resource "kubectl_manifest" "registry_credentials" {
for_each = local.enabled_namespaces
yaml_body = <<YAML
apiVersion: v1
kind: Secret
metadata:
name: registry-credentials
namespace: ${each.value}
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: ${base64encode(local.dockerconfig_json)}
YAML
# Delete and recreate the secret when its content changes, rather than patching in place
force_new = true
server_side_apply = true
# Make sure the namespaces are ready before creating secrets
depends_on = [
time_sleep.wait_for_namespaces
]
}
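After apply, confirm the secret landed in each enabled namespace and decodes to the expected registry entry:
# Decode the docker config and list the registries it authenticates against
kubectl get secret registry-credentials -n api -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq '.auths | keys'
# Expected: ["registry.kindo.ai"]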
4. Create variables.tf
# variables.tf - Variable definitions for applications deployment
# --- Core Variables --- #
variable "project_name" {
description = "Project name used for resource naming and tagging (set via shared.tfvars)"
type = string
}
variable "environment_name" {
description = "Environment name (e.g., dev, staging, prod) (set via shared.tfvars)"
type = string
}
variable "domain_name" {
description = "Base domain name for application endpoints"
type = string
}
# --- AWS Configuration --- #
variable "region" {
description = "AWS region"
type = string
}
variable "aws_profile" {
description = "AWS profile for authentication"
type = string
}
variable "cluster_name" {
description = "EKS cluster name"
type = string
}
# --- Registry Configuration --- #
variable "registry_url" {
description = "OCI registry URL for Helm charts (set via shared.tfvars)"
type = string
default = "oci://registry.kindo.ai/kindo-helm"
}
variable "registry_username" {
description = "Username for the Helm OCI registry (SENSITIVE - set via shared.tfvars)"
type = string
sensitive = true
}
variable "registry_password" {
description = "Password for the Helm OCI registry (SENSITIVE - set via shared.tfvars)"
type = string
sensitive = true
}
# --- API Service --- #
variable "enable_api" {
description = "Whether to install the API service"
type = bool
default = true
}
variable "api_chart_version" {
description = "Version of the API Helm chart"
type = string
}
variable "api_replica_count" {
description = "Number of API service replicas"
type = number
default = 2
}
variable "api_values_content" {
description = "Custom values content for API service"
type = string
default = ""
}
variable "api_helm_sets" {
description = "Additional Helm values to set for API service"
type = map(string)
default = {}
}
variable "api_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for API service"
type = map(string)
default = {}
}
# --- Next.js Frontend --- #
variable "enable_next" {
description = "Whether to install the Next.js frontend"
type = bool
default = true
}
variable "next_chart_version" {
description = "Version of the Next.js Helm chart"
type = string
}
variable "next_replica_count" {
description = "Number of Next.js frontend replicas"
type = number
default = 2
}
variable "next_values_content" {
description = "Custom values content for Next.js frontend"
type = string
default = ""
}
variable "next_helm_sets" {
description = "Additional Helm values to set for Next.js frontend"
type = map(string)
default = {}
}
variable "next_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for Next.js frontend"
type = map(string)
default = {}
}
# --- LiteLLM --- #
variable "enable_litellm" {
description = "Whether to install the LiteLLM service"
type = bool
default = true
}
variable "litellm_chart_version" {
description = "Version of the LiteLLM Helm chart"
type = string
}
variable "litellm_replica_count" {
description = "Number of LiteLLM service replicas"
type = number
default = 2
}
variable "litellm_values_content" {
description = "Custom values content for LiteLLM service"
type = string
default = ""
}
variable "litellm_helm_sets" {
description = "Additional Helm values to set for LiteLLM service"
type = map(string)
default = {}
}
variable "litellm_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for LiteLLM service"
type = map(string)
default = {}
}
# --- Llama Indexer --- #
variable "enable_llama_indexer" {
description = "Whether to install the Llama Indexer service"
type = bool
default = true
}
variable "llama_indexer_chart_version" {
description = "Version of the Llama Indexer Helm chart"
type = string
}
variable "llama_indexer_replica_count" {
description = "Number of Llama Indexer service replicas"
type = number
default = 1
}
variable "llama_indexer_values_content" {
description = "Custom values content for Llama Indexer service"
type = string
default = ""
}
variable "llama_indexer_helm_sets" {
description = "Additional Helm values to set for Llama Indexer service"
type = map(string)
default = {}
}
variable "llama_indexer_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for Llama Indexer service"
type = map(string)
default = {}
}
# --- Credits --- #
variable "enable_credits" {
description = "Whether to install the Credits service"
type = bool
default = true
}
variable "credits_chart_version" {
description = "Version of the Credits Helm chart"
type = string
}
variable "credits_replica_count" {
description = "Number of Credits service replicas"
type = number
default = 1
}
variable "credits_values_content" {
description = "Custom values content for Credits service"
type = string
default = ""
}
variable "credits_helm_sets" {
description = "Additional Helm values to set for Credits service"
type = map(string)
default = {}
}
variable "credits_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for Credits service"
type = map(string)
default = {}
}
# --- External Sync --- #
variable "enable_external_sync" {
description = "Whether to install the External Sync service"
type = bool
default = true
}
variable "external_sync_chart_version" {
description = "Version of the External Sync Helm chart"
type = string
}
variable "external_sync_replica_count" {
description = "Number of External Sync service replicas"
type = number
default = 1
}
variable "external_sync_values_content" {
description = "Custom values content for External Sync service"
type = string
default = ""
}
variable "external_sync_helm_sets" {
description = "Additional Helm values to set for External Sync service"
type = map(string)
default = {}
}
variable "external_sync_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for External Sync service"
type = map(string)
default = {}
}
# --- External Poller --- #
variable "enable_external_poller" {
description = "Whether to install the External Poller service"
type = bool
default = true
}
variable "external_poller_chart_version" {
description = "Version of the External Poller Helm chart"
type = string
}
variable "external_poller_replica_count" {
description = "Number of External Poller service replicas"
type = number
default = 1
}
variable "external_poller_values_content" {
description = "Custom values content for External Poller service"
type = string
default = ""
}
variable "external_poller_helm_sets" {
description = "Additional Helm values to set for External Poller service"
type = map(string)
default = {}
}
variable "external_poller_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for External Poller service"
type = map(string)
default = {}
}
# --- Audit Log Exporter --- #
variable "enable_audit_log_exporter" {
description = "Whether to install the Audit Log Exporter service"
type = bool
default = true
}
variable "audit_log_exporter_chart_version" {
description = "Version of the Audit Log Exporter Helm chart"
type = string
}
variable "audit_log_exporter_replica_count" {
description = "Number of Audit Log Exporter service replicas"
type = number
default = 1
}
variable "audit_log_exporter_values_content" {
description = "Custom values content for Audit Log Exporter service"
type = string
default = ""
}
variable "audit_log_exporter_helm_sets" {
description = "Additional Helm values to set for Audit Log Exporter service"
type = map(string)
default = {}
}
variable "audit_log_exporter_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for Audit Log Exporter service"
type = map(string)
default = {}
}
# --- Cerbos --- #
variable "enable_cerbos" {
description = "Whether to install the Cerbos service"
type = bool
default = true
}
variable "cerbos_chart_version" {
description = "Version of the Cerbos Helm chart"
type = string
}
variable "cerbos_replica_count" {
description = "Number of Cerbos service replicas"
type = number
default = 2
}
variable "cerbos_values_content" {
description = "Custom values content for Cerbos service"
type = string
default = ""
}
variable "cerbos_helm_sets" {
description = "Additional Helm values to set for Cerbos service"
type = map(string)
default = {}
}
variable "cerbos_sensitive_helm_sets" {
description = "Additional sensitive Helm values to set for Cerbos service"
type = map(string)
default = {}
}
5. Create terraform.tfvars
# terraform.tfvars - Application-specific configuration
# Application replica counts (override defaults as needed)
api_replica_count = 1
next_replica_count = 2
litellm_replica_count = 1
# Chart Versions
api_chart_version = "0.0.15"
next_chart_version = "0.0.15"
litellm_chart_version = "0.0.15"
llama_indexer_chart_version = "0.0.15"
credits_chart_version = "0.0.15"
external_sync_chart_version = "0.0.15"
external_poller_chart_version = "0.0.15"
audit_log_exporter_chart_version = "0.0.15"
cerbos_chart_version = "0.0.15"
# Application-specific settings can be configured in values/ files
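The plan and apply commands in this guide also pass ../../shared.tfvars, which supplies the values shared with the base stack. A minimal sketch with illustrative values (match these to your base stack outputs):
# ../../shared.tfvars (illustrative)
project_name      = "kindo"
environment_name  = "prod"
domain_name       = "example.com"
region            = "us-east-1"
aws_profile       = "default"
cluster_name      = "kindo-prod"
registry_username = "<registry-user>"
registry_password = "<registry-password>"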
6. Create outputs.tf
# outputs.tf - Output values from applications deployment
output "app_endpoint" {
description = "Frontend application endpoint"
value = "https://app.${var.domain_name}"
}
output "api_endpoint" {
description = "API service endpoint"
value = "https://api.${var.domain_name}"
}
output "deployment_summary" {
description = "Summary of deployed applications"
value = {
api = var.enable_api ? "Deployed" : "Skipped"
next = var.enable_next ? "Deployed" : "Skipped"
litellm = var.enable_litellm ? "Deployed" : "Skipped"
llama_indexer = var.enable_llama_indexer ? "Deployed" : "Skipped"
credits = var.enable_credits ? "Deployed" : "Skipped"
external_sync = var.enable_external_sync ? "Deployed" : "Skipped"
external_poller = var.enable_external_poller ? "Deployed" : "Skipped"
audit_log_exporter = var.enable_audit_log_exporter ? "Deployed" : "Skipped"
cerbos = var.enable_cerbos ? "Deployed" : "Skipped"
}
}
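Once applied, the outputs give a quick rollout summary:
terraform output deployment_summary
terraform output app_endpoint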
Application Configuration
Understanding Values Files
Each application has a values file in the values/ directory that configures:
Resource Requirements
Scaling Parameters
Environment-Specific Settings
Health Checks
Ingress Rules
API Service Configuration (values/api.yaml)
replicaCount: ${replica_count}
image:
repository: registry.kindo.ai/kindo-docker/api
pullPolicy: IfNotPresent
pullSecret: registry-credentials
secretName: api-env
service:
applicationPort: 8000
port: 80
type: ClusterIP
ingress:
defaults:
tls: true
tlsSecretName: ""
annotations:
kubernetes.io/ingress.class: "alb"
alb.ingress.kubernetes.io/target-type: "ip"
alb.ingress.kubernetes.io/healthcheck-path: /healthcheck
alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS-1-2-2017-01
main:
hosts:
- api.${domain_name}
paths: ["/"]
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "2000m"
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
livenessProbe:
httpGet:
path: /healthcheck
port: 8000
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /healthcheck
port: 8000
initialDelaySeconds: 5
periodSeconds: 5
# Node affinity for specific workloads
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
preference:
matchExpressions:
- key: node.kubernetes.io/instance-type
operator: In
values:
- m5.large
- m5.xlarge
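Since main.tf renders this file through templatefile(), you can sanity-check the substitution after terraform init, before applying (variable values here are illustrative):
echo 'templatefile("values/api.yaml", { domain_name = "example.com", replica_count = 2, environment_name = "prod", project_name = "kindo" })' | terraform console -var-file="../../shared.tfvars" -var-file="terraform.tfvars"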
Next.js Frontend Configuration (values/next.yaml)
replicaCount: ${replica_count}
image:
repository: registry.kindo.ai/kindo-docker/next
pullPolicy: IfNotPresent
pullSecret: registry-credentials
secretName: next-env
service:
applicationPort: 3000
port: 80
type: ClusterIP
ingress:
defaults:
tls: true
tlsSecretName: ""
annotations:
kubernetes.io/ingress.class: "alb"
alb.ingress.kubernetes.io/target-type: "ip"
alb.ingress.kubernetes.io/healthcheck-path: /
alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS-1-2-2017-01
# Frontend gets priority routing
alb.ingress.kubernetes.io/group.name: kindo-apps
alb.ingress.kubernetes.io/group.order: "100"
main:
hosts:
- app.${domain_name}
paths: ["/"]
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "1000m"
# Environment variables (non-sensitive)
env:
- name: NEXT_PUBLIC_API_URL
value: "https://api.${domain_name}"
- name: NODE_ENV
value: "production"
LiteLLM Configuration (values/litellm.yaml)
replicaCount: ${replica_count}
image:
repository: registry.kindo.ai/kindo-docker/litellm
pullPolicy: IfNotPresent
pullSecret: registry-credentials
secretName: litellm-env
service:
applicationPort: 4000
port: 80
type: ClusterIP
ingress:
defaults:
tls: true
tlsSecretName: ""
annotations:
kubernetes.io/ingress.class: "alb"
alb.ingress.kubernetes.io/target-type: "ip"
alb.ingress.kubernetes.io/healthcheck-path: /health
alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS-1-2-2017-01
main:
hosts:
- litellm.${domain_name}
paths: ["/"]
resources:
requests:
memory: "1Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "4000m"
# LiteLLM specific settings
persistence:
enabled: true
size: 10Gi
storageClass: gp3
# Model configuration will come from External Secrets
Worker Services Configuration
For background workers (external-poller, external-sync):
replicaCount: ${replica_count}
image:
repository: registry.kindo.ai/kindo-docker/external-poller # or external-sync
pullPolicy: IfNotPresent
pullSecret: registry-credentials
secretName: external-poller-env # or external-sync-env
service:
applicationPort: 8000
port: 80
type: ClusterIP
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "2000m"
# No ingress for workers
ingress:
enabled: false
# Health checks for workers
livenessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 30
periodSeconds: 30
readinessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 5
periodSeconds: 10
# Job-specific configuration via environment
env:
- name: WORKER_CONCURRENCY
value: "5"
- name: POLL_INTERVAL
value: "60"
Deployment Process
1. Initialize Terraform
# Initialize providers
terraform init
# Verify configuration
terraform validate
2. Plan Deployment
# Plan with all configuration files
terraform plan -var-file="../../shared.tfvars" -var-file="terraform.tfvars"
# Review the plan carefully:
# - Check namespace creation
# - Verify secret references
# - Confirm ingress configuration
# - Review resource allocations
3. Deploy Applications
# Apply configuration
terraform apply -var-file="../../shared.tfvars" -var-file="terraform.tfvars"
# Deployment takes 5-10 minutes depending on image sizes
4. Run Initial Migrations
kubectl exec -n api deployment/api -- npx prisma migrate deploy --schema /app/backend/api/node_modules/.prisma/client/schema.prisma
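Optionally verify the migration state afterwards (same schema path as above):
kubectl exec -n api deployment/api -- npx prisma migrate status --schema /app/backend/api/node_modules/.prisma/client/schema.prisma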
5. Monitor Deployment
# Watch namespace creation
kubectl get namespaces -w
# Monitor pod rollout by namespace
for ns in api next litellm llama-indexer credits external-sync external-poller audit-log-exporter cerbos; do
echo "Checking namespace: $ns"
kubectl get pods -n $ns
done
# Check deployment status
for ns in api next litellm; do
echo "=== $ns ==="
kubectl rollout status deployment -n $ns --timeout=300s
done
Post-Deployment Verification
1. Verify All Pods Running
# Check pod status across application namespaces
for ns in api next litellm llama-indexer credits external-sync external-poller audit-log-exporter cerbos; do
echo "=== Namespace: $ns ==="
kubectl get pods -n $ns
done
# Should see all pods in Running state with correct READY count
2. Verify External Secrets
# Check if secrets were created
for ns in api next litellm; do
echo "=== Secrets in $ns ==="
kubectl get secrets -n $ns
done
# Verify External Secrets are synced
for ns in api next litellm; do
echo "=== ExternalSecret status in $ns ==="
kubectl get externalsecrets -n $ns
done
# Verify secret content (without exposing sensitive data)
kubectl get secret -n api api-env -o jsonpath='{.data}' | jq 'keys'
3. Test Application Endpoints
# Get ingress endpoints
kubectl get ingress -A
# Test API health
curl -s https://api.yourdomain.com/healthcheck | jq .
# Test frontend
curl -I https://app.yourdomain.com
# Test LiteLLM (served on its own host per values/litellm.yaml)
curl -s https://litellm.yourdomain.com/health | jq .
4. Check Application Logs
# API logs
kubectl logs -n api -l app.kubernetes.io/name=api --tail=50
# Frontend logs
kubectl logs -n next -l app.kubernetes.io/name=next --tail=50
# Check for errors
kubectl logs -n api -l app.kubernetes.io/name=api --tail=100 | grep -i error
5. Verify Integrations
# Check database connectivity (assumes the container exposes a DATABASE_URL env var; adjust to your image)
kubectl exec -n api deploy/api -- sh -c 'test -n "$DATABASE_URL" && echo "DATABASE_URL is set"'
# Check Redis connectivity (assumes a REDIS_URL env var)
kubectl exec -n api deploy/api -- sh -c 'test -n "$REDIS_URL" && echo "REDIS_URL is set"'
# Check Unleash integration
kubectl exec -n api deploy/api -- curl -s http://unleash-edge.unleash-edge:3063/api/client/features
Troubleshooting
Common Issues
Pods Stuck in Pending
0/3 nodes are available: 3 Insufficient cpu
Solution: Check node resources, adjust resource requests, or scale the node group:
kubectl describe nodes
kubectl top nodes
ImagePullBackOff
Failed to pull image: unauthorized
Solution: Verify registry secret:
kubectl get secret -n api registry-credentials -o json | jq -r '.data.".dockerconfigjson"' | base64 -d | jq .
External Secret Not Found
SecretStore default/aws-secrets-manager, resource not found
Solution: Ensure External Secrets Operator is deployed and ClusterSecretStore exists:
kubectl get clustersecretstore
Ingress Not Creating ALB
No LoadBalancer found for ingress
Solution: Check ALB controller logs and annotations:
kubectl logs -n kube-system deployment/aws-load-balancer-controller
kubectl describe ingress -n api
Health Check Failures
# Debug liveness probe failures
kubectl describe pod -n api <pod-name>
# Test health endpoint from inside pod
kubectl exec -n api deploy/api -- curl -s localhost:8000/healthcheck
# Check environment variables
kubectl exec -n api deploy/api -- env | sort
Performance Issues
# Check resource usage by namespace
for ns in api next litellm; do
echo "=== Resource usage in $ns ==="
kubectl top pods -n $ns
done
# Review HPA status
kubectl get hpa -A
# Inspect pod conditions and events for resource pressure (CPU throttling itself only shows in container metrics)
kubectl describe pod -n api <pod-name> | grep -A5 "Conditions:"
Best Practices
1. Production Configuration
Set appropriate resource requests and limits
Enable autoscaling for variable workloads
Configure pod disruption budgets
Use anti-affinity for high availability (see the sketch after this list)
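A minimal sketch of both for the API service, assuming its pods carry the conventional app.kubernetes.io/name=api label and the chart accepts an affinity value (as the api.yaml example above does):
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: api
spec:
  minAvailable: 1                  # keep at least one API pod through voluntary disruptions
  selector:
    matchLabels:
      app.kubernetes.io/name: api
And a values fragment spreading replicas across nodes:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: api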
2. Security
Regularly update container images
Scan images for vulnerabilities
Use network policies to restrict traffic (example after this list)
Enable audit logging
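As a starting point, a default-deny ingress policy suits the worker namespaces that expose no ingress. This assumes your CNI enforces NetworkPolicy (the default AWS VPC CNI requires its network policy feature to be enabled):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: external-poller
spec:
  podSelector: {}      # selects every pod in the namespace
  policyTypes:
    - Ingress          # with no ingress rules listed, all inbound traffic is denied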
3. Monitoring
Export metrics to Prometheus (values fragment after this list)
Set up alerts for key metrics
Monitor application logs
Track error rates and latencies
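If your Prometheus uses annotation-based discovery, a hypothetical values fragment (assuming the charts pass podAnnotations through to the pod template; the port and path are assumptions to adjust per service):
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8000"
  prometheus.io/path: "/metrics"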
4. Cost Optimization
Right-size resource allocations based on actual usage
Use spot instances for non-critical workloads
Implement aggressive autoscaling policies
Clean up unused resources
Next Steps
After successful application deployment:
Configure Monitoring: Set up observability stack
Load Testing: Validate performance under load
Backup Procedures: Implement data backup strategies
Documentation: Document your specific configurations
Runbooks: Create operational procedures