Troubleshooting

This guide covers common issues encountered when operating a Self-Managed Kindo installation and how to resolve them.

Service Health

Pods Not Starting

| Symptom | Likely Cause | Solution |
| --- | --- | --- |
| ImagePullBackOff | Registry credentials missing or expired | Verify imagePullSecrets in your Helm values and renew registry tokens |
| CrashLoopBackOff | Missing environment variables or bad config | Check pod logs with kubectl logs <pod> for the specific error |
| Pending | Insufficient cluster resources | Check node capacity with kubectl describe nodes and scale up if needed |
| Init:Error | Database migration failed | Check the init container logs; usually a connectivity or permissions issue |
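The triage steps above can be sketched as a short command sequence. This is a minimal sketch assuming the kindo namespace used elsewhere in this guide; pod and init container names are placeholders you must fill in:

```shell
# List pods and their current state in the kindo namespace
kubectl get pods -n kindo

# For a failing pod, the Events section usually names the cause
# (image pull errors, scheduling failures, failed probes)
kubectl describe pod <pod-name> -n kindo

# For Init:Error, inspect the init container's logs.
# List the init container names first:
kubectl get pod <pod-name> -n kindo \
  -o jsonpath='{.spec.initContainers[*].name}'
kubectl logs <pod-name> -n kindo -c <init-container-name>
```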

Checking Service Logs

# View logs for a specific service
kubectl logs -n kindo deployment/api --tail=100
# Follow logs in real time
kubectl logs -n kindo deployment/api -f
# View previous container logs (after a crash)
kubectl logs -n kindo deployment/api --previous

Database Issues

PostgreSQL Connection Failures

If services cannot connect to PostgreSQL:

  1. Verify the database is running: kubectl get pods -n postgres
  2. Check connection string: Ensure DATABASE_URL in your secrets matches the actual PostgreSQL host, port, and credentials.
  3. Test connectivity: Run a psql command from within the cluster to verify network access.
  4. Check max connections: Under heavy load, PostgreSQL may run out of connections. Increase max_connections or use PgBouncer.
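Steps 3 and 4 above can be run from a throwaway pod inside the cluster. A minimal sketch, assuming a reachable postgres:16 image and placeholder connection values taken from your DATABASE_URL:

```shell
# Step 3: verify network access to PostgreSQL from inside the cluster
kubectl run psql-test --rm -it --restart=Never -n kindo \
  --image=postgres:16 -- \
  psql "host=<postgres-host> port=5432 user=<user> dbname=<db>" \
  -c "SELECT 1;"

# Step 4: compare current connection usage against the configured limit
kubectl run psql-test --rm -it --restart=Never -n kindo \
  --image=postgres:16 -- \
  psql "host=<postgres-host> port=5432 user=<user> dbname=<db>" \
  -c "SELECT count(*) AS in_use,
             current_setting('max_connections') AS max
        FROM pg_stat_activity;"
```

If in_use is close to max under load, raise max_connections or put PgBouncer in front of the database as noted above.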

Migration Errors

If a service fails to start with a migration error:

  1. Check the pod logs for the specific migration that failed.
  2. Verify the database user has CREATE, ALTER, and INSERT permissions on the target schema.
  3. If a migration is partially applied, do not retry without investigating — contact Kindo support for guidance.
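Step 2 above can be checked directly with psql. A sketch using PostgreSQL's built-in privilege functions; <user> and <schema> are placeholders for your migration user and target schema:

```shell
# Can the user create objects in the schema?
psql "$DATABASE_URL" -c \
  "SELECT has_schema_privilege('<user>', '<schema>', 'CREATE') AS can_create;"

# Spot-check INSERT privileges on tables in the schema
psql "$DATABASE_URL" -c \
  "SELECT table_name,
          has_table_privilege('<user>',
            table_schema || '.' || table_name, 'INSERT') AS can_insert
     FROM information_schema.tables
    WHERE table_schema = '<schema>'
    LIMIT 5;"
```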

Redis and Conversation Streaming

Conversations Hang or Produce No Output

If users send messages but never see a streaming response, Redis may be running in sharded cluster mode. Kindo currently requires a non-sharded Redis deployment for conversation streaming.

Check your Redis mode:

redis-cli INFO server | grep redis_mode
  • redis_mode:standalone or redis_mode:sentinel — compatible
  • redis_mode:cluster — not currently compatible; reconfigure to standalone or sentinel
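If redis-cli is not installed locally, the same check can be run inside the Redis pod. The redis namespace and deployment name here are assumptions; adjust them to your installation:

```shell
# Run the mode check from inside the Redis pod itself
kubectl exec -n redis deploy/redis -- redis-cli INFO server | grep redis_mode
```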

For full Redis configuration guidance, see Infrastructure Requirements — Redis.

Authentication and SSO

SSO Login Fails

| Symptom | Solution |
| --- | --- |
| Redirect loop after login | Verify AUTH_BASE_URL matches the actual domain users access |
| SAML assertion error | Check that the IdP’s ACS URL matches SSOReady’s AUTH_BASE_URL |
| User not found after SSO | Verify the SAML attribute mapping includes email and organization ID |

See the SSO Setup Guide for complete configuration details.

Integrations

Nango Connection Failures

  1. Verify NANGO_URL is reachable from both the API and task-worker-ts services.
  2. Check that NANGO_SECRET_KEY is correct and not expired.
  3. For OAuth integrations, verify the callback URL is <NANGO_URL>/oauth/callback.
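Step 1 above can be verified from inside each pod, reading NANGO_URL from the pod's own environment. A sketch assuming the deployments are named api and task-worker-ts (matching the service names above):

```shell
# Check reachability from the API pod
kubectl exec -n kindo deploy/api -- \
  sh -c 'curl -sf "$NANGO_URL" >/dev/null && echo reachable || echo unreachable'

# Check reachability from the task worker pod
kubectl exec -n kindo deploy/task-worker-ts -- \
  sh -c 'curl -sf "$NANGO_URL" >/dev/null && echo reachable || echo unreachable'
```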

MCP Server Issues

  1. Verify the MCP server pod is running: kubectl get pods -n mcp
  2. Check that mcpServerUrl in the IntegrationConfig table is correct.
  3. Test connectivity from the API pod: curl http://mcp-<service>.mcp:80/health

Observability

No Traces or Metrics

| Symptom | Cause | Solution |
| --- | --- | --- |
| No traces in Jaeger/Tempo | OTEL Collector not receiving data | Verify OTEL_EXPORTER_OTLP_ENDPOINT points to the collector |
| No metrics in Grafana | Metrics exporter disabled | Set OTEL_METRICS_EXPORTER=otlp |
| Disconnected traces | Context not propagated between services | Ensure W3C traceparent headers pass through your ingress |
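One way to check traceparent propagation is to send a request with a synthetic W3C traceparent header through the ingress and confirm the resulting trace appears as a single connected trace in your backend. A sketch; kindo.example.com and the /api/health path are placeholders for your actual domain and a real endpoint:

```shell
# Well-formed W3C traceparent: version-traceid-spanid-flags
curl -s -o /dev/null -w '%{http_code}\n' \
  -H 'traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01' \
  https://kindo.example.com/api/health
```

If the trace in Jaeger/Tempo is split into fragments, your ingress or a proxy in between is stripping the header.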

See the Observability Guide for complete configuration details.

Common Environment Variable Issues

| Variable | Common Mistake | Fix |
| --- | --- | --- |
| OTEL_SDK_DISABLED | Set to true accidentally | Remove or set to false |
| NANGO_URL | Missing protocol (nango.example.com instead of https://nango.example.com) | Include https:// |
| NEXT_PUBLIC_* | Changed after build | Next.js public env vars are baked in at build time; rebuild after changing |
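To see these variables as a running pod actually receives them (rather than what your values file says), inspect the pod environment directly. A sketch assuming the api deployment in the kindo namespace:

```shell
# Dump the OTEL and Nango variables from the live pod environment
kubectl exec -n kindo deploy/api -- env | grep -E 'OTEL|NANGO_URL'
```

Note this does not help for NEXT_PUBLIC_* variables, which are compiled into the frontend bundle at build time.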

Getting Help

If you cannot resolve an issue:

  1. Collect service logs and pod describe output.
  2. Note the Kindo version (Helm chart version) and Kubernetes version.
  3. Contact Kindo support with the collected information.
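Steps 1 and 2 above can be collected in one pass. A minimal sketch assuming the kindo namespace and a Helm-based install:

```shell
# Gather diagnostics into a single directory before contacting support
mkdir -p kindo-support

kubectl get pods -n kindo -o wide > kindo-support/pods.txt
kubectl describe pods -n kindo   > kindo-support/describe.txt

# Recent logs for every deployment in the namespace
for d in $(kubectl get deploy -n kindo -o name); do
  kubectl logs -n kindo "$d" --tail=500 > "kindo-support/$(basename "$d").log"
done

# Kindo (Helm chart) and Kubernetes versions
helm list -n kindo > kindo-support/helm-versions.txt
kubectl version    > kindo-support/k8s-version.txt
```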

Next Steps