Service Won't Start
Observable Symptoms
- Docker container exits immediately after starting
- Service crashes during startup before accepting requests
- Container status shows “Exited” or “Restarting”
- Health check endpoints never become available
- Logs show initialization errors
Quick Fix
# Check container status
docker ps -a | grep my-service

# View startup logs
docker logs my-service

# Check recent errors
docker logs my-service 2>&1 | grep -i "error\|fatal\|exception"

# Restart with fresh state
docker compose down my-service
docker compose up my-service

Common Causes (Ordered by Frequency)

1. Missing or Invalid Environment Variables
Frequency: Very Common (30% of cases)
Symptoms:
- Service exits with “Configuration error”
- Logs show missing required variables
- Connection failures to dependencies
Diagnostic Steps:
# Check environment variables in container
docker compose exec my-service env | grep -E "RABBITMQ|POSTGRES|DATABASE|REDIS|JWT"

# View docker-compose configuration
docker compose config | grep -A 10 "my-service"

# Check .env file
grep -E "RABBITMQ|POSTGRES|DATABASE" .env

Common Missing Variables:

# Required for most services:
RABBITMQ_URL=amqp://admin:admin123@rabbitmq:5672
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/platform
SERVICE_NAME=my-service
SERVICE_VERSION=1.0.0

# Required for API Gateway:
JWT_SECRET=your-secret-key-change-in-production
# OR
JWKS_URI=https://your-auth-provider.com/.well-known/jwks.json

# Required for services with Redis:
REDIS_URL=redis://redis:6379

Solution:
Add missing variables to docker-compose.yml:
services:
  my-service:
    environment:
      - RABBITMQ_URL=amqp://admin:admin123@rabbitmq:5672
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/platform
      - REDIS_URL=redis://redis:6379
      - SERVICE_NAME=my-service
      - SERVICE_VERSION=1.0.0
      - LOG_LEVEL=info

Or use a .env file:

RABBITMQ_URL=amqp://admin:admin123@rabbitmq:5672
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/platform
REDIS_URL=redis://redis:6379

Prevention:
- Use environment variable validation on startup
- Document required variables in README
- Provide an example .env.example file
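Startup validation of environment variables could be sketched as follows. This is a minimal illustration, not the platform's own validation: `findMissingVars`, `assertEnv`, and the `REQUIRED_VARS` list are hypothetical names, and the list simply mirrors the "required for most services" variables above.

```typescript
// Hypothetical helpers: the names and the required-variable list are
// assumptions based on the examples in this guide.
type EnvMap = Record<string, string | undefined>;

const REQUIRED_VARS: readonly string[] = [
  "RABBITMQ_URL",
  "DATABASE_URL",
  "SERVICE_NAME",
  "SERVICE_VERSION",
];

/** Returns the names of required variables that are unset or blank. */
function findMissingVars(env: EnvMap, required: readonly string[] = REQUIRED_VARS): string[] {
  return required.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

/** Throws one error listing every missing variable, so a single fix-and-restart covers them all. */
function assertEnv(env: EnvMap, required: readonly string[] = REQUIRED_VARS): void {
  const missing = findMissingVars(env, required);
  if (missing.length > 0) {
    throw new Error(`Configuration error: missing ${missing.join(", ")}`);
  }
}
```

In a Node.js service, `assertEnv(process.env)` would typically run before any connection is opened, so a missing variable fails fast with a readable message instead of a later connection error.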
2. Cannot Connect to RabbitMQ
Frequency: Very Common (25% of cases)
Symptoms:
- Logs show: “Failed to connect to RabbitMQ”
- Service retries connection and fails
- Container exits after connection timeout
Diagnostic Steps:
# Check RabbitMQ is running
docker ps | grep rabbitmq

# Test RabbitMQ connectivity from service container
docker compose exec my-service nc -zv rabbitmq 5672

# Check RabbitMQ logs
docker logs rabbitmq 2>&1 | tail -50

# Test RabbitMQ management interface (the HTTP API requires credentials)
curl -u admin:admin123 http://localhost:15672/api/overview

Common Issues:
A. RabbitMQ Not Started:
# Start RabbitMQ
docker compose up -d rabbitmq

# Wait for RabbitMQ to be ready
docker logs rabbitmq 2>&1 | grep "Server startup complete"

B. Wrong Connection URL:

# ❌ WRONG
RABBITMQ_URL=amqp://localhost:5672  # localhost won't work in Docker

# ✓ CORRECT
RABBITMQ_URL=amqp://admin:admin123@rabbitmq:5672  # Use service name

C. Service Starts Before RabbitMQ Ready:
Add health check dependency:
services:
  my-service:
    depends_on:
      rabbitmq:
        condition: service_healthy

  rabbitmq:
    healthcheck:
      test: rabbitmq-diagnostics -q ping
      interval: 10s
      timeout: 5s
      retries: 5

Solution:
- Ensure RabbitMQ is running and healthy
- Use correct connection URL with service name
- Add depends_on with health check
- Verify credentials match RabbitMQ configuration
Prevention:
- Use Docker Compose health checks
- Implement connection retry logic in BaseService
- Monitor RabbitMQ availability
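The connection retry logic listed under Prevention might look roughly like the sketch below. It is not the BaseService implementation: `connectWithRetry`, `backoffDelayMs`, and the default attempt count and delays are assumptions chosen for illustration, and `connect` stands in for whatever function opens the broker connection.

```typescript
// Hypothetical sketch: names and defaults are assumptions,
// not the BaseService implementation.

/** Delay before the retry that follows attempt `attempt` (1-based): base * 2^(attempt-1), capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

/**
 * Calls `connect` until it resolves or `maxAttempts` is reached.
 * Rethrows the last error so the container still exits with a clear cause.
 */
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  maxAttempts = 5,
  baseMs = 500,
  maxMs = 10000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        await sleep(backoffDelayMs(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
}
```

Retrying in the service complements the Compose health check rather than replacing it: the health check orders the initial startup, while retries cover a broker that restarts later.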
3. Database Connection Failures
Frequency: Common (20% of cases)
Symptoms:
- Logs show: “Database connection failed”
- PostgreSQL timeout errors
- “relation does not exist” errors
Diagnostic Steps:
# Check PostgreSQL is running
docker ps | grep postgres

# Test database connectivity
docker compose exec my-service nc -zv postgres 5432

# Connect to database
docker compose exec postgres psql -U postgres -d platform

# Check database exists
docker compose exec postgres psql -U postgres -l

Common Issues:
A. Database Doesn’t Exist:
# Create database
docker compose exec postgres psql -U postgres -c "CREATE DATABASE platform;"

# Or in docker-compose.yml
postgres:
  environment:
    - POSTGRES_DB=platform  # Auto-creates database

B. Schema Not Initialized:

# Check if event store schema exists
docker compose exec postgres psql -U postgres -d platform \
  -c "\dt" | grep events

# Initialize schema (run from service or migration)
# This is typically done automatically by BaseService or event store

C. Wrong Connection String:

# ❌ WRONG
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/platform

# ✓ CORRECT
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/platform

Solution:
- Ensure PostgreSQL is running
- Verify database exists
- Use correct connection string
- Initialize schema on first run
- Add health check dependency:
services:
  my-service:
    depends_on:
      postgres:
        condition: service_healthy

  postgres:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

Prevention:
- Auto-create database in docker-compose
- Run migrations on startup
- Use connection retry logic
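The "localhost" mistake shown for both RabbitMQ and PostgreSQL connection strings can be caught before the service tries to connect. The following is a small sketch under assumptions: `findLoopbackUrls` is a hypothetical helper, and it only inspects variables whose names end in `_URL`, matching the naming used in this guide.

```typescript
// Hypothetical helper; assumes connection settings are URL-shaped variables
// such as RABBITMQ_URL, DATABASE_URL, and REDIS_URL from the examples above.
const LOOPBACK_HOSTS: readonly string[] = ["localhost", "127.0.0.1"];

/** Returns the names of *_URL variables whose host is the container's own loopback address. */
function findLoopbackUrls(env: Record<string, string | undefined>): string[] {
  const offenders: string[] = [];
  for (const [key, value] of Object.entries(env)) {
    if (value === undefined || !key.endsWith("_URL")) continue;
    let hostname: string;
    try {
      hostname = new URL(value).hostname;
    } catch {
      continue; // not a parseable URL; left to other validation
    }
    if (LOOPBACK_HOSTS.includes(hostname)) offenders.push(key);
  }
  return offenders;
}
```

Inside a container, a loopback host refers to the container itself, so any variable this function reports should usually point at the Compose service name instead.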
4. Build Errors (Missing Dependencies)
Frequency: Common (15% of cases)
Symptoms:
- Container fails during image build
- “Cannot find module” errors
- npm/pnpm install failures
Diagnostic Steps:
# Check build logs
docker compose build my-service 2>&1 | tee build.log

# Look for specific errors
grep -i "error\|failed" build.log

# Check package.json dependencies
jq '.dependencies' package.json

Common Issues:
A. Platform Package Version Mismatch:
{
  "dependencies": {
    "@banyanai/platform-base-service": "^1.0.116",
    "@banyanai/platform-cqrs": "^1.0.110"  // ❌ Different versions
  }
}

// ✓ CORRECT: All same version
{
  "dependencies": {
    "@banyanai/platform-base-service": "^1.0.116",
    "@banyanai/platform-cqrs": "^1.0.116"
  }
}

B. Missing .js Extension in Imports:

// ❌ WRONG
import { BusinessError } from '../errors';

// ✓ CORRECT
import { BusinessError } from '../errors.js';

C. node_modules Cached:

# Clear Docker build cache
docker compose build --no-cache my-service

# Or clear node_modules in Dockerfile
# Add to .dockerignore:
node_modules
*/node_modules
.npm

Solution:
- Update all platform packages to same version
- Add .js extensions to all imports
- Build with --no-cache flag
- Ensure .dockerignore excludes node_modules:

node_modules
*/node_modules
dist
*/dist
.npm
.git
.env

Prevention:
- Use exact versions for platform packages
- Run linter to check import extensions
- Use multi-stage Docker builds
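A platform package version mismatch can be detected mechanically from the `dependencies` object of package.json. The sketch below is illustrative only: `findPlatformVersionMismatch` is a hypothetical function, and the `@banyanai/platform-` prefix is taken from the examples above.

```typescript
// Hypothetical check; the "@banyanai/platform-" prefix comes from the examples
// in this guide, while the function itself is an illustrative assumption.

/**
 * Groups platform packages by their version spec.
 * Returns the groups when more than one spec is in use, or null when all agree.
 */
function findPlatformVersionMismatch(
  dependencies: Record<string, string>,
  prefix = "@banyanai/platform-",
): Record<string, string[]> | null {
  const byVersion: Record<string, string[]> = {};
  for (const [pkg, spec] of Object.entries(dependencies)) {
    if (!pkg.startsWith(prefix)) continue;
    const group = byVersion[spec] ?? [];
    group.push(pkg);
    byVersion[spec] = group;
  }
  return Object.keys(byVersion).length > 1 ? byVersion : null;
}
```

A check like this could run in CI against the parsed package.json, failing the build whenever the result is not null.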
5. Port Already in Use
Frequency: Occasional (5% of cases)
Symptoms:
- Error: “Port 3000 is already in use”
- Service can’t bind to port
- Container exits with port conflict
Diagnostic Steps:
# Check what's using the port
lsof -i :3000
netstat -tuln | grep 3000

# Check Docker port bindings
docker ps --format "table {{.Names}}\t{{.Ports}}"

# Check for duplicate service instances
docker ps -a | grep my-service

Solution:
A. Stop Conflicting Process:
# Find process ID
lsof -i :3000

# Kill process
kill -9 <PID>

B. Change Service Port:

services:
  my-service:
    ports:
      - "3001:3000"  # Map host port 3001 to container port 3000

C. Remove Duplicate Containers:

# Stop all instances
docker compose down my-service

# Remove stopped containers
docker container prune

# Start fresh
docker compose up my-service

Prevention:
- Use unique ports per service
- Stop services properly with docker compose down
- Use Docker Compose for orchestration
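Checking that every service uses a unique host port can be automated once the `ports` entries are available as strings. The following sketch assumes the Compose short syntax ("HOST:CONTAINER", optionally prefixed with an IP and suffixed with a protocol); `hostPortOf` and `findHostPortConflicts` are hypothetical names, and port ranges are compared literally rather than expanded.

```typescript
// Hypothetical helpers for Compose short-syntax port mappings such as "3001:3000".

/** Extracts the host port from "HOST:CONTAINER" or "IP:HOST:CONTAINER"; null when no host port is published. */
function hostPortOf(mapping: string): string | null {
  const [withoutProtocol] = mapping.split("/"); // drop optional "/tcp" or "/udp"
  const parts = withoutProtocol.split(":");
  return parts.length >= 2 ? parts[parts.length - 2] : null;
}

/** Maps each host port claimed by more than one service to the services claiming it. */
function findHostPortConflicts(portsByService: Record<string, string[]>): Record<string, string[]> {
  const owners: Record<string, string[]> = {};
  for (const [service, mappings] of Object.entries(portsByService)) {
    for (const mapping of mappings) {
      const hostPort = hostPortOf(mapping);
      if (hostPort === null) continue;
      const services = owners[hostPort] ?? [];
      services.push(service);
      owners[hostPort] = services;
    }
  }
  const conflicts: Record<string, string[]> = {};
  for (const [port, services] of Object.entries(owners)) {
    if (services.length > 1) conflicts[port] = services;
  }
  return conflicts;
}
```

An empty result means no two services in the input publish the same host port; it says nothing about processes outside Compose, which still need the lsof check above.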
6. TypeScript Compilation Errors
Frequency: Occasional (3% of cases)
Symptoms:
- Build fails with TypeScript errors
- Module resolution failures
- Type checking errors
Diagnostic Steps:
# Run TypeScript compiler locally
pnpm run build

# Check for errors
pnpm run type-check

# View detailed errors
npx tsc --noEmit --pretty

Common Issues:
A. Missing tsconfig.json Settings:
{
  "compilerOptions": {
    "experimentalDecorators": true,  // REQUIRED
    "emitDecoratorMetadata": true,   // REQUIRED
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "esModuleInterop": true,
    "strict": true
  }
}

B. Type-Only Imports for Decorators:

// ❌ WRONG: Type-only import
import type { CreateUserCommand } from '../contracts/commands.js';
@CommandHandlerDecorator(CreateUserCommand) // Error!

// ✓ CORRECT: Value import
import { CreateUserCommand } from '../contracts/commands.js';
@CommandHandlerDecorator(CreateUserCommand)

Solution:
- Fix TypeScript configuration
- Resolve type errors
- Use value imports for decorators
- Rebuild service
# Clean build
rm -rf dist/
pnpm run build

# Rebuild Docker image
docker compose build --no-cache my-service

Prevention:
- Use provided tsconfig.json template
- Run type checking in CI/CD
- Enable strict mode
7. Handler Discovery Errors
Frequency: Occasional (2% of cases)
Symptoms:
- Service starts but no handlers discovered
- Contract broadcasting fails
- Handler registration errors
See: Handlers Not Discovered for detailed troubleshooting.
Quick Check:
# Check handler discovery logs
docker logs my-service 2>&1 | grep "Handler discovery"

# Should show:
# Handler discovery completed { commandHandlers: N, queryHandlers: M, ... }

Advanced Diagnostics

Enable Debug Logging

services:
  my-service:
    environment:
      - LOG_LEVEL=debug

View detailed startup logs:
docker compose up my-service
# Look for initialization steps:
# [DEBUG] Connecting to RabbitMQ...
# [DEBUG] Connection established
# [DEBUG] Scanning handlers...
# [DEBUG] Handler discovery completed
# [DEBUG] Service ready

Interactive Container Debugging

# Start container with shell
docker compose run --rm my-service /bin/bash

# Test connections manually
nc -zv rabbitmq 5672
nc -zv postgres 5432
nc -zv redis 6379

# Check environment
env | grep -E "RABBITMQ|DATABASE|REDIS"

# Try starting service manually
node dist/index.js

Check Dependencies Startup Order

# View dependency graph
docker compose config | grep -A 5 "depends_on"

# Ensure proper order:
# 1. Infrastructure (RabbitMQ, PostgreSQL, Redis)
# 2. Platform services (service-discovery, api-gateway)
# 3. Business services

Review Health Checks

# Check health status
docker compose ps

# View health check logs
docker inspect my-service | jq '.[0].State.Health'

# Test health endpoint manually
curl http://localhost:3000/health

Verification Steps

After fixing the issue:
1. Service Starts Successfully
# Check container status (should be "Up")
docker ps | grep my-service

# View startup logs (no errors); 2>&1 includes messages written to stderr
docker logs my-service 2>&1 | tail -50

# Should see:
# Service 'my-service' started successfully on port 3000

2. Health Check Passes

# Test health endpoint
curl http://localhost:3000/health

# Should return:
# {"status":"healthy","service":"my-service","version":"1.0.0"}

3. Handlers Discovered

# Check handler discovery
docker logs my-service 2>&1 | grep "Handler discovery"

# Should show handlers found

4. Service Registered

# Check service discovery
curl http://localhost:3001/api/services | jq '.services[] | select(.name=="my-service")'

# Should show service registered

Common Error Messages

“Cannot find module ‘@banyanai/platform-*’”
Cause: Missing package installation or wrong version
Solution:
# In the Dockerfile
RUN pnpm install --frozen-lockfile

# Verify package.json has correct versions

“Cannot use import statement outside a module”
Cause: Missing "type": "module" in package.json
Solution:
{
  "type": "module",
  "main": "./dist/index.js"
}

“ECONNREFUSED” or “getaddrinfo ENOTFOUND”
Cause: Service trying to connect to unavailable dependency
Solution:
- Check dependency is running
- Use correct hostname (service name in Docker)
- Add health check dependency
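The two error codes point at different fixes, which a service can spell out in its own logs. The sketch below is an assumption-laden illustration: the codes are standard Node.js system error codes, but `explainConnectionError` and the hint wording are hypothetical.

```typescript
// Illustrative mapping; ENOTFOUND and ECONNREFUSED are standard Node.js
// system error codes, while the function name and hints are assumptions.
function explainConnectionError(code: string, host: string, port: number): string {
  switch (code) {
    case "ENOTFOUND":
      // getaddrinfo could not resolve the name at all
      return `DNS lookup for "${host}" failed: check that the hostname matches the Compose service name`;
    case "ECONNREFUSED":
      // the name resolved, but nothing is listening on that port yet
      return `"${host}" resolved but refused the connection on port ${port}: check that the dependency is running and healthy`;
    default:
      return `Unrecognized error code ${code} while connecting to ${host}:${port}`;
  }
}
```

In practice the `code` value would come from the error thrown by the client library, so ENOTFOUND usually means a wrong hostname while ECONNREFUSED usually means a startup-order or health problem.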
Startup Checklist
Before deploying, verify:
- All required environment variables set
- RabbitMQ running and accessible
- PostgreSQL running with database created
- Redis running (if needed)
- All dependencies started before service
- Platform packages at same version
- TypeScript compiles without errors
- Handlers in correct directories with decorators
- Docker image builds successfully
- No port conflicts
- Health check endpoint configured
Related Documentation
- BaseService Startup - Service initialization
- Message Bus Issues - RabbitMQ troubleshooting
- Service Discovery Issues - Registration problems
- Log Analysis - Reading service logs
- Error Catalog - Specific error codes
Summary
Most service startup failures are caused by:
- Missing environment variables - Add all required config
- Dependency unavailability - Ensure RabbitMQ/PostgreSQL running
- Build errors - Fix TypeScript compilation and dependencies
- Port conflicts - Use unique ports or stop conflicting services
Use docker logs to identify the specific error, then follow the diagnostic steps for that cause.
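As a rough illustration of that triage step, the log messages quoted in this guide can be mapped to their cause. This is a hypothetical helper, not part of the platform: `classifyLogLine`, `firstStartupCause`, and the pattern list are assumptions, and real services may word their errors differently.

```typescript
// Hypothetical triage helper. The patterns are the log messages quoted in this
// guide; actual log wording may differ between services.
type StartupCause = "environment" | "rabbitmq" | "database" | "build" | "port" | "unknown";

const CAUSE_PATTERNS: ReadonlyArray<[RegExp, StartupCause]> = [
  [/Configuration error/i, "environment"],
  [/Failed to connect to RabbitMQ/i, "rabbitmq"],
  [/Database connection failed|relation .*does not exist/i, "database"],
  [/Cannot find module/i, "build"],
  [/Port \d+ is already in use/i, "port"],
];

/** Classifies a single log line, or returns "unknown" when no pattern matches. */
function classifyLogLine(line: string): StartupCause {
  for (const [pattern, cause] of CAUSE_PATTERNS) {
    if (pattern.test(line)) return cause;
  }
  return "unknown";
}

/** Returns the first recognizable cause in a block of `docker logs` output. */
function firstStartupCause(logs: string): StartupCause {
  for (const line of logs.split("\n")) {
    const cause = classifyLogLine(line);
    if (cause !== "unknown") return cause;
  }
  return "unknown";
}
```

The first matching line is used because later errors are often consequences of the first failure rather than independent problems.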