API Calls Failing
Observable Symptoms
- HTTP requests return error responses (400, 401, 403, 404, 500)
- GraphQL queries/mutations fail
- CORS errors in browser console
- Timeout errors
- Connection refused errors
Quick Fix
# Check API Gateway status
docker ps | grep api-gateway
docker logs api-gateway 2>&1 | tail -50

# Test API Gateway health
curl http://localhost:3000/health

# Check for common errors
docker logs api-gateway 2>&1 | grep -i "error\|fail"

# Get correlation ID from response
curl -i http://localhost:3000/api/your-endpoint
# Look for X-Correlation-Id header

Common Causes (Ordered by Frequency)
1. Authentication Failures (401 Unauthorized)
Frequency: Very Common (35% of cases)
Symptoms:
- HTTP 401 response
- Error message: “Unauthorized” or “Authentication required”
- Valid JWT but still rejected
- Development headers not working
Diagnostic Steps:
# Check authentication mode
docker logs api-gateway 2>&1 | grep "JWTAuthenticationEngine\|DEVELOPMENT_AUTH_ENABLED"

# Test with dev headers
curl -H "X-Dev-User-Id: test-user" \
  -H "X-Dev-Permissions: *" \
  http://localhost:3000/api/endpoint

# Test with JWT
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  http://localhost:3000/api/endpoint

# Decode JWT to check claims
# Note: JWT segments are base64url-encoded without padding, so base64 -d may report invalid input for some tokens
echo "YOUR_JWT_TOKEN" | cut -d. -f2 | base64 -d | jq

Common Issues:
A. Development Mode Not Enabled:
# ❌ MISSING: Development auth not enabled
api-gateway:
  environment:
    - JWT_SECRET=secret

# ✓ CORRECT: Development auth enabled
api-gateway:
  environment:
    - DEVELOPMENT_AUTH_ENABLED=true
    - JWT_SECRET=secret

B. Invalid JWT Token:
# Check token expiration
echo "YOUR_JWT" | cut -d. -f2 | base64 -d | jq '.exp'

# Compare to current time
date +%s

# If exp < current time, token is expired

C. Missing JWT sub Claim:
// ❌ WRONG: No subject claim
{
  "email": "user@example.com",
  "permissions": ["users:read"]
}

// ✓ CORRECT: Has sub claim
{
  "sub": "user-123",
  "email": "user@example.com",
  "permissions": ["users:read"]
}

D. Algorithm Mismatch (HS256 vs RS256):
# Check JWT header
echo "YOUR_JWT" | cut -d. -f1 | base64 -d | jq

# If alg is RS256, need JWKS_URI
# If alg is HS256, need JWT_SECRET

# ❌ WRONG: RS256 token with JWT_SECRET
api-gateway:
  environment:
    - JWT_SECRET=secret  # Won't work with RS256

# ✓ CORRECT: RS256 with JWKS
api-gateway:
  environment:
    - JWKS_URI=https://auth-provider.com/.well-known/jwks.json

Solution:
See Authentication Errors for detailed authentication troubleshooting.
Quick fixes:
- Enable development mode for local testing
- Verify JWT has sub claim
- Check token expiration
- Match algorithm (HS256 vs RS256) with config
Prevention:
- Use consistent auth configuration
- Monitor token expiration
- Validate JWT structure in auth service
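The checks in this section (missing sub claim, expired exp, algorithm versus configured key source) can be combined into one small helper. The following is a minimal sketch under stated assumptions: `diagnoseJwt`, `JwtHeader`, `JwtPayload`, and `GatewayAuthConfig` are hypothetical names invented for illustration, not part of the gateway, and the function expects an already-decoded header and payload.

```typescript
// Hypothetical types: they mirror only the claims and settings discussed above.
interface JwtHeader {
  alg: string;
}
interface JwtPayload {
  sub?: string;
  exp?: number; // seconds since the Unix epoch, comparable to `date +%s`
  permissions?: string[];
}
interface GatewayAuthConfig {
  jwtSecret?: string; // value of JWT_SECRET, if set
  jwksUri?: string; // value of JWKS_URI, if set
}

// Returns human-readable problems; an empty list means none of these checks failed.
function diagnoseJwt(
  header: JwtHeader,
  payload: JwtPayload,
  config: GatewayAuthConfig,
  nowSeconds: number,
): string[] {
  const found: string[] = [];
  if (!payload.sub) {
    found.push("missing sub claim");
  }
  if (payload.exp !== undefined && payload.exp < nowSeconds) {
    found.push("token expired");
  }
  if (header.alg === "RS256" && !config.jwksUri) {
    found.push("RS256 token but JWKS_URI is not set");
  }
  if (header.alg === "HS256" && !config.jwtSecret) {
    found.push("HS256 token but JWT_SECRET is not set");
  }
  return found;
}

// Example: an RS256 token without sub, checked against a gateway that only has JWT_SECRET.
console.log(diagnoseJwt({ alg: "RS256" }, { exp: 100 }, { jwtSecret: "secret" }, 200));
```

The function only inspects claims; it does not verify the signature, so a token that passes these checks can still be rejected by the gateway.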
2. Permission Denied (403 Forbidden)
Frequency: Very Common (25% of cases)
Symptoms:
- HTTP 403 response
- Error: “Access denied” or “Insufficient permissions”
- User authenticated but operation rejected
- Permission requirements not met
Diagnostic Steps:
# Check required permissions
curl http://localhost:3001/api/services/SERVICE_NAME/contracts | jq

# Check user permissions in JWT
echo "YOUR_JWT" | cut -d. -f2 | base64 -d | jq '.permissions'

# Test with wildcard permissions
curl -H "X-Dev-User-Id: test" \
  -H "X-Dev-Permissions: *" \
  http://localhost:3000/api/endpoint

# Check API Gateway logs
docker logs api-gateway 2>&1 | grep "AUTHORIZATION_ERROR"

Common Issues:
A. Missing Permissions in JWT:
// ❌ WRONG: User lacks required permission
{
  "sub": "user-123",
  "permissions": ["users:read"]  // Missing users:create
}

// Contract requires:
@RequiresPermissions(['users:create'])

// ✓ CORRECT: User has required permission
{
  "sub": "user-123",
  "permissions": ["users:read", "users:create"]
}

B. Permission Format Mismatch:
# ❌ WRONG formats:
"users-create"   # Dash instead of colon
"USERS:CREATE"   # Uppercase
"user:create"    # Wrong resource name (singular vs plural)

# ✓ CORRECT format:
"users:create"   # lowercase, colon separator, plural resource

C. Development Headers Wrong:
# ❌ WRONG
curl -H "X-Dev-Permissions: users:create users:read"  # Space-separated

# ✓ CORRECT
curl -H "X-Dev-Permissions: users:create,users:read"  # Comma-separated
# OR
curl -H "X-Dev-Permissions: *"  # Wildcard for all permissions

Solution:
- Check contract required permissions
- Ensure JWT/headers include those permissions
- Use exact permission format (lowercase, colon)
- Grant permissions to user/role in auth service
Prevention:
- Document required permissions for each operation
- Use permission constants to avoid typos
- Implement permission management UI
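The header format, the permission format, and the idea of permission constants from this section can be sketched together. All names here (`Permissions`, `parseDevPermissions`, `hasPermissions`, `isWellFormedPermission`) are hypothetical, and the matching rules are assumptions: this sketch treats only a bare `*` as a wildcard and accepts only lowercase letters around a single colon, which may be stricter or looser than what the gateway actually does.

```typescript
// Hypothetical constants: referencing these instead of string literals avoids typos.
const Permissions = {
  UsersRead: "users:read",
  UsersCreate: "users:create",
} as const;

// Splits an X-Dev-Permissions header value on commas (the separator shown above).
function parseDevPermissions(headerValue: string): string[] {
  return headerValue
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}

// True when "*" is granted or every required permission is present.
function hasPermissions(granted: string[], required: string[]): boolean {
  if (granted.indexOf("*") !== -1) return true;
  return required.every((r) => granted.indexOf(r) !== -1);
}

// Assumed format rule: lowercase letters, one colon, lowercase letters.
function isWellFormedPermission(permission: string): boolean {
  return /^[a-z]+:[a-z]+$/.test(permission);
}

const grantedFromHeader = parseDevPermissions("users:create,users:read");
console.log(hasPermissions(grantedFromHeader, [Permissions.UsersCreate]));
```

A space-separated header value parses as a single entry under these rules, so it fails both the format check and the permission check, which matches the wrong example above.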
3. Not Found (404)
Frequency: Common (20% of cases)
Symptoms:
- HTTP 404 response
- Error: “Not found” or “Handler not found”
- Route not registered
- Service not discovered
Diagnostic Steps:
# Check service registration
curl http://localhost:3001/api/services | jq '.services[].name'

# Check API Gateway routes
docker logs api-gateway 2>&1 | grep "route\|endpoint"

# Check handler discovery
docker logs TARGET_SERVICE 2>&1 | grep "Handler discovery"

# Test correct endpoint format
# Commands/Mutations: POST /api/operation-name
# Queries: GET /api/operation-name?param=value

Common Issues:
A. Service Not Registered:
# Service not in service discovery
curl http://localhost:3001/api/services | jq '.services[] | select(.name=="my-service")'

# Returns empty? Service not started or not registered
docker ps | grep my-service
docker logs my-service 2>&1 | grep "registered"

B. Wrong Endpoint Path:
# ❌ WRONG paths:
POST /api/CreateUser          # Wrong case (PascalCase)
GET /api/users                # Wrong format (REST-style)
POST /commands/create-user    # Wrong prefix

# ✓ CORRECT paths:
POST /api/create-user           # kebab-case for commands
GET /api/get-user?userId=123    # kebab-case for queries

C. Handler Not Discovered:
# Check handler discovery count
docker logs my-service 2>&1 | grep "Handler discovery"

# Should show handlers found
# If totalHandlers: 0, see handlers-not-discovered.md

D. Wrong HTTP Method:
# ❌ WRONG
GET /api/create-user    # Commands need POST

# ✓ CORRECT
POST /api/create-user   # Commands use POST
GET /api/get-user       # Queries use GET

Solution:
- Ensure service is running and registered
- Verify handler discovered (see Handlers Not Discovered)
- Use correct endpoint path format
- Use correct HTTP method (POST for commands, GET for queries)
Prevention:
- Monitor service registration
- Use API client generator for type-safe calls
- Document endpoint conventions
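The path and method conventions above can be captured in a small helper so that client code never builds endpoint strings by hand. This is a sketch only: `toKebabCase` and `toEndpoint` are hypothetical names, and the assumption that operation names start out in PascalCase (for example `CreateUser`) is inferred from the wrong-path example rather than stated by the gateway.

```typescript
type OperationKind = "command" | "query";

// Converts a PascalCase or camelCase name such as "CreateUser" to "create-user".
function toKebabCase(operationName: string): string {
  return operationName.replace(/([a-z0-9])([A-Z])/g, "$1-$2").toLowerCase();
}

// Applies the conventions above: commands use POST, queries use GET, both under /api/.
function toEndpoint(
  kind: OperationKind,
  operationName: string,
): { method: "GET" | "POST"; path: string } {
  return {
    method: kind === "command" ? "POST" : "GET",
    path: "/api/" + toKebabCase(operationName),
  };
}

console.log(toEndpoint("command", "CreateUser"));
console.log(toEndpoint("query", "GetUser"));
```

Query parameters such as `?userId=123` are left to the caller, since the guide does not specify how they are encoded.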
4. Validation Errors (400 Bad Request)
Frequency: Common (10% of cases)
Symptoms:
- HTTP 400 response
- Error: “Validation failed” with field details
- Missing required fields
- Invalid field values
Diagnostic Steps:
# Check error response for validation details
curl -X POST http://localhost:3000/api/create-user \
  -H "Content-Type: application/json" \
  -d '{"email":"invalid"}' | jq

# Response shows which fields failed:
# {
#   "error": "Validation failed",
#   "validationErrors": [
#     {"field": "email", "message": "Invalid email format"},
#     {"field": "password", "message": "Required field missing"}
#   ]
# }

Common Issues:
A. Missing Required Fields:
// ❌ WRONG: Missing required fields
{
  "email": "user@example.com"
  // Missing: password, name, etc.
}

// ✓ CORRECT: All required fields
{
  "email": "user@example.com",
  "password": "secure-password",
  "name": "John Doe"
}

B. Invalid Field Format:
// ❌ WRONG: Invalid formats
{
  "email": "not-an-email",      // Invalid email
  "age": "twenty-five",         // Should be number
  "startDate": "2024-13-45"     // Invalid date
}

// ✓ CORRECT: Valid formats
{
  "email": "user@example.com",
  "age": 25,
  "startDate": "2024-01-15T00:00:00Z"
}

C. Type Mismatch:
// ❌ WRONG: String where number expected
{
  "userId": "123"  // Should be number or UUID depending on contract
}

// ✓ CORRECT: Proper types
{
  "userId": 123  // or "user-uuid-123" depending on schema
}

Solution:
- Review validation error details
- Ensure all required fields present
- Use correct field types and formats
- Validate input client-side before sending
Prevention:
- Use TypeScript types for API calls
- Generate client libraries from contracts
- Implement client-side validation
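To make the validation details easier to act on in client code, the 400 response can be given a type and turned into per-field messages. The sketch below assumes the response body has exactly the shape shown in the diagnostic example (`error` plus a `validationErrors` array); the type and function names are hypothetical.

```typescript
// Shape of the 400 response body shown in the diagnostic steps.
interface ValidationErrorItem {
  field: string;
  message: string;
}
interface ValidationErrorResponse {
  error: string;
  validationErrors: ValidationErrorItem[];
}

// Type guard: true when an unknown response body looks like a validation failure.
function isValidationErrorResponse(body: unknown): body is ValidationErrorResponse {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as { error?: unknown; validationErrors?: unknown };
  return typeof candidate.error === "string" && Array.isArray(candidate.validationErrors);
}

// Produces one line per failed field, such as "email: Invalid email format".
function formatValidationErrors(response: ValidationErrorResponse): string[] {
  return response.validationErrors.map((e) => e.field + ": " + e.message);
}

const sampleBody: unknown = {
  error: "Validation failed",
  validationErrors: [
    { field: "email", message: "Invalid email format" },
    { field: "password", message: "Required field missing" },
  ],
};
if (isValidationErrorResponse(sampleBody)) {
  console.log(formatValidationErrors(sampleBody).join("\n"));
}
```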
5. Internal Server Error (500)
Frequency: Occasional (5% of cases)
Symptoms:
- HTTP 500 response
- Error: “Internal Server Error”
- Correlation ID provided
- Handler execution failure
Diagnostic Steps:
# Get correlation ID from response
curl -i http://localhost:3000/api/endpoint

# Search logs with correlation ID
CORRELATION_ID="abc-123-def-456"
docker logs api-gateway 2>&1 | grep "$CORRELATION_ID"
docker logs target-service 2>&1 | grep "$CORRELATION_ID"

# Check Jaeger trace
# Open http://localhost:16686
# Search for correlation ID

Common Causes:
A. Handler Throws Unhandled Exception:
// Handler code with error
@CommandHandler(CreateUserCommand)
export class CreateUserHandler {
  async handle(command: CreateUserCommand) {
    // Throws error
    const result = await this.database.save(user);
    return result.data; // ← result might be null, throws TypeError
  }
}

B. Database Connection Lost:
# Check database connectivity
docker ps | grep postgres
docker logs postgres 2>&1 | tail -50

# Test connection from service
docker compose exec my-service nc -zv postgres 5432

C. Message Bus Failure:
# Check RabbitMQ
docker ps | grep rabbitmq
docker logs rabbitmq 2>&1 | tail -50

# Check message bus client connection
docker logs my-service 2>&1 | grep "RabbitMQ\|message bus"

Solution:
- Get correlation ID from error response
- Search service logs for correlation ID
- Find stack trace and root cause
- Fix handler logic or infrastructure issue
- Test fix with same request
Prevention:
- Add proper error handling in handlers
- Use correlation IDs for tracing
- Monitor service health and dependencies
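For the unhandled-exception cause described above, the fix is usually a guard before dereferencing the result. The sketch below is illustrative and self-contained: the `Database`, `SaveResult`, and `CreateUserCommand` types are hypothetical stand-ins, and the framework decorator is omitted so the class can run on its own.

```typescript
// Hypothetical minimal types standing in for the real command and database client.
interface CreateUserCommand {
  email: string;
}
interface SaveResult {
  data: { userId: string };
}
interface Database {
  save(user: { email: string }): Promise<SaveResult | null>;
}

class CreateUserHandler {
  private readonly database: Database;

  constructor(database: Database) {
    this.database = database;
  }

  async handle(command: CreateUserCommand): Promise<{ userId: string }> {
    const result = await this.database.save({ email: command.email });
    // Guard before dereferencing: a null result would otherwise surface as a TypeError.
    if (result === null) {
      throw new Error("Failed to save user: database returned no result");
    }
    return result.data;
  }
}
```

Throwing a descriptive error still yields a failed request, but the log entry tied to the correlation ID then names the actual cause instead of a generic TypeError.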
6. CORS Errors
Frequency: Occasional (3% of cases)
Symptoms:
- Browser console: “blocked by CORS policy”
- Preflight OPTIONS request fails
- Cross-origin request blocked
Diagnostic Steps:
# Check CORS headers in response
curl -i -X OPTIONS http://localhost:3000/api/endpoint \
  -H "Origin: http://localhost:5173" \
  -H "Access-Control-Request-Method: POST"

# Should see:
# Access-Control-Allow-Origin: *
# Access-Control-Allow-Methods: GET, POST, OPTIONS
# Access-Control-Allow-Headers: Content-Type, Authorization

Common Issues:
A. API Gateway Version Too Old:
Early versions had CORS bugs. Ensure version 1.0.115+:
# Check version
docker logs api-gateway 2>&1 | grep "version"

# Update if needed
docker pull ghcr.io/your-org/api-gateway:latest
docker compose up -d api-gateway

B. Custom Headers Not Allowed:
// ❌ WRONG: Custom header not in allowed list
fetch('http://localhost:3000/api/endpoint', {
  headers: {
    'X-Custom-Header': 'value' // Not allowed by default
  }
});

// ✓ CORRECT: Use allowed headers
fetch('http://localhost:3000/api/endpoint', {
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer token',
    'X-Dev-User-Id': 'test' // Dev headers allowed
  }
});

Solution:
- Update API Gateway to 1.0.115+
- Use standard headers
- Check CORS configuration in API Gateway
Prevention:
- Keep API Gateway updated
- Use standard HTTP headers when possible
- Test with browser dev tools
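When reading a preflight response by hand, it helps to compare the Access-Control-Allow-* values against what the browser is about to send. The following sketch does that comparison with deliberately simplified rules: it ignores credentialed requests and the browser's safelisted methods and headers, so treat its output as a hint rather than a reproduction of browser behavior. `PreflightResponse` and `explainPreflightFailure` are hypothetical names.

```typescript
// Header values taken from a preflight (OPTIONS) response.
interface PreflightResponse {
  allowOrigin?: string; // Access-Control-Allow-Origin
  allowMethods?: string; // Access-Control-Allow-Methods
  allowHeaders?: string; // Access-Control-Allow-Headers
}

// Splits a comma-separated header value into trimmed, lowercase tokens.
function splitList(value: string | undefined): string[] {
  if (!value) return [];
  return value
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .filter((token) => token.length > 0);
}

// Returns the reasons the preflight would fail under this sketch's simplified rules.
function explainPreflightFailure(
  response: PreflightResponse,
  requestOrigin: string,
  requestMethod: string,
  requestHeaders: string[],
): string[] {
  const reasons: string[] = [];
  if (response.allowOrigin !== "*" && response.allowOrigin !== requestOrigin) {
    reasons.push("origin " + requestOrigin + " is not allowed");
  }
  const methods = splitList(response.allowMethods);
  if (methods.indexOf("*") === -1 && methods.indexOf(requestMethod.toLowerCase()) === -1) {
    reasons.push("method " + requestMethod + " is not allowed");
  }
  const allowed = splitList(response.allowHeaders);
  for (const header of requestHeaders) {
    if (allowed.indexOf("*") === -1 && allowed.indexOf(header.toLowerCase()) === -1) {
      reasons.push("header " + header + " is not allowed");
    }
  }
  return reasons;
}

console.log(
  explainPreflightFailure(
    { allowOrigin: "*", allowMethods: "GET, POST, OPTIONS", allowHeaders: "Content-Type, Authorization" },
    "http://localhost:5173",
    "POST",
    ["Content-Type", "X-Custom-Header"],
  ),
);
```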
7. Request Timeout
Frequency: Rare (2% of cases)
Symptoms:
- Request never completes
- Gateway timeout (504)
- Client timeout error
Diagnostic Steps:
# Check if request reached service
CORRELATION_ID="from-response"
docker logs target-service 2>&1 | grep "$CORRELATION_ID"

# Check Jaeger for slow spans
# Open http://localhost:16686
# Find trace and identify slow operations

# Check service health
curl http://localhost:3000/health

Common Causes:
A. Handler Too Slow:
Handler takes longer than timeout (default 30s):
// Slow handler
@CommandHandler(ProcessLargeFileCommand)
export class ProcessLargeFileHandler {
  async handle(command: ProcessLargeFileCommand) {
    // Takes 60+ seconds
    await this.processFile(command.fileId);
  }
}

B. Database Query Slow:
-- Check slow queries (requires the pg_stat_statements extension)
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

C. External API Delay:
// Waiting on external service
await externalApi.call(); // Takes 45 seconds

Solution:
- Identify slow operation in Jaeger trace
- Optimize slow database queries (add indexes)
- Use async processing for long operations
- Increase the timeout if the operation is legitimately slow
- Consider saga pattern for multi-step operations
Prevention:
- Monitor request duration
- Set query timeouts
- Use background jobs for long operations
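One way to keep a slow dependency from consuming the whole request budget is to bound each call with its own deadline. The helper below is a generic sketch, not something the gateway provides: `withTimeout` is a hypothetical name, and it only stops waiting for the operation; it does not cancel the underlying work.

```typescript
// Rejects if `operation` has not settled within `timeoutMs` milliseconds.
function withTimeout<T>(operation: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(label + " timed out after " + timeoutMs + "ms"));
    }, timeoutMs);
    operation.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Example: give a slow call (simulated here with a 50 ms delay) a 10 ms budget.
const slowCall = new Promise<string>((resolve) => setTimeout(() => resolve("done"), 50));
withTimeout(slowCall, 10, "external API call").catch((error: Error) => {
  console.log(error.message);
});
```

A budget well below the gateway timeout (30s by default, per the note above) leaves room for the handler to return a meaningful error instead of a 504.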
Debugging Workflow
For any API failure, follow this workflow:
1. Check HTTP Status Code
- 401 → Authentication issue (see Authentication Errors)
- 403 → Permission issue (check required permissions)
- 404 → Route/handler not found (verify service registered)
- 400 → Validation error (check request payload)
- 500 → Internal error (use correlation ID to trace)
- 504 → Timeout (check for slow operations)
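The status-code triage above can be expressed as a lookup that client code or a debugging script calls on a failed response. This is a minimal sketch; `triageStatus` is a hypothetical name and the returned strings simply restate the list.

```typescript
// Maps an HTTP status code to the first thing to check, following the list above.
function triageStatus(statusCode: number): string {
  switch (statusCode) {
    case 401:
      return "Authentication issue: see Authentication Errors";
    case 403:
      return "Permission issue: check required permissions";
    case 404:
      return "Route/handler not found: verify service registered";
    case 400:
      return "Validation error: check request payload";
    case 500:
      return "Internal error: use correlation ID to trace";
    case 504:
      return "Timeout: check for slow operations";
    default:
      return "Not covered by this workflow: inspect the response body and logs";
  }
}

console.log(triageStatus(403));
```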
2. Get Correlation ID
# From response header (header names are case-insensitive, hence grep -i)
curl -i http://localhost:3000/api/endpoint | grep -i X-Correlation-Id

# From response body
curl http://localhost:3000/api/endpoint | jq '.correlationId'

3. Search Logs
CORRELATION_ID="abc-123"

# API Gateway logs
docker logs api-gateway 2>&1 | grep "$CORRELATION_ID"

# Target service logs
docker logs my-service 2>&1 | grep "$CORRELATION_ID"

4. Check Jaeger Trace
1. Open http://localhost:16686
2. Service: Select target service
3. Tags: correlation.id="abc-123"
4. Find Traces
5. Click trace to see timeline
6. Identify error span

5. Fix and Verify
# After fix, test with same request
curl -X POST http://localhost:3000/api/endpoint \
  -H "Content-Type: application/json" \
  -d '{"test":"data"}'

# Should succeed with 200 OK

Common Error Messages
“Cannot read property ‘X’ of undefined”
Cause: Handler logic error - accessing undefined object
Solution: Add null checks in handler code
“Connection refused”
Cause: API Gateway not running or wrong port
Solution: Check gateway status and port mapping
“Invalid token signature”
Cause: JWT signed with a different secret than the gateway expects
Solution: Ensure JWT_SECRET matches between auth service and gateway
Verification Steps
After fixing the issue:
1. Request Succeeds
# Test endpoint
curl -X POST http://localhost:3000/api/create-user \
  -H "Content-Type: application/json" \
  -H "X-Dev-User-Id: test" \
  -H "X-Dev-Permissions: users:create" \
  -d '{"email":"test@example.com","password":"password123"}'

# Should return 200 OK with result

2. Check Response
// Success response
{
  "userId": "user-123",
  "email": "test@example.com"
}

// With correlation ID in headers
X-Correlation-Id: abc-123-def-456

3. Verify in Jaeger
All spans complete without errors
Related Documentation
- API Gateway Issues - Gateway-specific troubleshooting
- Authentication Errors - Detailed auth troubleshooting
- Correlation ID Tracking - Using correlation IDs
- Jaeger Tracing - Distributed tracing
- Error Catalog - Specific error codes
Summary
API call failures usually fall into these categories:
- Authentication (401) - Enable dev mode or fix JWT
- Authorization (403) - Grant required permissions
- Not Found (404) - Verify service registered and handlers discovered
- Validation (400) - Fix request payload
- Internal Error (500) - Use correlation ID to trace root cause
Always capture the correlation ID and use it to trace the request through logs and Jaeger.
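Capturing the correlation ID in client code can be sketched as a lookup that checks the response header first and the body second. The header name `X-Correlation-Id` and the body field `correlationId` come from the examples in this guide; the function name `getCorrelationId` and the plain-object representation of headers are assumptions made for illustration.

```typescript
// Returns the correlation ID from the headers if present, otherwise from the body.
function getCorrelationId(
  headers: Record<string, string>,
  body: unknown,
): string | undefined {
  // HTTP header names are case-insensitive, so compare in lowercase.
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === "x-correlation-id") return headers[key];
  }
  if (typeof body === "object" && body !== null) {
    const value = (body as { correlationId?: unknown }).correlationId;
    if (typeof value === "string") return value;
  }
  return undefined;
}

console.log(getCorrelationId({ "x-correlation-id": "abc-123-def-456" }, null));
```

Logging this value alongside every failed request makes it possible to go straight to the log search and Jaeger steps described in the debugging workflow.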