## Overview

Implement a comprehensive environment variable and configuration management system for all services across development, staging, and production environments following vendor-neutral patterns with GitHub Actions.

## Epic Goal

Design and implement a centralized configuration management strategy that:

- Provides type-safe configuration schemas
- Secures sensitive credentials in GCP Secret Manager
- Enables environment-specific configurations
- Validates configuration at startup
- Supports local development workflows
- Documents configuration requirements

## Business Value

- **Security**: Centralized secret management with encryption at rest
- **Reliability**: Type-safe configuration prevents runtime errors
- **Developer Experience**: Clear documentation and .env templates
- **Maintainability**: Single source of truth for configuration
- **Compliance**: Audit trail for configuration changes

## Components

### 1. Secret Manager Infrastructure (INFRA)

- GCP Secret Manager setup via Terraform
- Secret creation for DATABASE_URL, API keys, OAuth credentials
- IAM permissions for service accounts
- Secret rotation policies

### 2. Configuration Schema (INFRA)

- Zod schemas for type-safe validation
- Shared configuration (NODE_ENV, DATABASE_URL, REDIS_URL, NATS_URL)
- Service-specific configuration schemas
- Environment variable naming conventions

### 3. Configuration Loading (INFRA)

- Runtime configuration validation
- Environment-specific overrides
- Graceful error handling for missing config
- Configuration drift detection

### 4. Deployment Scripts (INFRA)

- GitHub Actions workflows for secret deployment
- Environment promotion scripts (dev → staging → production)
- Configuration validation in CI/CD
- Rollback procedures

### 5. Documentation (DOCS)

- Configuration management guide
- Environment variable reference
- Local development setup (.env.example files)
- Troubleshooting guide

## Required Environment Variables

### Shared Configuration (All Services)

- `NODE_ENV`: Environment name (production, staging, development)
- `LOG_LEVEL`: Logging level (debug, info, warn, error)
- `DATABASE_URL`: PostgreSQL connection string (Secret Manager)
- `REDIS_URL`: Redis connection string (Secret Manager)
- `NATS_URL`: NATS connection string (Secret Manager)
- `SENTRY_DSN`: Error tracking endpoint (optional)
- `OTEL_EXPORTER_OTLP_ENDPOINT`: Telemetry endpoint (optional)

### API Gateway Specific

- `JWT_SECRET`: JWT signing secret (Secret Manager)
- `JWT_EXPIRATION`: Token expiration (24h)
- `CORS_ORIGINS`: Allowed origins (comma-separated)
- `RATE_LIMIT_MAX`: Max requests per window (100)
- `RATE_LIMIT_WINDOW_MS`: Rate limit window (900000)
- `API_VERSION`: API version (v1)

### Agent Service Specific

- `OPENAI_API_KEY`: OpenAI API key (Secret Manager)
- `ANTHROPIC_API_KEY`: Anthropic API key (Secret Manager)
- `GOOGLE_AI_API_KEY`: Google AI API key (Secret Manager)
- `MAX_CONCURRENT_AGENTS`: Max concurrent agent sessions (10)
- `AGENT_TIMEOUT_MS`: Agent timeout (300000)

### Sync Service Specific

- `SYNC_INTERVAL_MS`: Synchronization interval (60000)
- `BATCH_SIZE`: Batch processing size (100)
- `MAX_RETRIES`: Max retry attempts (3)

### Integration Service Specific

- `GOOGLE_OAUTH_CLIENT_ID`: Google OAuth client ID (Secret Manager)
- `GOOGLE_OAUTH_CLIENT_SECRET`: Google OAuth secret (Secret Manager)
- `MS365_CLIENT_ID`: Microsoft 365 client ID (Secret Manager)
- `MS365_CLIENT_SECRET`: Microsoft 365 secret (Secret Manager)
- `WEBHOOK_SECRET`: Webhook validation secret (Secret Manager)

## Child Tasks

This epic will be broken down into child tasks following the TDD workflow:

### Infrastructure Phase

- [ ] [INFRA-SPEC] Secret Manager - Specification
- [ ] [INFRA-TEST] Secret Manager - Test Suite
- [ ] [INFRA-IMPL] Secret Manager - Implementation

### Schema Phase

- [ ] [INFRA-SPEC] Configuration Schema - Specification
- [ ] [INFRA-TEST] Configuration Schema - Test Suite
- [ ] [INFRA-IMPL] Configuration Schema - Implementation

### Deployment Phase

- [ ] [INFRA-SPEC] Configuration Deployment - Specification
- [ ] [INFRA-TEST] Configuration Deployment - Test Suite
- [ ] [INFRA-IMPL] Configuration Deployment - Implementation

### Documentation Phase

- [ ] [DOCS] Configuration Management - Documentation

## Success Criteria

- ✅ All secrets stored in GCP Secret Manager with encryption
- ✅ Type-safe configuration schemas implemented with Zod
- ✅ Configuration validation runs at service startup
- ✅ Environment-specific configurations for dev/staging/production
- ✅ .env.example files for all services
- ✅ GitHub Actions workflows for secret deployment
- ✅ Configuration documentation complete
- ✅ All tests passing (unit, integration, infrastructure)
- ✅ Zero configuration-related runtime errors in staging

## Dependencies

- Requires: Issue #77 (Secret Manager infrastructure)
- Requires: Issue #79 (Cloud SQL for DATABASE_URL)
- Requires: Issue #80 (Redis for REDIS_URL)
- Requires: Issue #81 (NATS for NATS_URL)
- Blocks: All service implementations (need config to run)

## Estimated Effort

- Total: 12-15 hours
- SPEC: 2-3 hours
- TEST: 4-5 hours
- IMPL: 5-6 hours
- DOCS: 1-2 hours

## References

- Twelve-Factor App Config: https://12factor.net/config
- Zod Documentation: https://zod.dev
- GCP Secret Manager: https://cloud.google.com/secret-manager/docs

## Notes

- Using vendor-neutral approach with GitHub Actions (not Cloud Run)
- All secrets must be in Secret Manager (never in code/env files)
- Configuration changes require PR review
- Breaking config changes need migration guide

## Activity Checklist

### INFRA

- [ ] INFRA-SPEC: #188
- [ ] INFRA-SPEC: #185
- [ ] INFRA-SPEC: #182
- [ ] INFRA-TEST: #189
- [ ] INFRA-TEST: #186
- [ ] INFRA-TEST: #183
- [ ] INFRA-IMPL: #190
- [ ] INFRA-IMPL: #187
- [ ] INFRA-IMPL: #184
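
To illustrate the startup validation described under Configuration Loading, here is a minimal dependency-free sketch for a subset of the API Gateway variables. It is not the planned implementation: the epic calls for Zod schemas, and this hand-written check only stands in for one so the example runs without installing anything. The names `parseGatewayConfig`, `GatewayConfig`, and `Result` are hypothetical, and treating the parenthesized values in the variable list (24h, 100, 900000) as defaults, and `CORS_ORIGINS` as optional, are assumptions.

```typescript
// Hypothetical sketch: hand-written validation standing in for a Zod schema.
// All type and function names here are illustrative, not from the project.

const NODE_ENVS = ["development", "staging", "production"] as const;
type NodeEnv = (typeof NODE_ENVS)[number];

interface GatewayConfig {
  nodeEnv: NodeEnv;
  databaseUrl: string;
  jwtSecret: string;
  jwtExpiration: string;
  corsOrigins: string[];
  rateLimitMax: number;
  rateLimitWindowMs: number;
}

type Env = Record<string, string | undefined>;

// Mirrors the shape of a "safe parse": either a config or a list of problems.
type Result =
  | { ok: true; config: GatewayConfig }
  | { ok: false; problems: string[] };

function parseGatewayConfig(env: Env): Result {
  const problems: string[] = [];

  // Required string: records a problem and returns "" when missing or blank.
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      problems.push(`${name} is required`);
      return "";
    }
    return value;
  };

  // Positive integer; the fallback is used only when the variable is unset.
  const positiveInt = (name: string, fallback: number): number => {
    const raw = env[name];
    if (raw === undefined || raw.trim() === "") return fallback;
    const parsed = Number(raw);
    if (!Number.isInteger(parsed) || parsed <= 0) {
      problems.push(`${name} must be a positive integer, got "${raw}"`);
      return fallback;
    }
    return parsed;
  };

  const rawNodeEnv = required("NODE_ENV");
  const nodeEnv = NODE_ENVS.find((candidate) => candidate === rawNodeEnv);
  if (rawNodeEnv !== "" && nodeEnv === undefined) {
    problems.push(`NODE_ENV must be one of ${NODE_ENVS.join(", ")}`);
  }

  const config: GatewayConfig = {
    nodeEnv: nodeEnv ?? "development", // placeholder; discarded if problems exist
    databaseUrl: required("DATABASE_URL"),
    jwtSecret: required("JWT_SECRET"),
    jwtExpiration: env["JWT_EXPIRATION"] ?? "24h", // assumed default
    // Comma-separated list; blank entries are dropped.
    corsOrigins: (env["CORS_ORIGINS"] ?? "")
      .split(",")
      .map((origin) => origin.trim())
      .filter((origin) => origin !== ""),
    rateLimitMax: positiveInt("RATE_LIMIT_MAX", 100), // assumed default
    rateLimitWindowMs: positiveInt("RATE_LIMIT_WINDOW_MS", 900000), // assumed default
  };

  // Report every problem at once instead of failing on the first one.
  return problems.length > 0 ? { ok: false, problems } : { ok: true, config };
}
```

A service entry point could pass its environment to `parseGatewayConfig` and exit with a non-zero status, printing `problems`, when `ok` is false. Collecting all problems before failing is a design choice that lets a misconfigured deployment be fixed in one pass.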