This guide walks through deploying CarePath to AWS EKS with MongoDB Atlas.
- Prerequisites
- Environment Setup
- Configuration
- Step-by-Step Deployment
- Finding Service URLs
- Frontend Deployment
- Verification
- Operations & Contingencies
- Troubleshooting
- AWS CLI configured with SSO profile
- Terraform >= 1.0
- Docker
- kubectl
- Node.js 18+ and npm (for frontend)
- MongoDB Atlas account with organization ID
Copy .env.example to .env and configure:
cp .env.example .env

Key variables for deployment:
# AWS Configuration
DEPLOY_AWS_REGION=us-east-1
DEPLOY_AWS_PROFILE=your-sso-profile
DEPLOY_AWS_ACCOUNT_ID=123456789012
# Terraform State (can reuse existing backend from other projects)
DEPLOY_TF_STATE_BUCKET_NAME=your-terraform-state-bucket
DEPLOY_TF_DYNAMO_DB_TABLE=your-terraform-state-locks
# MongoDB (your existing cluster connection string)
MONGODB_URI=mongodb+srv://user:password@your-cluster.mongodb.net/?retryWrites=true&w=majority
MONGODB_DB_NAME=carepath

Create the backend configuration file:
cp infra/terraform/envs/demo/backend.hcl.example infra/terraform/envs/demo/backend.hcl

Edit backend.hcl with your values:
bucket = "your-terraform-state-bucket" # e.g., genonaut-terraform-state
key = "carepath/demo/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "your-terraform-state-locks" # e.g., genonaut-terraform-state-locks
encrypt = true

Note: You can safely reuse an existing Terraform state backend from other projects. Each project uses a unique key path, so state files don't interfere with each other.

Create the variables file:
cp infra/terraform/envs/demo/terraform.tfvars.example infra/terraform/envs/demo/terraform.tfvars

Edit terraform.tfvars with your MongoDB URI:
environment = "demo"
aws_region = "us-east-1"
# Use your existing MongoDB cluster
mongodb_uri = "mongodb+srv://user:password@your-cluster.mongodb.net/?retryWrites=true&w=majority"
mongodb_database_name = "carepath"

Note: If you want Terraform to create a new MongoDB Atlas cluster instead, set create_mongodb_atlas = true and provide the Atlas API keys. See terraform.tfvars.example for details.

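Before moving on, it can help to confirm that the variables and files described above are actually in place. The following is a minimal sketch and not part of the CarePath Makefile: the function names `missing_vars` and `missing_files` are hypothetical, while the variable and file names passed to them would be the ones listed in this section.

```bash
# Hypothetical pre-flight helpers; not part of the CarePath repository.

# Print the name of every listed variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Print every listed path that does not exist as a regular file.
missing_files() {
  for path in "$@"; do
    [ -f "$path" ] || echo "$path"
  done
}
```

With the variables loaded into the current shell, `missing_vars DEPLOY_AWS_REGION DEPLOY_AWS_PROFILE DEPLOY_AWS_ACCOUNT_ID` prints nothing when all three are set, and `missing_files infra/terraform/envs/demo/backend.hcl infra/terraform/envs/demo/terraform.tfvars` prints nothing when both files exist.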
EKS worker nodes can run as either On-Demand or Spot instances. This setting affects AWS quota usage and cost.
| Type | Description | When to Use |
|---|---|---|
| `ON_DEMAND` | Standard EC2 instances | Production, stable workloads |
| `SPOT` | Discounted instances (can be interrupted) | Development, demos, cost savings |
Why SPOT exists: New AWS accounts have restrictive quotas on On-Demand EC2 "Fleet Requests." If you hit quota limits during deployment, switching to SPOT instances draws on a separate quota pool and often resolves the issue. SPOT instances are also substantially cheaper (AWS advertises discounts of up to 90% off On-Demand prices), at the cost of possible interruption.
Check current setting:
make ec2-config-status

Switch to SPOT instances:
make ec2-config-set-as-spot

Switch to On-Demand (standard) instances:
make ec2-config-set-as-nodes

Note: After changing this setting, you must delete any existing or failed node group and run `make tf-apply` for the change to take effect. See Troubleshooting for details.

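Once the cluster is running, one way to confirm which capacity type the nodes actually use is to read the `eks.amazonaws.com/capacityType` label that EKS managed node groups apply to their nodes. A minimal sketch, assuming kubectl is already configured for the cluster (see `make k8s-config`); the function name is hypothetical:

```bash
# Hypothetical helper: count nodes per capacity type (ON_DEMAND or SPOT).
# Relies on the eks.amazonaws.com/capacityType label set by EKS managed node groups.
node_capacity_summary() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.labels.eks\.amazonaws\.com/capacityType}{"\n"}{end}' \
    | sort | uniq -c
}
```

Running `node_capacity_summary` prints one line per capacity type with the number of nodes in front. For a quick visual check, `kubectl get nodes -L eks.amazonaws.com/capacityType` shows the same label as an extra column.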
By default in the demo environment, the db-api is exposed externally via a LoadBalancer so you can query it directly (useful for demos and frontend development). In production, you'd typically keep it internal-only.
| Setting | Service Type | Access |
|---|---|---|
| `expose_db_api = true` | LoadBalancer | Publicly accessible via ELB URL |
| `expose_db_api = false` | ClusterIP | Internal only (within cluster) |
Check current setting:
kubectl get svc db-api-service -n carepath-demo -o jsonpath='{.spec.type}'

To change: Edit infra/terraform/envs/demo/variables.tf:
variable "expose_db_api" {
  ...
  default = true # true = external, false = internal only
}

Then run `make tf-apply`.

Security Note: When `expose_db_api = true`, anyone who can reach the ELB URL can query your database API. This is fine for demos, but for production you should do one of the following:
- Keep it internal (`expose_db_api = false`)
- Or add authentication/API keys
- Or use an API Gateway with auth

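The service-type check shown earlier can be turned into a small guard that states plainly how db-api is exposed. This is a sketch under the assumption that the service is named `db-api-service` in the `carepath-demo` namespace, as in the kubectl command above; the function name is hypothetical.

```bash
# Hypothetical helper: report how db-api is exposed, based on its Service type.
db_api_exposure() {
  svc_type=$(kubectl get svc db-api-service -n carepath-demo \
    -o jsonpath='{.spec.type}') || return 1
  case "$svc_type" in
    LoadBalancer) echo "db-api is public (LoadBalancer): fine for demos, not for production" ;;
    ClusterIP)    echo "db-api is internal only (ClusterIP)" ;;
    *)            echo "db-api has unexpected service type: $svc_type"; return 1 ;;
  esac
}
```

A non-zero return value signals either a kubectl failure or a service type other than the two described in the table.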
Infrastructure can be deployed to different AWS regions. This is useful if you hit quota limits in one region but have approved quotas in another.
| Region | Location | Notes |
|---|---|---|
| `us-east-1` | N. Virginia | Default, most services available |
| `us-east-2` | Ohio | Alternative if us-east-1 quotas are exhausted |
Check current region:
make region-status

Switch to us-east-2 (Ohio):
make region-set-us-east-2

Switch to us-east-1 (N. Virginia):
make region-set-us-east-1

Important: Changing regions is a destructive operation. You must:
- Destroy existing infrastructure first: `make tf-destroy`
- Run the region switch command (it will prompt for confirmation)
- Deploy to the new region: `make tf-apply`

Note: The Terraform state bucket (in backend.hcl) stays in us-east-1 regardless of where you deploy. S3 buckets can be reached from any region, and the backend reads its own region from backend.hcl, so you don't need to change it or run bootstrap commands again.
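The three-step region change can be wrapped so that a failed destroy never leads to an apply in the new region. A minimal sketch using only the Makefile targets named in this guide; the wrapper function `switch_region` itself is hypothetical and not part of the repository:

```bash
# Hypothetical wrapper around the Makefile targets named in this guide.
# Usage: switch_region us-east-2
switch_region() {
  target=$1
  case "$target" in
    us-east-1|us-east-2) ;;
    *) echo "unsupported region: $target" >&2; return 2 ;;
  esac
  # Stop at the first failing step so a failed destroy never leads to an apply.
  make tf-destroy &&
    make "region-set-$target" &&
    make tf-apply
}
```

The region switch target still prompts for confirmation, so the wrapper is meant for interactive use.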
make aws-login

This authenticates via AWS SSO and verifies your credentials.
make tf-init

This initializes Terraform with your backend configuration.
make tf-plan

Review the planned changes carefully. This will show you what resources will be created:
- VPC with public/private subnets
- EKS cluster with managed node group
- ECR repositories for Docker images
- MongoDB Atlas cluster (only when create_mongodb_atlas = true; otherwise your existing cluster is used)
- Kubernetes namespace, deployments, and services
make tf-apply

This creates all the infrastructure and may take 15-20 minutes (EKS cluster creation is slow).
Note: If you encounter errors during this step, see the Troubleshooting section for common issues and solutions.
After Terraform completes, configure kubectl to connect to your EKS cluster:
make k8s-config

# Build both images
make docker-build-db-api
make docker-build-chat
# Push both to ECR
make docker-push-db-api
make docker-push-chat

# Deploy both services
make deploy-all

Or deploy individually:
make deploy-db-api
make deploy-chat

After deployment, get the service URLs:
make k8s-get-urls

This shows:
- Chat API URL: External LoadBalancer URL (publicly accessible)
- DB API URL: External LoadBalancer URL if `expose_db_api = true` (default for demo), otherwise internal ClusterIP

Note: LoadBalancer URLs may take 2-3 minutes to provision after deployment.
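Rather than re-running the URL command by hand, the wait can be scripted by polling the Service until a hostname appears. A minimal sketch, assuming the chat service is named `chat-api-service` in the `carepath-demo` namespace (as in the troubleshooting commands) and is served over plain HTTP; the function name and default timings are hypothetical:

```bash
# Hypothetical helper: poll until the chat-api Service has a LoadBalancer hostname.
# Usage: wait_for_lb [attempts] [seconds-between-attempts]
wait_for_lb() {
  attempts=${1:-18}
  delay=${2:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    host=$(kubectl get svc chat-api-service -n carepath-demo \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null)
    if [ -n "$host" ]; then
      echo "http://$host"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "LoadBalancer hostname not available after $attempts attempts" >&2
  return 1
}
```

The defaults (18 attempts, 10 seconds apart) cover roughly the three minutes mentioned above. A hostname being assigned does not guarantee the load balancer already passes health checks, so the first requests may still fail briefly.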
You can also check all resources:
make k8s-status

The CarePath frontend is a React app hosted on AWS S3 with CloudFront CDN. This provides a simple web UI for chatting with the CarePath AI and viewing chat history.
- Node.js 18+ and npm
- Backend APIs already deployed (db-api and chat-api)
- Terraform frontend infrastructure created (`make tf-apply`)

NEW: As of 2024-11-24, the frontend build now automatically fetches API URLs from Terraform outputs. You no longer need to manually configure .env for deployments!
The .env file in frontend_chat/ is now only used for local development. For production deployments, make frontend-deploy automatically:
- Fetches current load balancer URLs from Terraform
- Builds the frontend with those URLs
- Deploys to S3
If you want to override this behavior for local development, create a .env file:
cd frontend_chat
cp .env.example .env

Edit .env with your API URLs:
VITE_DB_API_URL=http://your-db-api-loadbalancer.elb.amazonaws.com
VITE_CHAT_API_URL=http://your-chat-api-loadbalancer.elb.amazonaws.com

# Install dependencies (first time only)
make frontend-install
# Build and deploy to S3/CloudFront
# This automatically uses current Terraform outputs for API URLs
make frontend-deploy

This will:
- Fetch current API load balancer URLs from Terraform outputs
- Build the React app for production with those URLs
- Sync the build output to S3
- Invalidate the CloudFront cache (if CloudFront enabled)
- Print the frontend URL
Note: When you run make deploy-chat, the frontend is automatically redeployed with updated API URLs. This ensures the frontend always points to the correct backend services.
To run the frontend locally:
make frontend-dev

The dev server starts at http://localhost:5173 with hot reload.

| Command | Description |
|---|---|
| `make frontend-install` | Install npm dependencies |
| `make frontend-dev` | Run local dev server |
| `make frontend-build` | Build for production |
| `make frontend-deploy` | Build and deploy to S3/CloudFront |
| `make frontend-invalidate-cache` | Invalidate CloudFront cache |

The following Terraform variables control frontend deployment:
| Variable | Default | Description |
|---|---|---|
| `expose_frontend` | `true` | Whether to create S3/CloudFront resources |
| `frontend_bucket_name` | `carepath-demo-frontend` | S3 bucket name for static files |

make k8s-status

All pods should show Running status with 1/1 ready.
# View db-api logs
make k8s-logs s=db-api
# View chat-api logs
make k8s-logs s=chat-api

Get the Chat API URL first:
make k8s-get-urls

Then test:
# Replace with your actual LoadBalancer URL
CHAT_URL="http://your-loadbalancer-url.elb.amazonaws.com"
# Test chat-api health
curl "$CHAT_URL/health"
# Test db-api health indirectly (works even when db-api is internal-only)
# The chat-api calls db-api internally, so if triage works, both are healthy
curl -X POST "$CHAT_URL/triage" \
  -H "Content-Type: application/json" \
  -d '{"patient_mrn": "P000123", "query": "What are my current medications?"}'

The expected response includes fields such as response, trace_id, and conversation_id.

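The two checks above can be combined into a single pass/fail smoke test. This is a sketch, not part of the CarePath Makefile: the function name `smoke_test` is hypothetical, and the only response field it inspects is `trace_id`, one of the fields the guide says to expect.

```bash
# Hypothetical smoke test for the two checks above.
# Usage: smoke_test http://your-loadbalancer-url.elb.amazonaws.com
smoke_test() {
  base_url=$1
  # -f makes curl exit non-zero on HTTP errors; -sS hides progress but keeps error messages.
  curl -fsS "$base_url/health" >/dev/null || { echo "health check failed" >&2; return 1; }
  body=$(curl -fsS -X POST "$base_url/triage" \
    -H "Content-Type: application/json" \
    -d '{"patient_mrn": "P000123", "query": "What are my current medications?"}') || {
    echo "triage request failed" >&2
    return 1
  }
  # Crude check that the documented trace_id field appears in the JSON body.
  case "$body" in
    *'"trace_id"'*) echo "ok" ;;
    *) echo "triage response has no trace_id" >&2; return 1 ;;
  esac
}
```

The string match avoids a dependency on a JSON parser; if `jq` is available, a stricter check on the response structure would be a reasonable replacement.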
The deployments have an HPA (Horizontal Pod Autoscaler) configured, but you can also scale manually. Keep in mind that the HPA may adjust the replica count again afterwards.
Scale Up:
# Scale db-api to 3 replicas
make k8s-scale-up s=db-api r=3
# Scale chat-api to 3 replicas
make k8s-scale-up s=chat-api r=3

Scale Down:
# Scale db-api to 1 replica
make k8s-scale-down s=db-api r=1
# Scale chat-api to 1 replica
make k8s-scale-down s=chat-api r=1

Check current scale:
make k8s-status

When you make code changes and want to deploy:
Option A: Deploy a single service
# This builds, pushes, and deploys in one command
make deploy-db-api # or
make deploy-chat

Option B: Deploy all services
make deploy-all

Option C: Just restart pods (same image)
make k8s-restart-db-api
make k8s-restart-chat

View rollout history:
make k8s-history

For more deployment strategies (canary, blue-green, etc.), see Rollout Options.

If something goes wrong after a deployment, roll back to the previous version:
Rollback a single service:
# Rollback db-api
make k8s-rollback-db-api
# Rollback chat-api
make k8s-rollback-chat

Rollback all services:
make k8s-rollback-all

View rollout history (to see available revisions):
make k8s-history

Rollback to a specific revision (manual):
kubectl rollout undo deployment/chat-api -n carepath-demo --to-revision=2

If pods are crashing:
- Check logs: `make k8s-logs s=chat-api`
- Check events: `kubectl describe pod <pod-name> -n carepath-demo`
- Rollback: `make k8s-rollback-chat`

If LoadBalancer is unreachable:
- Check service: `kubectl get svc -n carepath-demo`
- Check pod readiness: `make k8s-pods`
- Restart pods: `make k8s-restart-chat`

If you need to completely redeploy:
# Delete and recreate the deployment
kubectl delete deployment chat-api -n carepath-demo
make deploy-chat

| Command | Description |
|---|---|
| `make k8s-config` | Configure kubectl for EKS |
| `make k8s-status` | Show all deployments, pods, services, HPA |
| `make k8s-get-urls` | Get service URLs |
| `make k8s-pods` | List all pods with details |
| `make k8s-logs s=SERVICE` | Stream logs (s=db-api or chat-api) |
| `make k8s-scale-up s=SERVICE r=N` | Scale up to N replicas |
| `make k8s-scale-down s=SERVICE r=N` | Scale down to N replicas |
| `make k8s-rollback-db-api` | Rollback db-api |
| `make k8s-rollback-chat` | Rollback chat-api |
| `make k8s-rollback-all` | Rollback both services |
| `make k8s-restart-db-api` | Rolling restart db-api |
| `make k8s-restart-chat` | Rolling restart chat-api |
| `make k8s-history` | View rollout history |
| `make deploy-db-api` | Build, push, deploy db-api |
| `make deploy-chat` | Build, push, deploy chat-api |
| `make deploy-all` | Deploy all services |
| `make ec2-config-status` | Show current EC2 capacity type |
| `make ec2-config-set-as-nodes` | Use On-Demand EC2 instances |
| `make ec2-config-set-as-spot` | Use SPOT instances (different quota) |
| `make region-status` | Show current AWS region |
| `make region-set-us-east-1` | Switch to us-east-1 (N. Virginia) |
| `make region-set-us-east-2` | Switch to us-east-2 (Ohio) |
| `make frontend-install` | Install frontend dependencies |
| `make frontend-dev` | Run frontend dev server |
| `make frontend-build` | Build frontend for production |
| `make frontend-deploy` | Deploy frontend to S3/CloudFront |
| `make frontend-invalidate-cache` | Invalidate CloudFront cache |

Error message:
Error: waiting for EKS Node Group create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'.
last error: AsgInstanceLaunchFailures: You've reached your quota for maximum Fleet Requests for this account.
Cause: Your AWS account has a quota limit on EC2 instances. New accounts often have low default limits.
Solution:
- Delete the failed node group (it's stuck in a failed state and blocks new attempts):

      aws eks delete-nodegroup \
        --cluster-name carepath-demo-cluster \
        --nodegroup-name carepath-demo-cluster-node-group \
        --profile "$DEPLOY_AWS_PROFILE" \
        --region "$DEPLOY_AWS_REGION"

- Wait for deletion (~2 minutes). Check status:

      aws eks list-nodegroups \
        --cluster-name carepath-demo-cluster \
        --profile "$DEPLOY_AWS_PROFILE" \
        --region "$DEPLOY_AWS_REGION"

  When it returns `{"nodegroups": []}`, proceed.

- Choose one of these options to resolve:

  Option A: Switch to SPOT instances (recommended - quick fix)

  SPOT instances use a different AWS quota pool:

      make ec2-config-set-as-spot

  Option B: Reduce node count (if you want to stay on On-Demand)

  Edit `infra/terraform/envs/demo/variables.tf`:

      node_desired_size = 1
      node_min_size = 1
      node_max_size = 3

  Option C: Request quota increase (takes time - hours to days)

  Go to AWS Service Quotas and request an increase for "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances". This quota is counted in vCPUs. A t3.medium has 2 vCPUs, so for 3 t3.medium nodes request at least 6 vCPUs.

  Option D: Switch to a different region (if you have quota approved elsewhere)

  If you have quota approved in another region (e.g., us-east-2), you can switch regions. This requires destroying existing infrastructure first. See Configuration > AWS Region for details.

      make tf-destroy
      make region-set-us-east-2
      make tf-apply

- Re-run Terraform:

      make tf-apply

See Configuration > EC2 Node Capacity Type for more details on SPOT vs On-Demand instances.
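The "wait for deletion" step can be automated by polling `aws eks list-nodegroups` until the cluster reports zero node groups. A minimal sketch, assuming the cluster name and `DEPLOY_*` variables used in the commands above; the function name and default timings are hypothetical:

```bash
# Hypothetical helper: poll until the cluster reports no node groups.
# Usage: wait_for_nodegroup_deletion [attempts] [seconds-between-attempts]
wait_for_nodegroup_deletion() {
  attempts=${1:-12}
  delay=${2:-15}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # length(nodegroups) is a JMESPath expression; text output prints a bare number.
    count=$(aws eks list-nodegroups \
      --cluster-name carepath-demo-cluster \
      --profile "$DEPLOY_AWS_PROFILE" \
      --region "$DEPLOY_AWS_REGION" \
      --query 'length(nodegroups)' --output text) || return 1
    if [ "$count" = "0" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "node groups still present after $attempts attempts" >&2
  return 1
}
```

Used as `wait_for_nodegroup_deletion && make tf-apply`, the apply only starts once the failed node group is gone. The AWS CLI also offers `aws eks wait nodegroup-deleted`, which waits on one named node group instead of the whole list.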
If Terraform initialization fails:
- Verify `backend.hcl` exists: `ls infra/terraform/envs/demo/backend.hcl`
- Check AWS credentials: `make aws-login`

If kubectl cannot connect to the cluster:
- Reconfigure kubectl: `make k8s-config`
- Verify IAM permissions

If pods fail to start:
- Check logs: `make k8s-logs s=chat-api`
- Check events: `kubectl describe pod <pod-name> -n carepath-demo`
- Check image exists: `aws ecr describe-images --repository-name carepath-chat-api`

If the services cannot connect to MongoDB:
- Verify your Atlas cluster allows connections from EKS node IPs (or has 0.0.0.0/0 for demo)
- Check connection string in secret: `kubectl get secret mongodb-secret -n carepath-demo -o yaml`
- Verify `mongodb_uri` in `terraform.tfvars` is correct

If the LoadBalancer URL is not reachable:
- Wait 2-3 minutes after deployment
- Check service: `kubectl get svc chat-api-service -n carepath-demo`
- Check AWS console for ELB status

Error message:
Error: Error acquiring the state lock
Lock Info:
ID: 6811ddc7-7e48-ed54-a727-...
Operation: OperationTypeApply
Cause: A previous Terraform operation was interrupted (Ctrl+C, terminal closed, session timeout) before it could release the state lock. Terraform uses DynamoDB to prevent concurrent modifications.
Solution:
- Verify no other Terraform process is running (check other terminals and processes)
- Force-unlock the state using the Lock ID from the error (load .env from the directory where you created it before changing into the Terraform directory):

      source .env
      cd infra/terraform/envs/demo
      AWS_PROFILE=$DEPLOY_AWS_PROFILE AWS_REGION=$DEPLOY_AWS_REGION \
        terraform force-unlock -force <LOCK_ID>

- Retry your command: `make tf-apply`

Prevention: Let Terraform operations complete fully. If you need to cancel, use Ctrl+C once and wait for graceful shutdown rather than force-killing.
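When the error output has been saved to a file, the Lock ID needed for `terraform force-unlock` can be pulled out with a short filter instead of being copied by hand. A minimal sketch: the function name `extract_lock_id` and the log file name in the usage note are hypothetical, and the filter only assumes that the ID follows an `ID:` label as in the error shown above.

```bash
# Hypothetical helper: print the first lock ID found in Terraform error output on stdin.
# Scans every field so it still works when lines carry a leading border character.
extract_lock_id() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "ID:") { print $(i + 1); exit } }'
}
```

For example, after `make tf-apply 2>&1 | tee apply.log` has failed with the lock error, `extract_lock_id < apply.log` prints the ID to pass to `terraform force-unlock`. Confirm that no other Terraform process is running before unlocking, as described above.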
To destroy all infrastructure:
make tf-destroy

Warning: This deletes all resources, including the MongoDB Atlas cluster if Terraform created it. Data will be lost unless backups are enabled.

- Infrastructure Operations - Day-to-day operations (logs, scaling, rollbacks) for deployed infrastructure
- Rollout Options - Deployment strategies (rolling, canary, blue-green)
- AI Service Upgrade - Deploying with real LLM
- Model Management - LLM configuration
- Infrastructure Guide - Terraform details