Deployment Guide

This guide walks through deploying CarePath to AWS EKS with MongoDB Atlas.

Table of Contents

  1. Prerequisites
  2. Environment Setup
  3. Configuration
  4. Step-by-Step Deployment
  5. Finding Service URLs
  6. Frontend Deployment
  7. Verification
  8. Operations & Contingencies
  9. Troubleshooting

Prerequisites

  • AWS CLI configured with SSO profile
  • Terraform >= 1.0
  • Docker
  • kubectl
  • Node.js 18+ and npm (for frontend)
  • MongoDB Atlas account with organization ID

Environment Setup

1. Configure Environment Variables

Copy .env.example to .env and configure:

cp .env.example .env

Key variables for deployment:

# AWS Configuration
DEPLOY_AWS_REGION=us-east-1
DEPLOY_AWS_PROFILE=your-sso-profile
DEPLOY_AWS_ACCOUNT_ID=123456789012

# Terraform State (can reuse existing backend from other projects)
DEPLOY_TF_STATE_BUCKET_NAME=your-terraform-state-bucket
DEPLOY_TF_DYNAMO_DB_TABLE=your-terraform-state-locks

# MongoDB (your existing cluster connection string)
MONGODB_URI=mongodb+srv://user:password@your-cluster.mongodb.net/?retryWrites=true&w=majority
MONGODB_DB_NAME=carepath

2. Configure Terraform Backend

Create the backend configuration file:

cp infra/terraform/envs/demo/backend.hcl.example infra/terraform/envs/demo/backend.hcl

Edit backend.hcl with your values:

bucket         = "your-terraform-state-bucket"    # e.g., genonaut-terraform-state
key            = "carepath/demo/terraform.tfstate"
region         = "us-east-1"
dynamodb_table = "your-terraform-state-locks"     # e.g., genonaut-terraform-state-locks
encrypt        = true

Note: You can safely reuse an existing Terraform state backend from other projects. Each project uses a unique key path, so state files don't interfere with each other.

3. Configure Terraform Variables

Create the variables file:

cp infra/terraform/envs/demo/terraform.tfvars.example infra/terraform/envs/demo/terraform.tfvars

Edit terraform.tfvars with your MongoDB URI:

environment = "demo"
aws_region  = "us-east-1"

# Use your existing MongoDB cluster
mongodb_uri           = "mongodb+srv://user:password@your-cluster.mongodb.net/?retryWrites=true&w=majority"
mongodb_database_name = "carepath"

Note: If you want Terraform to create a new MongoDB Atlas cluster instead, set create_mongodb_atlas = true and provide the Atlas API keys. See terraform.tfvars.example for details.


Configuration

EC2 Node Capacity Type

EKS worker nodes can run as either On-Demand or Spot instances. This setting affects AWS quota usage and cost.

  • ON_DEMAND: standard EC2 instances. Use for production and stable workloads.
  • SPOT: discounted instances (can be interrupted). Use for development, demos, and cost savings.

Why SPOT exists: New AWS accounts have restrictive quotas on On-Demand EC2 "Fleet Requests." If you hit quota limits during deployment, switching to SPOT instances uses a different quota pool and often resolves the issue. SPOT instances are also 60-90% cheaper.
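
Before deploying, you can check where you stand against both quota pools with the AWS CLI (the quota codes below are the usual ones for Standard On-Demand and Spot vCPUs, but verify them in the Service Quotas console if the output looks wrong):

# Running On-Demand Standard instances (vCPU quota)
aws service-quotas get-service-quota --service-code ec2 --quota-code L-1216C47A \
  --profile $DEPLOY_AWS_PROFILE --region $DEPLOY_AWS_REGION

# All Standard Spot Instance Requests (vCPU quota)
aws service-quotas get-service-quota --service-code ec2 --quota-code L-34B43A08 \
  --profile $DEPLOY_AWS_PROFILE --region $DEPLOY_AWS_REGION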

Check current setting:

make ec2-config-status

Switch to SPOT instances:

make ec2-config-set-as-spot

Switch to On-Demand (standard) instances:

make ec2-config-set-as-nodes

Note: After changing this setting, you must delete any existing/failed node group and run make tf-apply for changes to take effect. See Troubleshooting for details.

DB API External Access

By default in the demo environment, the db-api is exposed externally via a LoadBalancer so you can query it directly (useful for demos and frontend development). In production, you'd typically keep it internal-only.

  • expose_db_api = true: LoadBalancer service, publicly accessible via ELB URL
  • expose_db_api = false: ClusterIP service, internal only (within cluster)

Check current setting:

kubectl get svc db-api-service -n carepath-demo -o jsonpath='{.spec.type}'

To change: Edit infra/terraform/envs/demo/variables.tf:

variable "expose_db_api" {
  ...
  default = true   # true = external, false = internal only
}

Then run make tf-apply.

Security Note: When expose_db_api = true, anyone can query your database API. This is fine for demos but for production you should:

  • Keep it internal (expose_db_api = false)
  • Or add authentication/API keys
  • Or use an API Gateway with auth

AWS Region

Infrastructure can be deployed to different AWS regions. This is useful if you hit quota limits in one region but have approved quotas in another.

  • us-east-1 (N. Virginia): default; most services available
  • us-east-2 (Ohio): alternative if us-east-1 quotas are exhausted

Check current region:

make region-status

Switch to us-east-2 (Ohio):

make region-set-us-east-2

Switch to us-east-1 (N. Virginia):

make region-set-us-east-1

Important: Changing regions is a destructive operation. You must:

  1. Destroy existing infrastructure first:

    make tf-destroy
  2. Run the region switch command (it will prompt for confirmation)

  3. Deploy to the new region:

    make tf-apply

Note: The Terraform state bucket (in backend.hcl) stays in us-east-1 regardless of where you deploy. S3 buckets are globally accessible, so you don't need to change it or run bootstrap commands again.


Step-by-Step Deployment

Step 1: AWS Login

make aws-login

This authenticates via AWS SSO and verifies your credentials.
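
Under the hood this is roughly equivalent to the following (a sketch; the exact steps live in the Makefile):

aws sso login --profile $DEPLOY_AWS_PROFILE
aws sts get-caller-identity --profile $DEPLOY_AWS_PROFILE   # confirms the credentials work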

Step 2: Initialize Terraform

make tf-init

This initializes Terraform with your backend configuration.
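
This is roughly equivalent to running Terraform directly with the backend file from Environment Setup (a sketch; the Makefile may pass extra flags):

cd infra/terraform/envs/demo
terraform init -backend-config=backend.hcl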

Step 3: Review Infrastructure Plan

make tf-plan

Review the planned changes carefully. This will show you what resources will be created:

  • VPC with public/private subnets
  • EKS cluster with managed node group
  • ECR repositories for Docker images
  • MongoDB Atlas cluster (only if create_mongodb_atlas = true; by default your existing cluster from terraform.tfvars is used)
  • Kubernetes namespace, deployments, and services

Step 4: Apply Infrastructure

make tf-apply

This creates all the infrastructure and may take 15-20 minutes (EKS cluster creation is slow).

Note: If you encounter errors during this step, see the Troubleshooting section for common issues and solutions.

Step 5: Configure kubectl

After Terraform completes, configure kubectl to connect to your EKS cluster:

make k8s-config
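
This wraps the standard EKS kubeconfig update, roughly (cluster name taken from the Troubleshooting section):

aws eks update-kubeconfig \
  --name carepath-demo-cluster \
  --profile $DEPLOY_AWS_PROFILE \
  --region $DEPLOY_AWS_REGION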

Step 6: Build and Push Docker Images

# Build both images
make docker-build-db-api
make docker-build-chat

# Push both to ECR
make docker-push-db-api
make docker-push-chat
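
Roughly, these targets wrap an ECR login plus a docker build/tag/push per image. A sketch for the chat image (the build context path is illustrative; the repository name matches the one referenced in Troubleshooting):

aws ecr get-login-password --region $DEPLOY_AWS_REGION --profile $DEPLOY_AWS_PROFILE \
  | docker login --username AWS --password-stdin \
    $DEPLOY_AWS_ACCOUNT_ID.dkr.ecr.$DEPLOY_AWS_REGION.amazonaws.com

docker build -t carepath-chat-api .   # build context is illustrative; check the Makefile
docker tag carepath-chat-api:latest \
  $DEPLOY_AWS_ACCOUNT_ID.dkr.ecr.$DEPLOY_AWS_REGION.amazonaws.com/carepath-chat-api:latest
docker push $DEPLOY_AWS_ACCOUNT_ID.dkr.ecr.$DEPLOY_AWS_REGION.amazonaws.com/carepath-chat-api:latest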

Step 7: Deploy Services to Kubernetes

# Deploy both services
make deploy-all

Or deploy individually:

make deploy-db-api
make deploy-chat

Finding Service URLs

After deployment, get the service URLs:

make k8s-get-urls

This shows:

  • Chat API URL: External LoadBalancer URL (publicly accessible)
  • DB API URL: External LoadBalancer URL if expose_db_api = true (default for demo), otherwise internal ClusterIP

Note: LoadBalancer URLs may take 2-3 minutes to provision after deployment.
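
To poll for the external hostname directly with kubectl (service name as referenced in Troubleshooting):

kubectl get svc chat-api-service -n carepath-demo \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'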

You can also check all resources:

make k8s-status

Frontend Deployment

The CarePath frontend is a React app hosted on AWS S3 with CloudFront CDN. This provides a simple web UI for chatting with the CarePath AI and viewing chat history.

Prerequisites

  • Node.js 18+ and npm
  • Backend APIs already deployed (db-api and chat-api)
  • Terraform frontend infrastructure created (make tf-apply)

Configuration

NEW: As of 2024-11-24, the frontend build now automatically fetches API URLs from Terraform outputs. You no longer need to manually configure .env for deployments!

The .env file in frontend_chat/ is now only used for local development. For production deployments, make frontend-deploy automatically:

  1. Fetches current load balancer URLs from Terraform
  2. Builds the frontend with those URLs
  3. Deploys to S3

For local development, create a .env file:

cd frontend_chat
cp .env.example .env

Edit .env with your API URLs:

VITE_DB_API_URL=http://your-db-api-loadbalancer.elb.amazonaws.com
VITE_CHAT_API_URL=http://your-chat-api-loadbalancer.elb.amazonaws.com

Deploy Frontend

# Install dependencies (first time only)
make frontend-install

# Build and deploy to S3/CloudFront
# This automatically uses current Terraform outputs for API URLs
make frontend-deploy

This will:

  1. Fetch current API load balancer URLs from Terraform outputs
  2. Build the React app for production with those URLs
  3. Sync the build output to S3
  4. Invalidate the CloudFront cache (if CloudFront enabled)
  5. Print the frontend URL

Note: When you run make deploy-chat, the frontend is automatically redeployed with updated API URLs. This ensures the frontend always points to the correct backend services.
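
If you ever need to invalidate the CloudFront cache by hand rather than via make frontend-invalidate-cache, the underlying call looks roughly like this (the distribution ID placeholder is illustrative; the real one comes from Terraform outputs or the AWS console):

aws cloudfront create-invalidation \
  --distribution-id YOUR_DISTRIBUTION_ID \
  --paths "/*" \
  --profile $DEPLOY_AWS_PROFILE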

Local Development

To run the frontend locally:

make frontend-dev

The dev server starts at http://localhost:5173 with hot reload.

Frontend Makefile Commands

  • make frontend-install: Install npm dependencies
  • make frontend-dev: Run local dev server
  • make frontend-build: Build for production
  • make frontend-deploy: Build and deploy to S3/CloudFront
  • make frontend-invalidate-cache: Invalidate CloudFront cache

Frontend Configuration Variables

The following Terraform variables control frontend deployment:

  • expose_frontend (default: true): Whether to create S3/CloudFront resources
  • frontend_bucket_name (default: carepath-demo-frontend): S3 bucket name for static files

Verification

1. Check Deployment Status

make k8s-status

All pods should show Running status with 1/1 ready.

2. Check Pod Logs

# View db-api logs
make k8s-logs s=db-api

# View chat-api logs
make k8s-logs s=chat-api

3. Test Health Endpoints

Get the Chat API URL first:

make k8s-get-urls

Then test:

# Replace with your actual LoadBalancer URL
CHAT_URL="http://your-loadbalancer-url.elb.amazonaws.com"

# Test chat-api health
curl $CHAT_URL/health

# Test db-api health
# If expose_db_api = false, db-api has no external URL; the chat-api calls it
# internally, so a successful /triage request in the next step confirms both are healthy.
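
# If expose_db_api = true (the demo default), you can also query db-api directly.
# Assumption: db-api serves a /health endpoint like chat-api; adjust the path if it differs.
DB_URL="http://your-db-api-loadbalancer-url.elb.amazonaws.com"
curl $DB_URL/health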

4. Test the Triage Endpoint

curl -X POST $CHAT_URL/triage \
  -H "Content-Type: application/json" \
  -d '{"patient_mrn": "P000123", "query": "What are my current medications?"}'

Expected response includes response, trace_id, conversation_id, etc.
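
If you have jq installed, a quick way to confirm those fields come back (same request as above, just reshaped):

curl -s -X POST $CHAT_URL/triage \
  -H "Content-Type: application/json" \
  -d '{"patient_mrn": "P000123", "query": "What are my current medications?"}' \
  | jq '{response, trace_id, conversation_id}'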


Operations & Contingencies

(i) Scaling Up/Down

The deployments have HPA (Horizontal Pod Autoscaler) configured, but you can manually scale:

Scale Up:

# Scale db-api to 3 replicas
make k8s-scale-up s=db-api r=3

# Scale chat-api to 3 replicas
make k8s-scale-up s=chat-api r=3

Scale Down:

# Scale db-api to 1 replica
make k8s-scale-down s=db-api r=1

# Scale chat-api to 1 replica
make k8s-scale-down s=chat-api r=1

Check current scale:

make k8s-status
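
If you prefer raw kubectl, the same information is available with (namespace as used elsewhere in this guide):

kubectl get deployments,hpa -n carepath-demo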

(ii) Rolling Out Updates

When you make code changes and want to deploy:

Option A: Deploy a single service

# This builds, pushes, and deploys in one command
make deploy-db-api   # or
make deploy-chat

Option B: Deploy all services

make deploy-all

Option C: Just restart pods (same image)

make k8s-restart-db-api
make k8s-restart-chat

View rollout history:

make k8s-history

For more deployment strategies (canary, blue-green, etc.), see Rollout Options.

(iii) Rollbacks

If something goes wrong after a deployment, rollback to the previous version:

Rollback a single service:

# Rollback db-api
make k8s-rollback-db-api

# Rollback chat-api
make k8s-rollback-chat

Rollback all services:

make k8s-rollback-all

View rollout history (to see available revisions):

make k8s-history

Rollback to a specific revision (manual):

kubectl rollout undo deployment/chat-api -n carepath-demo --to-revision=2
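
After any rollback, you can wait for the deployment to settle before testing:

kubectl rollout status deployment/chat-api -n carepath-demo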

(iv) Emergency Procedures

If pods are crashing:

  1. Check logs: make k8s-logs s=chat-api
  2. Check events: kubectl describe pod <pod-name> -n carepath-demo
  3. Rollback: make k8s-rollback-chat

If LoadBalancer is unreachable:

  1. Check service: kubectl get svc -n carepath-demo
  2. Check pod readiness: make k8s-pods
  3. Restart pods: make k8s-restart-chat

If you need to completely redeploy:

# Delete and recreate the deployment
kubectl delete deployment chat-api -n carepath-demo
make deploy-chat

Makefile Command Reference

  • make k8s-config: Configure kubectl for EKS
  • make k8s-status: Show all deployments, pods, services, HPA
  • make k8s-get-urls: Get service URLs
  • make k8s-pods: List all pods with details
  • make k8s-logs s=SERVICE: Stream logs (s=db-api or chat-api)
  • make k8s-scale-up s=SERVICE r=N: Scale up to N replicas
  • make k8s-scale-down s=SERVICE r=N: Scale down to N replicas
  • make k8s-rollback-db-api: Rollback db-api
  • make k8s-rollback-chat: Rollback chat-api
  • make k8s-rollback-all: Rollback both services
  • make k8s-restart-db-api: Rolling restart db-api
  • make k8s-restart-chat: Rolling restart chat-api
  • make k8s-history: View rollout history
  • make deploy-db-api: Build, push, deploy db-api
  • make deploy-chat: Build, push, deploy chat-api
  • make deploy-all: Deploy all services
  • make ec2-config-status: Show current EC2 capacity type
  • make ec2-config-set-as-nodes: Use On-Demand EC2 instances
  • make ec2-config-set-as-spot: Use SPOT instances (different quota)
  • make region-status: Show current AWS region
  • make region-set-us-east-1: Switch to us-east-1 (N. Virginia)
  • make region-set-us-east-2: Switch to us-east-2 (Ohio)
  • make frontend-install: Install frontend dependencies
  • make frontend-dev: Run frontend dev server
  • make frontend-build: Build frontend for production
  • make frontend-deploy: Deploy frontend to S3/CloudFront
  • make frontend-invalidate-cache: Invalidate CloudFront cache

Troubleshooting

EKS Node Group creation fails with quota error

Error message:

Error: waiting for EKS Node Group create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'.
last error: AsgInstanceLaunchFailures: You've reached your quota for maximum Fleet Requests for this account.

Cause: Your AWS account has a quota limit on EC2 instances. New accounts often have low default limits.

Solution:

  1. Delete the failed node group (it's stuck in a failed state and blocks new attempts):

    aws eks delete-nodegroup \
      --cluster-name carepath-demo-cluster \
      --nodegroup-name carepath-demo-cluster-node-group \
      --profile $DEPLOY_AWS_PROFILE \
      --region $DEPLOY_AWS_REGION
  2. Wait for deletion (~2 minutes). Check status:

    aws eks list-nodegroups \
      --cluster-name carepath-demo-cluster \
      --profile $DEPLOY_AWS_PROFILE \
      --region $DEPLOY_AWS_REGION

    When it returns {"nodegroups": []}, proceed.

  3. Choose one of these options to resolve:

    Option A: Switch to SPOT instances (recommended - quick fix)

    SPOT instances use a different AWS quota pool:

    make ec2-config-set-as-spot

    Option B: Reduce node count (if you want to stay on On-Demand)

    Edit infra/terraform/envs/demo/terraform.tfvars (or lower the corresponding defaults in variables.tf):

    node_desired_size = 1
    node_min_size     = 1
    node_max_size     = 3

    Option C: Request quota increase (takes time - hours to days)

    Go to AWS Service Quotas and request an increase for "Running On-Demand Standard instances". For 3 t3.medium nodes, request at least 6 vCPUs.

    Option D: Switch to a different region (if you have quota approved elsewhere)

    If you have quota approved in another region (e.g., us-east-2), you can switch regions. This requires destroying existing infrastructure first. See Configuration > AWS Region for details.

    make tf-destroy
    make region-set-us-east-2
    make tf-apply
  4. Re-run Terraform:

    make tf-apply

See Configuration > EC2 Node Capacity Type for more details on SPOT vs On-Demand instances.

Terraform init fails

  • Verify backend.hcl exists: ls infra/terraform/envs/demo/backend.hcl
  • Check AWS credentials: make aws-login

EKS authentication issues

  • Reconfigure kubectl: make k8s-config
  • Verify IAM permissions

Pod not starting

  • Check logs: make k8s-logs s=chat-api
  • Check events: kubectl describe pod <pod-name> -n carepath-demo
  • Check image exists: aws ecr describe-images --repository-name carepath-chat-api

MongoDB connection issues

  • Verify your Atlas cluster allows connections from EKS node IPs (or has 0.0.0.0/0 for demo)
  • Check connection string in secret: kubectl get secret mongodb-secret -n carepath-demo -o yaml (see the decode sketch after this list)
  • Verify mongodb_uri in terraform.tfvars is correct
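
The secret values are base64-encoded. A sketch for decoding the stored URI directly (the data key name MONGODB_URI is an assumption; check the actual keys in the kubectl get secret output first):

kubectl get secret mongodb-secret -n carepath-demo \
  -o jsonpath='{.data.MONGODB_URI}' | base64 -d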

LoadBalancer URL not available

  • Wait 2-3 minutes after deployment
  • Check service: kubectl get svc chat-api-service -n carepath-demo
  • Check AWS console for ELB status

Terraform state lock error

Error message:

Error: Error acquiring the state lock
Lock Info:
  ID:        6811ddc7-7e48-ed54-a727-...
  Operation: OperationTypeApply

Cause: A previous Terraform operation was interrupted (Ctrl+C, terminal closed, session timeout) before it could release the state lock. Terraform uses DynamoDB to prevent concurrent modifications.

Solution:

  1. Verify no other Terraform is running - check for other terminals/processes
  2. Force-unlock the state using the Lock ID from the error:
    cd infra/terraform/envs/demo
    source .env
    AWS_PROFILE=$DEPLOY_AWS_PROFILE AWS_REGION=$DEPLOY_AWS_REGION \
      terraform force-unlock -force <LOCK_ID>
  3. Retry your command:
    make tf-apply

Prevention: Let Terraform operations complete fully. If you need to cancel, use Ctrl+C once and wait for graceful shutdown rather than force-killing.


Teardown

To destroy all infrastructure:

make tf-destroy

Warning: This deletes all resources including MongoDB cluster. Data will be lost unless backups are enabled.


Related Documentation