This project implements a distributed event-driven platform for ingesting, processing, storing, and analyzing high-throughput marketing events.
The system is built as a microservices architecture with three core services communicating asynchronously via NATS JetStream. Events are received via HTTP webhooks, processed through a message broker, persisted in PostgreSQL, and exposed through dedicated analytical APIs.
The solution demonstrates production-grade microservices design with proper service boundaries, independent deployability, and comprehensive observability.
The platform consists of three independent microservices:
1. Ingestion Service (Port 3001)
- Handles incoming HTTP webhook requests
- Validates and publishes events to NATS
- Stateless and horizontally scalable
- Exposes Prometheus metrics and health checks
2. Worker Service (Background)
- Consumes events from NATS JetStream
- Performs batch processing and persistence to PostgreSQL
- Runs multiple replicas for high throughput
- Handles deduplication and error recovery
3. Analytics Service (Port 3002)
- Provides read-only analytical APIs
- Implements complex aggregations and reporting
- Independent database connection pool
- Swagger UI documentation at /api
Publisher → Ingestion Service → NATS JetStream → Worker Service → PostgreSQL → Analytics Service
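A sample event can be pushed into the pipeline with a plain HTTP request. The endpoint path comes from the port table below; the payload field names follow the events table schema described later, and the field values here are purely illustrative:

```shell
# Publish one sample event to the ingestion service (payload values illustrative).
curl -X POST http://localhost:3001/webhook/event \
  -H "Content-Type: application/json" \
  -d '{
    "eventId": "evt-123",
    "source": "facebook",
    "funnelStage": "top",
    "eventType": "ad.view",
    "timestamp": "2024-01-01T00:00:00Z",
    "data": {"campaignId": "camp-1", "country": "US"}
  }'
```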
Message broker (NATS JetStream)
- Asynchronous inter-service communication
- At-least-once delivery semantics
- Handles bursts, retries, and backpressure
- Durable consumer with shared message distribution
- Decouples services for independent scaling
Shared Database
- PostgreSQL as the single source of truth
- Workers write, Analytics service reads
- Eventual consistency via message queue
Ingestion Service:
- HTTP request validation
- Event publishing to NATS
- HTTP metrics collection
- No business logic or persistence
Worker Service:
- Pull-based JetStream consumer with configurable batch size
- High-throughput batch processing with parallel execution
- Database batch inserts for optimal performance
- Explicit acknowledgements
- Idempotent persistence with duplicate detection
- Configurable concurrency limit (default: 300 parallel operations)
- Handles 2000-5000 msg/sec per instance
- Dead Letter Queue (DLQ) for poison messages
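The batch-plus-concurrency-limit pattern above can be sketched independently of NATS. This is a minimal illustration of processing one fetched batch under a fixed parallelism cap (mirroring the NATS_CONCURRENCY setting), not the actual worker code; all names are assumed:

```typescript
// Sketch: process a fetched batch with at most `limit` handlers in flight.
// In the real worker, `handler` would persist the event and ack the message.
async function processBatch<T>(
  items: T[],
  limit: number,
  handler: (item: T) => Promise<void>,
): Promise<number> {
  let index = 0;
  let processed = 0;
  // Start `limit` loops that pull from a shared cursor until the batch is drained.
  const workers = Array.from({ length: Math.min(limit, items.length) }, async () => {
    while (index < items.length) {
      const item = items[index++];
      await handler(item);
      processed++;
    }
  });
  await Promise.all(workers);
  return processed;
}
```

This keeps memory bounded regardless of batch size: no more than `limit` database operations run concurrently, while slow items never stall the other workers.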
Analytics Service:
- Read-side analytics using SQL over JSONB
- Aggregations, funnel analysis, geo breakdowns
- Separate connection pool from workers
- RESTful API with OpenAPI documentation
Observability:
- Prometheus metrics per service
- Service-specific health checks
- Graceful shutdown with consumer drain
- Centralized monitoring via Grafana
The solution is implemented as a microservices architecture to provide:
Independent Scalability
- Scale ingestion, processing, and analytics independently
- Add replicas based on specific bottlenecks
- Workers can scale out without affecting API services
Service Isolation
- Failure in one service doesn't crash others
- Independent deployments and rollbacks
- Technology flexibility per service
Clear Boundaries
- Each service has a single responsibility
- API contracts via HTTP and NATS messages
- Database schema changes isolated to worker service
Operational Benefits
- Deploy and update services independently
- Monitor and debug per-service metrics
- Team ownership of individual services
Events are published to NATS JetStream
Consumer uses:
- Explicit ACKs
- Limited in-flight messages (max_ack_pending)
- Retry on failure (nak)
Delivery semantics: at-least-once
Duplicate events are expected and handled via:
- Unique index on eventId
- Graceful duplicate handling in consumer
Out-of-order events are tolerated by design.
All events are stored in a single events table:
- source
- funnelStage
- eventType
- timestamp
- data (JSONB)
This keeps ingestion simple and flexible.
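A minimal sketch of this table, assuming TEXT columns and camelCase naming that matches the field list above (the actual DDL may differ):

```sql
-- Illustrative DDL for the single events table (names and types assumed).
CREATE TABLE IF NOT EXISTS events (
  id            BIGSERIAL PRIMARY KEY,
  "eventId"     TEXT NOT NULL,
  source        TEXT NOT NULL,
  "funnelStage" TEXT NOT NULL,
  "eventType"   TEXT NOT NULL,
  "timestamp"   TIMESTAMPTZ NOT NULL,
  data          JSONB NOT NULL
);

-- Unique index on eventId backs the deduplication guarantees described below.
CREATE UNIQUE INDEX IF NOT EXISTS idx_events_eventId ON events ("eventId");
```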
Performance optimizations:
- Batch inserts (up to 500 events per transaction)
- ON CONFLICT DO NOTHING for duplicate handling
- Automatic fallback to individual inserts on batch failure
- JSONB indexing for analytical queries
Analytical complexity is intentionally moved to the read side.
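The batch insert plus duplicate handling described above could look like this in SQL (column names assumed; parameter placeholders shown for a two-row batch, extended up to 500 rows per transaction):

```sql
-- Illustrative multi-row insert; duplicates on eventId are silently skipped.
INSERT INTO events ("eventId", source, "funnelStage", "eventType", "timestamp", data)
VALUES
  ($1, $2, $3, $4, $5, $6),
  ($7, $8, $9, $10, $11, $12)
ON CONFLICT ("eventId") DO NOTHING;
```

Because ON CONFLICT DO NOTHING never raises on duplicates, redelivered messages from the at-least-once broker can be acknowledged without a round trip to check for prior inserts.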
The system uses several indexes for optimal query performance:
Standard B-Tree Indexes:
- idx_events_source_eventType - Composite index for filtering by source and event type
- idx_events_funnelStage - Funnel analysis queries
- idx_events_timestamp - Time-series queries
- idx_events_eventId (UNIQUE) - Deduplication
JSONB Indexes:
- idx_events_data_gin - GIN index for general JSONB queries
- idx_events_campaign_id - Campaign analytics (functional index)
- idx_events_user_id - User analytics (functional index)
- idx_events_country - Geographic analytics (functional index with COALESCE)
- idx_events_purchase_amount - Revenue analytics
These indexes significantly improve analytics query performance on JSONB fields.
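The functional JSONB indexes could be defined along these lines; the exact extracted expressions and JSONB key names are assumptions:

```sql
-- Illustrative index definitions over the JSONB payload (key names assumed).
CREATE INDEX idx_events_data_gin ON events USING GIN (data);
CREATE INDEX idx_events_campaign_id ON events ((data ->> 'campaignId'));
CREATE INDEX idx_events_user_id ON events ((data ->> 'userId'));
CREATE INDEX idx_events_country ON events ((COALESCE(data ->> 'country', 'unknown')));
CREATE INDEX idx_events_purchase_amount ON events (((data ->> 'purchaseAmount')::numeric));
```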
The system exposes analytical APIs including:
- Overall event statistics
- Time series of events
- Funnel analysis (top → bottom conversion)
- Country-level breakdowns
- Top campaigns and users
- Revenue aggregation
Analytics queries are implemented using raw SQL over JSONB for clarity and performance.
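For example, a country-level breakdown over JSONB might be written as follows (the JSONB key name is an assumption):

```sql
-- Illustrative read-side aggregation over the JSONB payload.
SELECT COALESCE(data ->> 'country', 'unknown') AS country,
       COUNT(*) AS events
FROM events
GROUP BY 1
ORDER BY events DESC
LIMIT 10;
```

The functional index on the country expression lets the planner avoid a full JSONB scan for queries of this shape.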
Each service provides its own API documentation:
Ingestion Service - Access Swagger UI:
http://localhost:3001/api
Documents webhook endpoints for event ingestion.
Analytics Service - Access Swagger UI:
http://localhost:3002/api
Documents all analytical and reporting endpoints:
- Overall event statistics
- Time series of events
- Funnel analysis (top → bottom conversion)
- Country-level breakdowns
- Top campaigns and users
- Revenue aggregation
What's documented:
- HTTP API contracts - Request/response schemas for all endpoints
- Health checks - Liveness and readiness probes per service
- Error responses - Standard error formats
What's NOT documented:
Swagger intentionally documents only the HTTP API contracts. Internal components are excluded:
- NATS message schemas and consumers
- Internal service-to-service communication
- Database schemas and migrations
- Prometheus metrics endpoints
The microservices architecture provides independent horizontal scalability:
Ingestion Service:
- Stateless HTTP service
- Scale out to handle increased webhook traffic
- Load balancer distributes requests across instances
Worker Service:
- Multiple replicas share JetStream durable consumer
- JetStream automatically distributes messages across replicas
- Scale based on queue depth and processing lag
- Each replica processes messages in parallel (up to 300 concurrent)
Analytics Service:
- Read-only service, easily cacheable
- Scale out for high query throughput
- Independent from write path
Scaling Strategy:
# Scale worker replicas
docker-compose up --scale worker-service=5
# Each service scales independently
# No code changes required
Performance tuning:
The worker service can be tuned via environment variables:
- NATS_BATCH_SIZE (default: 5000) - Messages fetched per batch
- NATS_CONCURRENCY (default: 300) - Parallel processing limit
- max_ack_pending (20,000) - Maximum unacknowledged messages
Throughput benchmarks:
- Single worker instance: ~2,000-5,000 msg/sec
- With batch inserts: 20-50x faster than individual inserts
- Handles 500K+ pending messages with proper configuration
- Linear scaling with additional worker replicas
Each microservice exposes independent metrics and health endpoints:
Per-Service Metrics:
Ingestion Service:
- HTTP request rates and latencies
- NATS publish success/failure rates
- Validation errors
Worker Service:
- Received, processed, failed, duplicate events
- Consumer lag and in-flight messages
- Batch processing times
- Database connection pool usage
- DLQ message counts
Analytics Service:
- Query execution times
- Database connection pool usage
- API endpoint performance
Centralized Monitoring:
- Prometheus scrapes metrics from all services
- Grafana dashboards for service-level and system-level views
- Pre-configured dashboards for the entire platform
Health Checks:
Each service implements:
- /health/liveness - Service is running
- /health/readiness - Service is ready to accept traffic
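The split between the two probes can be sketched framework-agnostically: liveness only confirms the process is up, while readiness checks dependencies. This is an illustrative sketch, not the services' actual implementation; all names are assumed:

```typescript
// Sketch: liveness vs. readiness probes (dependency names illustrative).
type ProbeResult = { status: "ok" | "error"; checks?: Record<string, boolean> };

// Liveness: the process is running and able to respond at all.
function liveness(): ProbeResult {
  return { status: "ok" };
}

// Readiness: every dependency check (e.g. DB, NATS) must pass.
async function readiness(
  checks: Record<string, () => Promise<boolean>>,
): Promise<ProbeResult> {
  const entries = await Promise.all(
    Object.entries(checks).map(
      async ([name, check]) => [name, await check().catch(() => false)] as const,
    ),
  );
  return {
    status: entries.every(([, ok]) => ok) ? "ok" : "error",
    checks: Object.fromEntries(entries),
  };
}
```

An orchestrator restarts a container on failing liveness but only withholds traffic on failing readiness, which is why the two must stay separate endpoints.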
Graceful Shutdown:
All services implement graceful shutdown:
- Ingestion service drains in-flight HTTP requests
- Worker service acknowledges pending messages
- Analytics service completes active queries
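One way to implement the "wait for in-flight work" step common to all three shutdown paths is to track pending promises and await them on SIGTERM. This is a hypothetical sketch under that assumption, not the actual shutdown code:

```typescript
// Sketch: track in-flight work so shutdown can drain it before exit.
class InFlightTracker {
  private pending = new Set<Promise<unknown>>();

  get inFlight(): number {
    return this.pending.size;
  }

  // Register a unit of work (an HTTP request, a message, a query).
  track<T>(work: Promise<T>): Promise<T> {
    this.pending.add(work);
    // Remove once settled; swallow rejections on the tracking branch only.
    work.finally(() => this.pending.delete(work)).catch(() => {});
    return work;
  }

  // Called on SIGTERM after new work is refused: wait for the remainder.
  async drain(): Promise<void> {
    await Promise.allSettled([...this.pending]);
  }
}
```

The worker would additionally stop fetching new batches before draining, so every tracked message is either acknowledged or redelivered by JetStream after the ack deadline.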
The complete microservices platform can be started with a single command:
docker-compose up
This orchestrates and starts all services:
Core Microservices:
- Ingestion Service (http://localhost:3001) - Webhook endpoint
- Worker Service (multiple replicas) - Event processing
- Analytics Service (http://localhost:3002) - Analytics API
- Swagger UI (http://localhost:3002/api) - API documentation
Infrastructure:
- NATS JetStream - Message broker
- PostgreSQL - Data persistence
- Event publisher - Load testing tool
Monitoring Stack:
- Prometheus (http://localhost:9090) - Metrics collection
- Grafana (http://localhost:3003) - Dashboards and visualization
| Service | Port | Purpose |
|---|---|---|
| Ingestion API | 3001 | POST /webhook/event |
| Analytics API | 3002 | GET /analytics/* |
| Swagger UI | 3002 | /api |
| Prometheus | 9090 | Metrics |
| Grafana | 3003 | Dashboards |
Environment variables:
Each service can be configured independently:
Worker Service:
- NATS_BATCH_SIZE - Batch size for consumer (default: 5000)
- NATS_CONCURRENCY - Parallel processing limit (default: 300)
- DB_POOL_MAX - Maximum database connections (default: 75)
- DB_POOL_MIN - Minimum database connections (default: 10)
Analytics Service:
- DB_POOL_MAX - Maximum database connections (default: 50)
- DB_POOL_MIN - Minimum database connections (default: 10)
All Services:
- DATABASE_HOST - PostgreSQL host
- NATS_URL - NATS connection URL
- PORT - HTTP port (service-specific defaults)
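Assuming a standard Compose layout, the worker defaults above could be overridden like this (the service name worker-service is taken from the scaling example; the values are the documented defaults):

```yaml
# Illustrative docker-compose override for worker tuning.
services:
  worker-service:
    environment:
      NATS_BATCH_SIZE: "5000"
      NATS_CONCURRENCY: "300"
      DB_POOL_MAX: "75"
      DB_POOL_MIN: "10"
```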
Key metrics to monitor:
- Active connections
- Cache hit ratio (should be > 95%)
- Query execution times
- Table bloat and vacuum status
Access monitoring tools:
- Prometheus metrics: http://localhost:9090
- Grafana dashboards: http://localhost:3003
The provided publisher may occasionally fail with network errors under burst load. This is expected and does not indicate issues in the backend.
In a production setup, retries and backoff would be expected on the producer side.
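Such producer-side retries are commonly implemented as exponential backoff with jitter. This is a hypothetical sketch of that pattern, not part of the provided publisher; the function and parameter names are illustrative:

```typescript
// Sketch: retry a publish with exponential backoff and full jitter.
async function publishWithRetry<T>(
  attempt: () => Promise<T>,
  maxRetries = 5,
  baseDelayMs = 100,
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await attempt();
    } catch (err) {
      if (i >= maxRetries) throw err; // give up after maxRetries failures
      // Full jitter: wait a random time in [0, base * 2^i) before retrying.
      const delay = Math.random() * baseDelayMs * Math.pow(2, i);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Jitter matters under burst load: without it, many producers that failed together retry together and reproduce the same spike.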
This project demonstrates a production-grade microservices platform focused on:
Architectural Principles:
- Clear service boundaries and single responsibilities
- Asynchronous communication via message broker
- Independent scalability and deployability
- Fault isolation between services
Technical Excellence:
- Reliability through message acknowledgements and retries
- High throughput via batch processing
- Idempotent operations and duplicate detection
- Comprehensive observability per service
Operational Benefits:
- Docker Compose orchestration for local development
- Service-level monitoring and health checks
- Horizontal scaling without code changes
- Production-oriented design patterns
The goal is to demonstrate sound microservices engineering practices with practical trade-offs between complexity and maintainability.