This project is a scalable, event-driven system that powers real-time product recommendations and clickstream analytics for an e-commerce platform. It combines a Go API, Next.js frontend, Kafka, Redis, and an automated offline ML training pipeline built with Ray and MLflow.
Branches:
- frontend: Contains the Next.js frontend
- api: Contains the Go API, plus the Kafka and Redis setup
- recommendation: Contains the Ray and MLflow setup for training the recommendation model

Tech stack:
- Frontend: Next.js
- Backend: Go (Gin)
- Event Store: Apache Kafka
- Cache: Redis
- ML pipeline: Ray, MLflow
- Containerization: Docker / Docker Compose

Frontend:
- Next.js app for product exploration, user browsing, and collecting clickstream events.

Backend API:
- Go (Gin) service providing:
  - `/search` endpoint for product search & recommendations.
  - `/track` endpoint to log user clickstream events.
  - `/analyze` endpoint to fetch real-time analytics.

Streaming & caching:
- Kafka handles event ingestion.
- Redis caches:
  - Precomputed recommendations.
  - Aggregated real-time analytics.

Offline recommendation pipeline:
- Ray tasks consume events from Kafka and train recommendation models.
- MLflow tracks experiments, hyperparameters, and model metrics.
- Updated recommendations are stored in Redis for low-latency serving.
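
To make the offline step more concrete, here is a minimal sketch of turning clickstream events into per-product recommendations. It is not the project's actual model: it swaps in a simple co-occurrence count, and it replaces Kafka, Ray, MLflow, and Redis with plain Python data structures. The function name `train_cooccurrence` and the event fields `user_id` and `product_id` are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def train_cooccurrence(events, top_n=3):
    """Return {product_id: [recommended product_ids]} from click events."""
    # Group the distinct products each user interacted with.
    products_by_user = defaultdict(set)
    for event in events:
        products_by_user[event["user_id"]].add(event["product_id"])

    # Count how often each ordered pair of products shares a user.
    co_counts = defaultdict(Counter)
    for products in products_by_user.values():
        for a, b in permutations(sorted(products), 2):
            co_counts[a][b] += 1

    # Keep the top_n co-occurring products; ties are broken by product id.
    recommendations = {}
    for product, counter in co_counts.items():
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        recommendations[product] = [pid for pid, _ in ranked[:top_n]]
    return recommendations


# Hypothetical events, standing in for messages consumed from Kafka.
events = [
    {"user_id": "u1", "product_id": "p1"},
    {"user_id": "u1", "product_id": "p2"},
    {"user_id": "u2", "product_id": "p1"},
    {"user_id": "u2", "product_id": "p2"},
    {"user_id": "u2", "product_id": "p3"},
]
recs = train_cooccurrence(events)
```

In the real pipeline, a mapping like `recs` is what would be written to Redis so the API can serve it without recomputation.
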
| Component | Tech | Purpose |
|---|---|---|
| Frontend | Next.js | User interface & event tracking |
| API | Go (Gin) | REST endpoints to serve data & track events |
| Messaging | Kafka | Stream clickstream data to ML pipeline |
| Cache | Redis | Low-latency store for recommendations & analytics |
| ML pipeline | Ray + MLflow | Train & track recommendation models offline |

API endpoints:

| Endpoint | Method | Description |
|---|---|---|
| `/search` | GET | Fetch product data & personalized recommendations |
| `/track` | POST | Track user clickstream events |
| `/analyze` | GET | Fetch real-time analytics data |

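
As a rough illustration of how a client might call the three endpoints, the sketch below builds (but does not send) the corresponding HTTP requests with Python's standard library. The base URL `http://localhost:8080`, the `q` query parameter, and the event fields `user_id`, `product_id`, and `event_type` are assumptions for illustration; the request schemas are not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed address of the Go API; adjust to match your deployment.
BASE_URL = "http://localhost:8080"


def build_search_request(query):
    """GET /search with a hypothetical `q` query parameter."""
    return Request(f"{BASE_URL}/search?{urlencode({'q': query})}", method="GET")


def build_track_request(event):
    """POST /track with the clickstream event as a JSON body."""
    body = json.dumps(event).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(f"{BASE_URL}/track", data=body, headers=headers, method="POST")


def build_analyze_request():
    """GET /analyze; no parameters are assumed here."""
    return Request(f"{BASE_URL}/analyze", method="GET")


# Hypothetical event payload; the real schema may differ.
track_request = build_track_request(
    {"user_id": "u1", "product_id": "p42", "event_type": "click"}
)
search_request = build_search_request("running shoes")
analyze_request = build_analyze_request()
```

Once the services are running, each request object can be passed to `urllib.request.urlopen` to actually send it.
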

Data flow:
- Users interact with the frontend → clickstream events sent to `/track`.
- Go API produces these events to Kafka.
- Offline Ray jobs consume events → train/update recommendation models.
- MLflow tracks metrics & experiments.
- Updated recommendations are saved to Redis.
- Frontend requests `/search` → Go API retrieves recommendations from Redis.
- Clickstream events are cached and aggregated in Redis.
- Frontend queries `/analyze` to get real-time stats (e.g., trending products, active users).
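
The analytics side of this flow can be sketched as a small aggregation over recent events. This is only an in-memory approximation under stated assumptions: a Python list stands in for the data cached in Redis, and the function name `analyze`, the `ts` timestamp field (in seconds), and the five-minute default window are all hypothetical.

```python
from collections import Counter


def analyze(events, now, window_seconds=300, top_n=3):
    """Compute trending products and active users over a recent time window."""
    # Keep only events that fall inside the window ending at `now`.
    recent = [e for e in events if 0 <= now - e["ts"] <= window_seconds]

    # Rank products by event count; ties are broken by product id.
    views = Counter(e["product_id"] for e in recent)
    ranked = sorted(views.items(), key=lambda kv: (-kv[1], kv[0]))

    return {
        "trending_products": [pid for pid, _ in ranked[:top_n]],
        "active_users": len({e["user_id"] for e in recent}),
    }


# Hypothetical events; the last one is older than the window and is ignored.
events = [
    {"user_id": "u1", "product_id": "p1", "ts": 900},
    {"user_id": "u2", "product_id": "p1", "ts": 950},
    {"user_id": "u2", "product_id": "p2", "ts": 980},
    {"user_id": "u3", "product_id": "p3", "ts": 100},
]
stats = analyze(events, now=1000)
```

A production version would more likely update counters incrementally as events arrive rather than rescanning a list on every request.
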

Key features:
- Event-driven, loosely coupled architecture.
- Low-latency product recommendations via caching.
- Scalable offline training with Ray.
- ML experiment tracking with MLflow.
- Real-time analytics for high-engagement products.

Getting started:

```bash
# Clone the repo
git clone https://github.com/raoashish10/Ecommerce-clickstream-analytics.git
cd Ecommerce-clickstream-analytics

# Start Go API
git checkout api
docker-compose up -d

# Start Recommendation pipeline
git checkout recommendation
docker-compose up -d

# Start Next.js frontend
git checkout frontend
npm install
npm run dev
```