Skip to content

Flcn/SantaRecommendationEngine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

72 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MySanta Recommendation Engine V2

Clean, scalable recommendation service with dual database architecture.

Architecture

Dual Database Design:

  • Main Database (READ-ONLY): Original MySanta data
  • Recommendations Database (READ/WRITE): Pre-computed recommendation data
  • Redis Database 1: Recommendation caching layer

Performance Focused:

  • Sub-50ms response times with cache hits
  • Handles 1000+ RPS easily
  • Scales to 100k+ presents and 5M+ interactions

Two Core APIs

1. Popular Items API

POST /popular

Request:

{
  "user_params": {
    "gender": "f",
    "age": "25-34", 
    "category": "electronics",
    "geo_id": 213
  },
  "filters": {
    "price_from": 500,
    "price_to": 2000
  },
  "pagination": {
    "page": 1,
    "limit": 20
  }
}

Use Case: Get popular items matching user demographics

2. Personalized Recommendations API

POST /personalized

Request:

{
  "user_id": 123,
  "geo_id": 213,
  "filters": {
    "price_from": 500,
    "price_to": 2000,
    "category": "electronics"
  },
  "pagination": {
    "page": 1,
    "limit": 20
  }
}

Use Case: Get personalized items based on user's likes (excludes already liked)

Smart Algorithm Selection

New Users (0 interactions):

  • Returns popular items by demographics

Users with Some Data (1-2 interactions):

  • Content-based filtering using user preferences

Experienced Users (3+ interactions):

  • Collaborative filtering with similar users

Response Format

{
  "items": [123, 456, 789],
  "pagination": {
    "page": 1,
    "limit": 20,
    "total_pages": 15,
    "has_next": true,
    "has_previous": false
  },
  "computation_time_ms": 23.4,
  "algorithm_used": "collaborative",
  "cache_hit": true
}

Background Jobs

Automated Data Refresh:

  • Popular items: Every 15 minutes
  • User profiles: Every 30 minutes
  • User similarities: Every hour
  • Cache cleanup: Every 6 hours

Database Schema

Recommendations Database Tables

-- Pre-computed popular items by demographics
CREATE TABLE popular_items (
    geo_id INTEGER,
    gender VARCHAR(10),
    age_group VARCHAR(20), 
    category VARCHAR(50),
    item_id INTEGER,
    popularity_score DECIMAL,
    updated_at TIMESTAMP
);

-- User similarity matrix for collaborative filtering
CREATE TABLE user_similarities (
    user_id INTEGER,
    similar_user_id INTEGER,
    similarity_score DECIMAL,
    PRIMARY KEY(user_id, similar_user_id)
);

-- User preference profiles for content-based filtering
CREATE TABLE user_profiles (
    user_id INTEGER PRIMARY KEY,
    preferred_categories JSONB,
    preferred_platforms JSONB,
    avg_price DECIMAL,
    interaction_count INTEGER,
    updated_at TIMESTAMP
);

Configuration

Environment variables:

# Database connections
MAIN_DATABASE_URL=postgresql://user:pass@host:5432/mysanta_main
RECOMMENDATIONS_DATABASE_URL=postgresql://user:pass@host:5432/mysanta_recommendations  
RECOMMENDATIONS_REDIS_URL=redis://host:6379/1

# Performance settings
MAX_SIMILAR_USERS=20
DEFAULT_PAGE_SIZE=20
MAX_PAGE_SIZE=100
POPULAR_ITEMS_REFRESH_MINUTES=15

# Cache TTL settings
CACHE_TTL_POPULAR=900           # 15 minutes
CACHE_TTL_PERSONALIZED=7200     # 2 hours
CACHE_TTL_USER_PROFILE=14400    # 4 hours

Quick Start

1. Setup Databases

# Create recommendations database
createdb mysanta_recommendations

# Run schema
psql mysanta_recommendations < schema.sql

2. Start Service

# With Docker
docker-compose up --build

# Or locally
pip install -r requirements.txt
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

3. Initialize Data

# Trigger initial data refresh
curl -X POST http://localhost:8001/admin/refresh-popular-items
curl -X POST http://localhost:8001/admin/update-user-profiles

Integration with Rails

class RecommendationClient
  BASE_URL = ENV.fetch('RECOMMENDATION_SERVICE_URL', 'http://localhost:8001')
  
  def self.get_popular_items(user_params:, **options)
    response = HTTP.post("#{BASE_URL}/popular", json: {
      user_params: user_params,
      **options
    })
    response.parse
  end
  
  def self.get_personalized_recommendations(user_id:, geo_id:, **options)
    response = HTTP.post("#{BASE_URL}/personalized", json: {
      user_id: user_id,
      geo_id: geo_id,
      **options
    })
    response.parse
  end
end

Monitoring

Service Stats

GET /stats

Returns:

  • Main database statistics (items, likes, clicks, users)
  • Recommendations database statistics (cached data)
  • Cache performance metrics
  • Configuration values

Health Check

GET /health

Tests both database connections and Redis.

Performance Characteristics

With Your Data Scale (16k presents, 800k clicks, 600k likes):

  • Response time: 20-50ms (cache hits), 50-200ms (cache miss)
  • Throughput: 1,000+ RPS
  • Memory usage: ~50MB per worker
  • Database load: 70% reduction vs direct queries

Scales to:

  • 100k presents
  • 5M interactions
  • 10k+ concurrent users

Simple, clean, fast!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages