Skip to content

[GOV-G1] Infrastructure Cost Attribution Framework #212

@ericsocrat

Description

@ericsocrat

Infrastructure Cost Attribution Framework

Workstream: G — Cost & Resource Monitoring
Parent: #195 (Execution Governance Blueprint)
Classification: Optional / Premature
Effort: 2 days
Phase: Scale (Phase 5)


⚠️ Over-Engineering Warning

This issue is classified as Optional/Premature per the governance blueprint (#195).

The platform does not yet have enough operational scale to justify a dedicated cost attribution framework. This issue should NOT be started until:

  1. Monthly Supabase costs exceed $100+
  2. CI pipeline runtime exceeds 30 min/run
  3. Event ingestion volume exceeds 100k events/month
  4. At least 3 country expansions are live

Current recommendation: DEFER indefinitely. Revisit when any threshold above is breached.

If premature implementation is detected, this issue should be closed with a "deferred — premature" label.


Problem Statement (For Future Reference)

As the platform scales beyond Poland to multiple countries (#189-#193 multi-country support), infrastructure costs will grow across three dimensions:

  1. Database query costs — Supabase compute usage from complex scoring, search, and provenance queries
  2. CI/CD runtime costs — GitHub Actions minutes for expanded test suite and multi-country pipelines
  3. Event ingestion costs — Volume-based costs for analytics event storage ([ARCH] Event-Based Product Intelligence Layer — Behavioral Telemetry, ML Readiness & A/B Infrastructure #190)

Without attribution, cost increases cannot be traced to specific workstreams, making optimization decisions uninformed.

Scope (When Activated)

1. Query Cost Attribution

-- Track query cost by domain using pg_stat_statements
-- Map queryid → domain using naming conventions from #197

CREATE OR REPLACE VIEW v_query_cost_by_domain AS
SELECT
  CASE
    WHEN query ILIKE '%scoring%' OR query ILIKE '%health_score%' THEN 'scoring'
    WHEN query ILIKE '%search%' OR query ILIKE '%tsvector%' THEN 'search'
    WHEN query ILIKE '%provenance%' THEN 'provenance'
    WHEN query ILIKE '%event%' OR query ILIKE '%analytics%' THEN 'events'
    WHEN query ILIKE '%flag%' OR query ILIKE '%feature%' THEN 'flags'
    ELSE 'other'
  END AS domain,
  count(*) AS query_count,
  ROUND(SUM(total_exec_time)::NUMERIC, 2) AS total_exec_ms,
  ROUND(SUM(total_exec_time)::NUMERIC / NULLIF(SUM(calls), 0), 2) AS avg_ms_per_call,
  SUM(shared_blks_read) AS total_disk_reads,
  SUM(shared_blks_hit) AS total_cache_hits
FROM pg_stat_statements
WHERE calls > 0
GROUP BY 1
ORDER BY total_exec_ms DESC;

2. CI Runtime Tracking

# Concept: Add timing annotations to CI workflow
# Track per-job duration and map to workstreams

# Example reporting step:
- name: Report CI Timing
  run: |
    echo "## CI Runtime Report" >> $GITHUB_STEP_SUMMARY
    echo "| Job | Duration | Workstream |" >> $GITHUB_STEP_SUMMARY
    echo "|---|---|---|" >> $GITHUB_STEP_SUMMARY
    echo "| SQL QA Tests | ${SQL_QA_DURATION}s | testing |" >> $GITHUB_STEP_SUMMARY
    echo "| Frontend Tests | ${FRONTEND_DURATION}s | frontend |" >> $GITHUB_STEP_SUMMARY
    echo "| E2E Tests | ${E2E_DURATION}s | testing |" >> $GITHUB_STEP_SUMMARY
    echo "| Migration Check | ${MIGRATION_DURATION}s | migration |" >> $GITHUB_STEP_SUMMARY

3. Event Ingestion Cost Monitoring

-- Track event volume by type for cost projection
CREATE OR REPLACE VIEW v_event_volume AS
SELECT
  event_type,
  DATE_TRUNC('day', created_at) AS event_date,
  count(*) AS event_count,
  pg_size_pretty(SUM(pg_column_size(payload))::BIGINT) AS payload_size
FROM events  -- table from #190
GROUP BY 1, 2
ORDER BY event_date DESC, event_count DESC;

-- Monthly projection
CREATE OR REPLACE VIEW v_event_cost_projection AS
SELECT
  DATE_TRUNC('month', created_at) AS month,
  count(*) AS total_events,
  pg_size_pretty(SUM(pg_column_size(payload))::BIGINT) AS total_payload_size,
  -- Rough cost estimate: Supabase Pro = included storage, but compute matters
  ROUND(count(*) / 1000000.0 * 5, 2) AS estimated_monthly_cost_usd  -- $5 per million events (adjust)
FROM events
GROUP BY 1
ORDER BY 1 DESC;

4. Cost Dashboard Card (for #206)

// Addition to Admin Health Dashboard (#206)
// Card: "Infrastructure Costs"
// Shows:
// - Query compute: XX ms total this week (by domain)
// - CI runtime: XX min this week (trend)
// - Event volume: XX events this month (projection)
// - Storage: XX GB used / XX GB plan limit

// Thresholds:
// Green: <50% of budget
// Amber: 50-80% of budget
// Red: >80% of budget

Acceptance Criteria (When Activated)

  • v_query_cost_by_domain view created and accurate
  • CI runtime tracking added to GitHub Actions workflow
  • v_event_volume and v_event_cost_projection views created
  • Cost dashboard card added to admin health dashboard ([GOV-D2] Admin Governance Dashboard Suite #206)
  • Monthly cost report template documented
  • Budget thresholds defined and alerts configured

Activation Triggers

This issue transitions from "Deferred" to "Active" when ANY of these thresholds is breached:

Trigger Threshold How to Detect
Monthly Supabase cost >$100 Supabase billing dashboard
CI pipeline runtime >30 min/run GitHub Actions summary
Event volume >100k events/month SELECT count(*) FROM events WHERE created_at > NOW() - INTERVAL '30 days'
Country count ≥3 countries live SELECT DISTINCT country FROM products
Database size >5GB SELECT pg_size_pretty(pg_database_size(current_database()))

Dependencies

Metadata

Metadata

Assignees

No one assigned

    Labels

    deferredPostponed to a future sprintgovernanceProject governance / processmonitoringMonitoring and alerting

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions