Extra Big jaffle shop with 90+ models across staging, intermediate, marts, and metrics layers#53

Open
danyelf wants to merge 36 commits into duckdb from mega

@danyelf commented Mar 16, 2026

Summary

  • Adds 9 new seed datasets (categories, employees, order items, products, promotions, reviews, stores, supply orders, order promotions) with a Python generation script
  • Adds 9 staging models, 25 intermediate models, 32 mart models, and 15 metrics/reporting models organized in a layered architecture
  • Includes schema.yml documentation for intermediate and metrics layers
  • Adds recce.yml with preset checks for validating model changes

Details

This expands the jaffle shop from ~10 models to ~99 nodes, covering product catalog, store operations, supply chain, customer analytics, promotions, reviews, and executive reporting domains. The layered architecture (staging → intermediate → marts → metrics) follows dbt best practices.

Test plan

  • Run dbt seed to load all seed data
  • Run dbt run to build all models
  • Run dbt test to validate schema tests
  • Verify model counts match expectations (~99 nodes)
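
The node-count check in the last step could be scripted along these lines. This is a minimal sketch, not part of the PR: it assumes dbt has already written `target/manifest.json` (the default artifact location) and relies on the manifest's top-level `nodes` mapping, where each entry carries a `resource_type`. The helper names `count_nodes_by_type` and `load_manifest` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def count_nodes_by_type(manifest: dict) -> Counter:
    """Tally manifest nodes by resource type (model, seed, test, ...)."""
    nodes = manifest.get("nodes", {})
    return Counter(node["resource_type"] for node in nodes.values())


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Read the manifest from disk; assumes a prior dbt invocation created it."""
    return json.loads(Path(path).read_text())
```

Calling `count_nodes_by_type(load_manifest())` after `dbt run` and adding the `model` and `seed` counts gives a figure to compare against the expected ~99. Schema tests also appear under `nodes` with resource type `test`, so they are left out of that sum.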

🤖 Generated with Claude Code

QMalcolm and others added 30 commits March 11, 2024 11:55
We wanted to move this project to the latest dbt-core version to ensure
it operates on a version of dbt-core that has addressed the security
issue (CVE-2024-22195) with Jinja2. By association we also had to upgrade
the version of dbt-duckdb being used. Tangentially we also upgraded the
version of sqlfluff.
…ncies

The `requirements.txt` was regenerated by first deleting the existing
`requirements.txt` and then running `$ pip-compile`.
…exclude-Jinja2-3.1.2-new

Upgrade Jinja2 dependency version specification to address CVE-2024-22195
Updating requirements.txt to support dbt Core version 1.8
* generate new requirements.txt, update readme to use duckdb CLI

* Update version ranges and pinned versions

* Upgrade from Python 3.8 to Python 3.12

* Replace `duckcli` with `duckdb` CLI

* Restore `duckcli` requirement

---------

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* simplify extension startup

* strict validation of yaml
Updated action references from tags/branches to specific commit SHAs for improved security and reproducibility.
…44130921

Pin GitHub Actions to specific SHAs (10 actions in 1 file)
…po-link

bugfix: updated overview repo link
Expands project from 9 to 99 dbt nodes across 5 layers for Recce
demo/testing: 12 seeds, 12 staging, 25 intermediate, 35 mart, and
15 reporting models with diamond dependencies, incrementals, and
interesting CTEs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create scripts/generate_seeds.py that generates referentially-valid
seed data for categories, products, order items, stores, employees,
promotions, order-promotions, supply orders, and reviews. Uses
random.seed(42) for reproducibility and reads existing raw_orders.csv
to ensure review customer_ids match order ownership.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
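
To illustrate the idea in the commit above (reproducible, referentially-valid seed rows), here is a minimal sketch. It is not the contents of `scripts/generate_seeds.py`: the function name `generate_reviews`, the review columns, and the assumption that order rows expose `id` and `user_id` are all hypothetical. It also uses a local `random.Random(42)` instead of the module-level `random.seed(42)` the commit describes.

```python
import random


def generate_reviews(orders, n_reviews, seed=42):
    """Build review rows whose order/customer pairs are copied from real orders."""
    rng = random.Random(seed)  # fixed seed so repeated runs give identical rows
    reviews = []
    for review_id in range(1, n_reviews + 1):
        order = rng.choice(orders)  # pick an existing order to review
        reviews.append(
            {
                "review_id": review_id,
                "order_id": order["id"],
                # Take the customer from the chosen order so ownership matches.
                "customer_id": order["user_id"],
                "rating": rng.randint(1, 5),
            }
        )
    return reviews
```

Because every review copies both keys from one order row, a review can never point at a customer who did not place that order.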
Create staging models for products, categories, order_items, stores,
employees, promotions, order_promotions, supply_orders, and reviews.
Update schema.yml with tests for new models.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create the intermediate layer with category hierarchy, product-category
joins, margin calculations, and enriched order items. Fix stg_categories
to reference parent_category_id directly since the seed already types it
as INTEGER (empty CSV values become NULL).

Models added:
- int_category_hierarchy (recursive CTE for category tree)
- int_products_with_categories (products joined with category hierarchy)
- int_product_margins (margin and margin_pct calculations)
- int_order_items_with_products (order items enriched with product info)
- int_order_items_enriched (order items with cost/margin data)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
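
For readers unfamiliar with recursive CTEs, the following sketch shows the general shape such a category-tree query can take. It is an assumption-laden illustration rather than the SQL in `int_category_hierarchy`: it runs against Python's built-in `sqlite3` instead of DuckDB so that it is self-contained, and apart from `parent_category_id` the column names (`category_id`, `category_name`, `depth`, `path`) are hypothetical.

```python
import sqlite3

# Anchor rows are root categories (no parent); each recursive step joins
# children onto the rows found so far, extending depth and path.
HIERARCHY_SQL = """
with recursive category_tree as (
    select category_id, category_name, parent_category_id,
           0 as depth, category_name as path
    from categories
    where parent_category_id is null
    union all
    select c.category_id, c.category_name, c.parent_category_id,
           t.depth + 1, t.path || ' > ' || c.category_name
    from categories c
    join category_tree t on c.parent_category_id = t.category_id
)
select category_id, depth, path from category_tree order by category_id
"""


def category_hierarchy(rows):
    """Return (category_id, depth, path) tuples for the given category rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table categories ("
        "category_id integer, category_name text, parent_category_id integer)"
    )
    conn.executemany("insert into categories values (?, ?, ?)", rows)
    result = conn.execute(HIERARCHY_SQL).fetchall()
    conn.close()
    return result
```

Root categories carry a NULL parent, which matches the commit's note that empty CSV values become NULL in the seed.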
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create customer-focused intermediate models:
- int_customer_order_history: aggregated order stats per customer
- int_customer_first_last_orders: first/last order dates and tenure
- int_customer_payment_methods: pivoted payment method totals
- int_customer_review_activity: review stats per customer
- int_customer_segments: RFM-based customer segmentation

All 15 intermediate models pass (PASS=15 ERROR=0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
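
As a rough illustration of what RFM-based segmentation means, the sketch below scores a customer on recency, frequency, and monetary value and maps the total to a segment. The thresholds, segment names, and the function `rfm_segment` are invented for this example and are not taken from `int_customer_segments`.

```python
from datetime import date


def rfm_segment(last_order: date, order_count: int, total_spend: float, as_of: date) -> str:
    """Score recency, frequency, and monetary value 1-3 each, then bucket the sum."""
    days_since = (as_of - last_order).days
    # Hypothetical cut-offs; a real model would likely derive them from the data.
    recency = 3 if days_since <= 30 else 2 if days_since <= 90 else 1
    frequency = 3 if order_count >= 5 else 2 if order_count >= 2 else 1
    monetary = 3 if total_spend >= 100 else 2 if total_spend >= 30 else 1
    score = recency + frequency + monetary
    if score >= 8:
        return "champion"
    if score >= 5:
        return "regular"
    return "at_risk"
```

In a dbt model the same logic would typically be written as `case` expressions over the aggregated order history; the Python form is used here only to keep the example self-contained.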
Store domain: int_store_employees_active, int_store_order_assignments,
int_store_revenue, int_store_performance
Supply chain: int_supply_order_costs, int_inventory_movements,
int_product_stock_levels (incremental)
Reviews: int_reviews_with_products, int_product_ratings
Analytics: int_promotion_effectiveness

All 25 intermediate models pass (PASS=25 WARN=0 ERROR=0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move mart-level models (customers, orders, new_orders) along with
schema.yml and docs.md into models/marts/. Update dbt_project.yml
to configure intermediate, marts, and metrics directory layers with
appropriate materialization and doc color settings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
danyelf and others added 6 commits February 23, 2026 21:56
Create mart-level models for customer analytics (CLV, segments, retention,
acquisition, 360 view, cohorts, review summary) and order analytics (items,
returns, discounts, fulfillment, payment status).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Product marts (6): products, product_performance, product_categories,
product_inventory, product_reviews, product_profitability
Store marts (5): stores, store_performance, store_staffing,
store_inventory, store_rankings
Finance marts (5): payments_fact, revenue_summary, promotion_roi,
cost_analysis, gross_margin
Supply chain marts (4): supply_orders_fact, supplier_lead_times,
reorder_recommendations, inventory_health

All 35 mart models pass dbt run (PASS=35 ERROR=0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
10 metric models (daily/weekly/monthly revenue, orders, customer acquisition/retention, product sales, store, promotion, inventory) and 5 report dashboard models (executive, sales, customer, product, store) for analytics layer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Define model descriptions and data tests (unique, not_null) for all 25
intermediate models and 15 metrics/reporting models. All 15 tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@danyelf danyelf changed the title Expand jaffle shop with 90+ models across staging, intermediate, marts, and metrics layers Extra Big jaffle shop with 90+ models across staging, intermediate, marts, and metrics layers Mar 16, 2026