Extra Big jaffle shop with 90+ models across staging, intermediate, marts, and metrics layers by danyelf · Pull Request #53 · DataRecce/jaffle_shop_duckdb

danyelf · 2026-03-16T22:49:34Z

Summary

Adds 9 new seed datasets (categories, employees, order items, products, promotions, reviews, stores, supply orders, order promotions) with a Python generation script
Adds 9 staging models, 25 intermediate models, 32 mart models, and 15 metrics/reporting models organized in a layered architecture
Includes schema.yml documentation for intermediate and metrics layers
Adds recce.yml with preset checks for validating model changes

Details

This expands the jaffle shop from ~10 models to ~99 nodes, covering product catalog, store operations, supply chain, customer analytics, promotions, reviews, and executive reporting domains. The layered architecture (staging → intermediate → marts → metrics) follows dbt best practices.

Test plan

Run `dbt seed` to load all seed data
Run `dbt run` to build all models
Run `dbt test` to validate schema tests
Verify model counts match expectations (~99 nodes)
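
The first three steps of the test plan can be scripted so that a failure stops the sequence early. This is a minimal sketch, not part of the PR: `run_test_plan` and its injectable `run` parameter are hypothetical names, and it assumes `dbt` is on the `PATH` when called with the default runner.

```python
import subprocess

# The three dbt commands from the test plan, in the order they should run.
TEST_PLAN = [
    ["dbt", "seed"],
    ["dbt", "run"],
    ["dbt", "test"],
]


def run_test_plan(run=subprocess.run):
    """Run each step in order and stop at the first non-zero exit code.

    Returns the list of commands that were actually executed.
    `run` is injectable so the sequencing can be exercised without dbt installed.
    """
    executed = []
    for command in TEST_PLAN:
        result = run(command)
        executed.append(command)
        if result.returncode != 0:
            break  # later steps depend on earlier ones succeeding
    return executed
```

Calling `run_test_plan()` from the project root runs the real commands. The node-count check in the last step is left manual here.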

🤖 Generated with Claude Code

We wanted to move this project to the latest dbt-core version to ensure it operates on a version of dbt-core that has addressed the security issue (CVE-2024-22195) with Jinja2. By association we also had to upgrade the version of dbt-duckdb being used. Tangentially we also upgraded the version of sqlfluff.

* generate new requirements.txt, update readme to use duckdb CLI
* Update version ranges and pinned versions
* Upgrade from Python 3.8 to Python 3.12
* Replace `duckcli` with `duckdb` CLI
* Restore `duckcli` requirement
---------
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>

Expands project from 9 to 99 dbt nodes across 5 layers for Recce demo/testing: 12 seeds, 12 staging, 25 intermediate, 35 mart, and 15 reporting models with diamond dependencies, incrementals, and interesting CTEs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
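
The layer counts can be cross-checked against the figures in the PR summary. A small sketch of the arithmetic, using only numbers stated in this PR (the variable names are illustrative):

```python
# Totals per layer, as stated in the commit message above.
total_by_layer = {"seeds": 12, "staging": 12, "intermediate": 25, "marts": 35, "reporting": 15}

# Nodes this PR adds per layer, as stated in the PR summary.
added_by_layer = {"seeds": 9, "staging": 9, "intermediate": 25, "marts": 32, "reporting": 15}

total_nodes = sum(total_by_layer.values())         # 99
added_nodes = sum(added_by_layer.values())         # 90
preexisting_nodes = total_nodes - added_nodes      # 9
```

The difference of 9 agrees with the "from 9 to 99" wording, and the 90 added nodes agree with the "90+" in the title.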

Create scripts/generate_seeds.py that generates referentially-valid seed data for categories, products, order items, stores, employees, promotions, order-promotions, supply orders, and reviews. Uses random.seed(42) for reproducibility and reads existing raw_orders.csv to ensure review customer_ids match order ownership.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
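
The script itself is not shown on this page. As a rough sketch of the idea only: seed the random generator, derive every foreign key from rows that already exist, and copy each review's customer from the order it refers to. The function name, the table shapes, and the `id` / `user_id` column names for `raw_orders.csv` are assumptions made for illustration.

```python
import csv
import io
import random


def generate_seeds(raw_orders_csv, n_categories=3, n_products=6, n_reviews=5, seed=42):
    """Return small category, product, and review tables whose foreign keys all resolve."""
    random.seed(seed)  # fixed seed, so repeated runs produce identical rows

    # Map each existing order to the customer who placed it.
    # The column names `id` and `user_id` are assumed, not taken from the PR.
    orders = csv.DictReader(io.StringIO(raw_orders_csv))
    order_owner = {int(row["id"]): int(row["user_id"]) for row in orders}

    categories = [{"id": i, "name": f"category_{i}"} for i in range(1, n_categories + 1)]

    # Every product points at a category that exists.
    products = [
        {"id": i, "category_id": random.choice(categories)["id"]}
        for i in range(1, n_products + 1)
    ]

    # Every review points at a real order and copies that order's customer.
    reviews = []
    for i in range(1, n_reviews + 1):
        order_id = random.choice(list(order_owner))
        reviews.append({
            "id": i,
            "order_id": order_id,
            "customer_id": order_owner[order_id],
            "product_id": random.choice(products)["id"],
            "rating": random.randint(1, 5),
        })
    return categories, products, reviews
```

Because child rows are only ever built from parent rows already in memory, relationship tests on the resulting seeds cannot fail on a dangling key.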

Create staging models for products, categories, order_items, stores, employees, promotions, order_promotions, supply_orders, and reviews. Update schema.yml with tests for new models.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Create the intermediate layer with category hierarchy, product-category joins, margin calculations, and enriched order items. Fix stg_categories to reference parent_category_id directly since the seed already types it as INTEGER (empty CSV values become NULL).
Models added:
- int_category_hierarchy (recursive CTE for category tree)
- int_products_with_categories (products joined with category hierarchy)
- int_product_margins (margin and margin_pct calculations)
- int_order_items_with_products (order items enriched with product info)
- int_order_items_enriched (order items with cost/margin data)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
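
The SQL of int_category_hierarchy is not shown here. To illustrate what a recursive CTE over `parent_category_id` does, the sketch below walks a small category tree from the roots (where the parent is NULL) down to the leaves. It uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs without extra dependencies; the `category_id` and `name` columns, the `depth` / `path` outputs, and the function name are assumptions.

```python
import sqlite3


def category_paths(rows):
    """rows: (category_id, name, parent_category_id or None).

    Returns {category_id: (depth, path)} computed with a recursive CTE.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table categories "
        "(category_id integer, name text, parent_category_id integer)"
    )
    conn.executemany("insert into categories values (?, ?, ?)", rows)
    query = """
        with recursive tree as (
            -- anchor: root categories have a NULL parent
            select category_id, name, 0 as depth, name as path
            from categories
            where parent_category_id is null
            union all
            -- recursive step: attach each child to its already-visited parent
            select c.category_id, c.name, t.depth + 1, t.path || ' > ' || c.name
            from categories c
            join tree t on c.parent_category_id = t.category_id
        )
        select category_id, depth, path from tree
    """
    result = {cid: (depth, path) for cid, depth, path in conn.execute(query)}
    conn.close()
    return result
```

For example, `category_paths([(1, "Food", None), (2, "Jaffles", 1)])` gives category 2 a depth of 1 and the path `Food > Jaffles`. The anchor query depends on root rows having a real NULL parent, which is why the typing of empty CSV values matters.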

Create customer-focused intermediate models:
- int_customer_order_history: aggregated order stats per customer
- int_customer_first_last_orders: first/last order dates and tenure
- int_customer_payment_methods: pivoted payment method totals
- int_customer_review_activity: review stats per customer
- int_customer_segments: RFM-based customer segmentation
All 15 intermediate models pass (PASS=15 ERROR=0).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
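
RFM segmentation scores each customer on recency (how recently they ordered), frequency (how many orders), and monetary value (how much they spent), then buckets the combined score. The rules used by int_customer_segments are not shown in this PR, so the thresholds and segment names below are purely illustrative.

```python
from datetime import date


def rfm_scores(last_order, order_count, total_spend, as_of):
    """Score recency, frequency, and monetary value from 1 (low) to 3 (high).

    All cut-offs here are made-up examples, not the model's actual thresholds.
    """
    days_since = (as_of - last_order).days
    recency = 3 if days_since <= 30 else 2 if days_since <= 90 else 1
    frequency = 3 if order_count >= 5 else 2 if order_count >= 2 else 1
    monetary = 3 if total_spend >= 100 else 2 if total_spend >= 30 else 1
    return recency, frequency, monetary


def segment(scores):
    """Map the summed R+F+M score (3 to 9) onto a hypothetical segment label."""
    total = sum(scores)
    if total >= 8:
        return "champion"
    if total >= 5:
        return "regular"
    return "at_risk"
```

A customer who ordered 9 days ago, has 6 orders, and spent 150 scores `(3, 3, 3)` and lands in the top segment under these example cut-offs.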

Store domain: int_store_employees_active, int_store_order_assignments, int_store_revenue, int_store_performance
Supply chain: int_supply_order_costs, int_inventory_movements, int_product_stock_levels (incremental)
Reviews: int_reviews_with_products, int_product_ratings
Analytics: int_promotion_effectiveness
All 25 intermediate models pass (PASS=25 WARN=0 ERROR=0).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move mart-level models (customers, orders, new_orders) along with schema.yml and docs.md into models/marts/. Update dbt_project.yml to configure intermediate, marts, and metrics directory layers with appropriate materialization and doc color settings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Create mart-level models for customer analytics (CLV, segments, retention, acquisition, 360 view, cohorts, review summary) and order analytics (items, returns, discounts, fulfillment, payment status).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Product marts (6): products, product_performance, product_categories, product_inventory, product_reviews, product_profitability
Store marts (5): stores, store_performance, store_staffing, store_inventory, store_rankings
Finance marts (5): payments_fact, revenue_summary, promotion_roi, cost_analysis, gross_margin
Supply chain marts (4): supply_orders_fact, supplier_lead_times, reorder_recommendations, inventory_health
All 35 mart models pass dbt run (PASS=35 ERROR=0).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

10 metric models (daily/weekly/monthly revenue, orders, customer acquisition/retention, product sales, store, promotion, inventory) and 5 report dashboard models (executive, sales, customer, product, store) for analytics layer.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

QMalcolm and others added 30 commits March 11, 2024 11:55

Regenerate requirements.txt to use dbt-core 1.7 and related depende…

479a5d5

…ncies The `requirements.txt` was regenerated by first deleting the existing `requirements.txt` and then running `$ pip-compile`.

Merge pull request dbt-labs#55 from dbt-labs/qmalcolm--CVE-2024-22195-…

9b13827

…exclude-Jinja2-3.1.2-new Upgrade Jinja2 dependency version specification to address CVE-2024-22195

Finish implementing initial support for DuckDB

f51b08d

Revert "Finish implementing initial support for DuckDB"

9746b8b

This reverts commit f51b08d.

Updating requirements.txt

46d939b

Update requirements.txt

d860542

Update requirements.txt

a2a5628

Merge pull request dbt-labs#62 from dbt-labs/requirements

026b12f

Updating requirements.txt to support dbt Core version 1.8

Create CODEOWNERS file with global codeowner

2a1db9e

Merge pull request dbt-labs#73 from dbt-labs/codeowners-create

85046ad

Create CODEOWNERS file

Change SQLFluff dialect

74c9480

Merge pull request dbt-labs#80 from esadek/dialect

682b356

Change SQLFluff dialect

Add DuckDB UI to Readme (dbt-labs#81)

80447e4

simplify extension startup (dbt-labs#83)

def2ee2

* simplify extension startup * strict validation of yaml

Pin GitHub Actions to specific SHAs (10 actions in 1 files)

10a96ce

Updated action references from tags/branches to specific commit SHAs for improved security and reproducibility.
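
One way to check that a workflow file follows this convention is to flag every `uses:` line whose reference is not a full 40-character commit SHA. This is a rough sketch, not the tooling used for the commit above; the function name is hypothetical, and local actions referenced as `./path` would also be flagged by this simple pattern.

```python
import re

# Any line that references an action via `uses:`.
USES = re.compile(r"^\s*-?\s*uses:\s*\S+")
# A reference pinned to a full 40-hex-character commit SHA,
# optionally followed by a trailing comment such as `# v4`.
PINNED = re.compile(r"^\s*-?\s*uses:\s*\S+@[0-9a-f]{40}(\s+#.*)?\s*$")


def unpinned_actions(workflow_text):
    """Return the `uses:` lines that are not pinned to a full commit SHA."""
    return [
        line.strip()
        for line in workflow_text.splitlines()
        if USES.match(line) and not PINNED.match(line)
    ]
```

Tags and branch names can be moved to point at different code later, whereas a commit SHA cannot, which is the reproducibility argument made in the commit message.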

Merge pull request dbt-labs#85 from dbt-labs/pin-github-actions-17619…

8d715f3

…44130921 Pin GitHub Actions to specific SHAs (10 actions in 1 files)

fixed bad link

d1437a1

Merge pull request dbt-labs#87 from walter9388/bugfix-bad-overview-re…

db6bffa

…po-link bugfix: updated overview repo link

Move to Python 3.13

f5e92e3

Merge pull request dbt-labs#89 from pgoslatara/python-3.13

3890477

Move to Python 3.13

Add 5 intermediate order domain models

b0b565f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

danyelf and others added 6 commits February 23, 2026 21:56

Add schema.yml files for intermediate and metrics layers

0e0385f

Define model descriptions and data tests (unique, not_null) for all 25 intermediate models and 15 metrics/reporting models. All 15 tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
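
For readers unfamiliar with the two data tests named here: `not_null` fails on rows where the column is NULL, and `unique` fails on non-NULL values that appear more than once. The sketch below mirrors that logic over in-memory rows; it is an illustration in plain Python, not how dbt runs the tests (dbt compiles them to SQL), and the function names are made up.

```python
from collections import Counter


def not_null_failures(rows, column):
    """Rows that would fail a not_null test on `column`."""
    return [row for row in rows if row[column] is None]


def unique_failures(rows, column):
    """Values that would fail a unique test on `column`.

    NULLs are skipped here; catching them is the job of not_null.
    """
    counts = Counter(row[column] for row in rows if row[column] is not None)
    return sorted(value for value, n in counts.items() if n > 1)
```

A test passes when its failure list is empty, which is why the two tests are usually declared together on a model's key column.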

brave new world

de50d60

Update seed data

71be58f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

danyelf changed the title ~~Expand jaffle shop with 90+ models across staging, intermediate, marts, and metrics layers~~ Extra Big jaffle shop with 90+ models across staging, intermediate, marts, and metrics layers Mar 16, 2026

danyelf wants to merge 36 commits into duckdb from mega

15 participants
