Extra Big jaffle shop with 90+ models across staging, intermediate, marts, and metrics layers #53
We wanted to move this project to the latest dbt-core version to ensure it operates on a version of dbt-core that has addressed the security issue (CVE-2024-22195) with Jinja2. By association we also had to upgrade the version of dbt-duckdb being used. Tangentially we also upgraded the version of sqlfluff.
…ncies The `requirements.txt` was regenerated by first deleting the existing `requirements.txt` and then running `$ pip-compile`.
…exclude-Jinja2-3.1.2-new Upgrade Jinja2 dependency version specification to address CVE-2024-22195
This reverts commit f51b08d.
Updating requirements.txt to support dbt Core version 1.8
Create CODEOWNERS file
Change SQLFluff dialect
* generate new requirements.txt, update readme to use duckdb CLI
* Update version ranges and pinned versions
* Upgrade from Python 3.8 to Python 3.12
* Replace `duckcli` with `duckdb` CLI
* Restore `duckcli` requirement

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* simplify extension startup
* strict validation of yaml
Updated action references from tags/branches to specific commit SHAs for improved security and reproducibility.
…44130921 Pin GitHub Actions to specific SHAs (10 actions in 1 files)
…po-link bugfix: updated overview repo link
Move to Python 3.13
Expands project from 9 to 99 dbt nodes across 5 layers for Recce demo/testing: 12 seeds, 12 staging, 25 intermediate, 35 mart, and 15 reporting models with diamond dependencies, incrementals, and interesting CTEs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create scripts/generate_seeds.py that generates referentially-valid seed data for categories, products, order items, stores, employees, promotions, order-promotions, supply orders, and reviews. Uses random.seed(42) for reproducibility and reads existing raw_orders.csv to ensure review customer_ids match order ownership. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
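For reviewers, the referential-validity idea described in this commit can be sketched roughly as follows. This is not the contents of scripts/generate_seeds.py: the function name `generate_reviews`, the inline CSV sample, and the column names (`id`, `user_id`) are assumptions for illustration, and a local `random.Random(42)` is used here in place of the module-level `random.seed(42)` the commit mentions.

```python
import csv
import io
import random


def generate_reviews(orders, n_reviews, seed=42):
    """Build review rows whose order_id/customer_id pairs are copied from `orders`.

    Hypothetical helper: each review is attached to an existing order, so the
    customer on the review is always the customer who owns that order.
    """
    rng = random.Random(seed)  # fixed seed so every run yields identical rows
    reviews = []
    for review_id in range(1, n_reviews + 1):
        order = rng.choice(orders)
        reviews.append(
            {
                "id": review_id,
                "order_id": order["id"],
                "customer_id": order["user_id"],
                "rating": rng.randint(1, 5),
            }
        )
    return reviews


# Stand-in for raw_orders.csv (sample rows and column names are assumed).
RAW_ORDERS_CSV = """id,user_id,order_date,status
1,1,2018-01-01,returned
2,3,2018-01-02,completed
3,94,2018-01-04,completed
"""

orders = list(csv.DictReader(io.StringIO(RAW_ORDERS_CSV)))
reviews = generate_reviews(orders, n_reviews=5)
```

Because every review copies both keys from a single order row, a relationship test from reviews to orders should hold by construction, and the fixed seed keeps regenerated seed files stable between runs.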
Create staging models for products, categories, order_items, stores, employees, promotions, order_promotions, supply_orders, and reviews. Update schema.yml with tests for new models. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create the intermediate layer with category hierarchy, product-category joins, margin calculations, and enriched order items. Fix stg_categories to reference parent_category_id directly since the seed already types it as INTEGER (empty CSV values become NULL).

Models added:
- int_category_hierarchy (recursive CTE for category tree)
- int_products_with_categories (products joined with category hierarchy)
- int_product_margins (margin and margin_pct calculations)
- int_order_items_with_products (order items enriched with product info)
- int_order_items_enriched (order items with cost/margin data)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
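A rough sketch of the recursive-CTE pattern that a model like int_category_hierarchy relies on is shown below. It runs against an in-memory SQLite database through Python's standard library rather than DuckDB, and the table layout, sample categories, and the `depth` and `path` output columns are assumptions, not the model's actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories ("
    "category_id INTEGER, name TEXT, parent_category_id INTEGER)"
)
# Hypothetical sample data: root categories have a NULL parent, matching the
# commit note that empty CSV values become NULL.
conn.executemany(
    "INSERT INTO categories VALUES (?, ?, ?)",
    [
        (1, "Food", None),
        (2, "Jaffles", 1),
        (3, "Sweet Jaffles", 2),
        (4, "Drinks", None),
    ],
)

HIERARCHY_SQL = """
WITH RECURSIVE tree AS (
    -- Anchor member: start from the root categories.
    SELECT category_id, name, 0 AS depth, name AS path
    FROM categories
    WHERE parent_category_id IS NULL
    UNION ALL
    -- Recursive member: attach each child to its already-visited parent.
    SELECT c.category_id, c.name, t.depth + 1, t.path || ' > ' || c.name
    FROM categories AS c
    JOIN tree AS t ON c.parent_category_id = t.category_id
)
SELECT category_id, depth, path
FROM tree
ORDER BY category_id
"""

rows = conn.execute(HIERARCHY_SQL).fetchall()
```

With this sample data, "Sweet Jaffles" comes back at depth 2 with the path "Food > Jaffles > Sweet Jaffles", which is the kind of flattened tree a downstream product-category join can consume.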
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create customer-focused intermediate models:
- int_customer_order_history: aggregated order stats per customer
- int_customer_first_last_orders: first/last order dates and tenure
- int_customer_payment_methods: pivoted payment method totals
- int_customer_review_activity: review stats per customer
- int_customer_segments: RFM-based customer segmentation

All 15 intermediate models pass (PASS=15 ERROR=0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Store domain: int_store_employees_active, int_store_order_assignments, int_store_revenue, int_store_performance
Supply chain: int_supply_order_costs, int_inventory_movements, int_product_stock_levels (incremental)
Reviews: int_reviews_with_products, int_product_ratings
Analytics: int_promotion_effectiveness

All 25 intermediate models pass (PASS=25 WARN=0 ERROR=0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
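As a minimal sketch of what RFM-based segmentation generally looks like (recency, frequency, monetary scores combined into a label), consider the following. The cutoffs, segment names, and the `segment_customer` function are all hypothetical and are not taken from int_customer_segments.

```python
from datetime import date


def score(value, cutoffs):
    """Return 1 plus the number of cutoffs that `value` meets or exceeds."""
    return 1 + sum(value >= cutoff for cutoff in cutoffs)


def segment_customer(last_order, order_count, total_spend, as_of):
    """Assign a segment from recency, frequency, and monetary scores (1 to 3 each).

    All thresholds below are illustrative assumptions.
    """
    recency_days = (as_of - last_order).days
    # Recency is inverted: fewer days since the last order earns a higher score.
    r = 4 - score(recency_days, (30, 90))
    f = score(order_count, (2, 5))
    m = score(total_spend, (50, 200))
    total = r + f + m
    if total >= 8:
        return "champion"
    if total >= 5:
        return "regular"
    return "at_risk"


as_of = date(2018, 4, 9)
recent_big_spender = segment_customer(date(2018, 4, 1), 6, 250.0, as_of)
middling_customer = segment_customer(date(2018, 3, 1), 3, 80.0, as_of)
lapsed_customer = segment_customer(date(2018, 1, 1), 1, 10.0, as_of)
```

In a dbt model the same scoring would more likely be written as `CASE` expressions or `NTILE` window functions over the order-history and first/last-order models; the Python form is only meant to make the scoring logic easy to read.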
Move mart-level models (customers, orders, new_orders) along with schema.yml and docs.md into models/marts/. Update dbt_project.yml to configure intermediate, marts, and metrics directory layers with appropriate materialization and doc color settings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create mart-level models for customer analytics (CLV, segments, retention, acquisition, 360 view, cohorts, review summary) and order analytics (items, returns, discounts, fulfillment, payment status). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Product marts (6): products, product_performance, product_categories, product_inventory, product_reviews, product_profitability
Store marts (5): stores, store_performance, store_staffing, store_inventory, store_rankings
Finance marts (5): payments_fact, revenue_summary, promotion_roi, cost_analysis, gross_margin
Supply chain marts (4): supply_orders_fact, supplier_lead_times, reorder_recommendations, inventory_health

All 35 mart models pass dbt run (PASS=35 ERROR=0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
10 metric models (daily/weekly/monthly revenue, orders, customer acquisition/retention, product sales, store, promotion, inventory) and 5 report dashboard models (executive, sales, customer, product, store) for analytics layer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Define model descriptions and data tests (unique, not_null) for all 25 intermediate models and 15 metrics/reporting models. All 15 tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Details
This expands the jaffle shop from ~10 models to ~99 nodes, covering product catalog, store operations, supply chain, customer analytics, promotions, reviews, and executive reporting domains. The layered architecture (staging → intermediate → marts → metrics) follows dbt best practices.
Test plan
- `dbt seed` to load all seed data
- `dbt run` to build all models
- `dbt test` to validate schema tests

🤖 Generated with Claude Code