
(Demo) Fix: customer lifetime value calculation in customers#1

Open
even-wei wants to merge 2 commits into main from fix/customer-lifetime-value

@even-wei
Collaborator

@even-wei even-wei commented Nov 27, 2023

👋 This is a demo pull request of dbt model changes, powered by Recce to provide proof of correctness during development and change review.
This is still early and we'd love your feedback!

Description & motivation

The customer lifetime value (CLV) in the customers model incorrectly included orders that were not yet completed, which could lead to inaccurate business insights.
I modified the CLV calculation within customers to consider only "completed" orders.
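
For reviewers who want to see the effect of the filter in isolation, here is a minimal sketch rather than the actual model code. It uses Python's built-in `sqlite3` so it runs on its own. The `orders` and `payments` tables, their columns, the `'placed'` status, and the helper `lifetime_values` are assumptions made for illustration; only the "completed" status comes from this PR.

```python
import sqlite3


def lifetime_values(conn, completed_only):
    """Sum payment amounts per customer, optionally keeping completed orders only."""
    # Hypothetical schema: the real customers model may join and filter differently.
    status_filter = "WHERE o.status = 'completed'" if completed_only else ""
    sql = f"""
        SELECT o.customer_id, SUM(p.amount) AS customer_lifetime_value
        FROM orders AS o
        JOIN payments AS p ON p.order_id = o.order_id
        {status_filter}
        GROUP BY o.customer_id
        ORDER BY o.customer_id
    """
    return dict(conn.execute(sql).fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, status TEXT);
    CREATE TABLE payments (order_id INTEGER, amount INTEGER);
    -- Customer 1 has one completed and one not-yet-completed order.
    INSERT INTO orders VALUES (1, 1, 'completed'), (2, 1, 'placed'), (3, 2, 'completed');
    INSERT INTO payments VALUES (1, 100), (2, 50), (3, 70);
""")

before = lifetime_values(conn, completed_only=False)
after = lifetime_values(conn, completed_only=True)
print(before, after)
```

In this toy data only customer 1 has a non-completed order, so only that customer's value drops once the filter is applied.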

To-do before merge

Notify Stakeholders

  • Ask Sarah to review the customer segment impacts on marketing (see impact section below)

Lineage:

Lineage DAG Diff

image

Modified customers model to adjust customer_lifetime_value calculation.

 


Validation of models:

As expected, the customer lifetime value has changed. The average CLV has dropped, and the change affects about 98% of customers.

Query Diff: customers

image

The customer_lifetime_value has decreased for certain customers. This is expected given the change to the CLV calculation.

 


Profile Diff: customers

image

Min/max remain unchanged; the average value has dropped from 2758.6 to 1871.76, as expected.

 


Value Diff: customers

image

Only 1.19% match on customer_lifetime_value.

 


Magic metric of customers

image

Our magic metric, average lifetime value, goes down as well.

SQL

SELECT
    DATE_TRUNC('week', first_order) AS first_order_week,
    AVG(customer_lifetime_value) AS avg_lifetime_value
FROM
    {{ ref("customers") }}
WHERE first_order IS NOT NULL
GROUP BY
    first_order_week
ORDER BY
    first_order_week;

Impact considerations:

The adjustment to CLV values has resulted in a 26.45% change in the customer value_segment column. This impact is also expected.

Value Diff: customer_segments

image

As expected, customer_lifetime_value matches at 1.19%, the same rate as in the parent table.
value_segment has changed for 26.45% of rows.

 


Query Diff: customer_segments

image

value_segment has changed for some of those customers with a decreased CLV. This is expected and I'll notify the business team.


Top-k diff: value_segment

image

We've seen a 17% drop in high-value customers. We need to discuss with stakeholders whether an adjustment to the threshold is necessary.
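
To make the top-k comparison concrete, the following is a rough sketch of how per-category counts can be compared between the base and current environments. It is not Recce's implementation. The function `top_k_diff`, the segment labels, and the sample data are all hypothetical.

```python
from collections import Counter


def top_k_diff(base_values, current_values, k=3):
    """Return (category, base_count, current_count, delta) for the k largest categories."""
    base_counts = Counter(base_values)
    current_counts = Counter(current_values)
    # Rank categories by their larger count in either environment, then by name.
    categories = sorted(
        set(base_counts) | set(current_counts),
        key=lambda c: (-max(base_counts[c], current_counts[c]), c),
    )
    return [
        (c, base_counts[c], current_counts[c], current_counts[c] - base_counts[c])
        for c in categories[:k]
    ]


# Hypothetical value_segment columns before and after the CLV change.
base = ["High Value"] * 6 + ["Medium Value"] * 3 + ["Low Value"] * 1
current = ["High Value"] * 5 + ["Medium Value"] * 3 + ["Low Value"] * 2

rows = top_k_diff(base, current)
for row in rows:
    print(row)
```

A negative delta for the top segment is the kind of shift that prompts the threshold discussion mentioned above.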

There is no change in customer_order_pattern.

Value diff: customer_order_pattern

image

There is no change.

Changes to existing models:

N/A

Checklist:

This checklist is mostly useful as a reminder of small things that can easily be
forgotten – it is meant as a helpful tool rather than hoops to jump through.
Put an x in all the items that apply, make notes next to any that haven't been
addressed, and remove any items that are not relevant to this PR.

  • My pull request represents one logical piece of work.
  • My commits are related to the pull request and look clean.
  • My SQL follows the style guide.
  • I have materialized my models appropriately.
  • I have added appropriate tests and documentation to any new models.
  • I have updated the README file.

@even-wei even-wei changed the title from "Correct customer lifetime value calculation in customers" to "Fix: customer lifetime value calculation in customers" Jan 16, 2024
@clkao

clkao commented Jan 16, 2024

Should we also check customer_order_pattern?

@clkao clkao changed the title from "Fix: customer lifetime value calculation in customers" to "(Demo) Fix: customer lifetime value calculation in customers" Jan 16, 2024
@even-wei
Collaborator Author

There is no change in customer_order_pattern.
I will add it and update the checklist.
image

@even-wei
Collaborator Author

Since the sources have been updated in the production environment, I've rebased the PR, rerun the checks, and updated the body. Most of the checks yield the same results, but I noticed that 99% of the value segments are classified as 'high value' (refer to the Top-k diff above). I plan to review the threshold next week.

@wcchang1115 wcchang1115 force-pushed the fix/customer-lifetime-value branch 2 times, most recently from eedae39 to 94036fd Compare April 9, 2024 08:20
@even-wei even-wei force-pushed the fix/customer-lifetime-value branch from 94036fd to 183171a Compare April 22, 2024 03:46
@kentwelcome kentwelcome force-pushed the fix/customer-lifetime-value branch from 183171a to f35cbd1 Compare April 26, 2024 09:01
@DataRecce DataRecce deleted a comment from github-actions bot May 14, 2024
@kentwelcome kentwelcome force-pushed the fix/customer-lifetime-value branch from f35cbd1 to 231f592 Compare May 14, 2024 02:40
@popcornylu popcornylu force-pushed the fix/customer-lifetime-value branch from 231f592 to 4080960 Compare May 28, 2024 06:34
@popcornylu popcornylu force-pushed the fix/customer-lifetime-value branch from 4080960 to 01a9ef1 Compare May 30, 2024 09:47
@popcornylu popcornylu force-pushed the fix/customer-lifetime-value branch from 01a9ef1 to 2ff3a39 Compare June 5, 2024 08:19
@kentwelcome kentwelcome force-pushed the fix/customer-lifetime-value branch 2 times, most recently from 0c99b5a to 7ca327e Compare June 6, 2024 18:30
@wcchang1115 wcchang1115 force-pushed the fix/customer-lifetime-value branch 2 times, most recently from e9fbba7 to 383a007 Compare June 20, 2024 03:43
@wcchang1115 wcchang1115 force-pushed the fix/customer-lifetime-value branch from 383a007 to 398ac78 Compare July 4, 2024 10:12
@kentwelcome kentwelcome force-pushed the fix/customer-lifetime-value branch from 398ac78 to 75a349b Compare July 5, 2024 07:31
@wcchang1115 wcchang1115 force-pushed the fix/customer-lifetime-value branch 2 times, most recently from 75a349b to d5fe9c4 Compare August 13, 2024 07:40
@kentwelcome kentwelcome force-pushed the fix/customer-lifetime-value branch from d5fe9c4 to 73dce86 Compare September 20, 2024 09:04
@even-wei even-wei force-pushed the fix/customer-lifetime-value branch from 73dce86 to 4526216 Compare October 21, 2024 02:28
@wcchang1115 wcchang1115 force-pushed the fix/customer-lifetime-value branch from 4526216 to b8709ec Compare October 24, 2024 03:37
@even-wei even-wei force-pushed the fix/customer-lifetime-value branch 2 times, most recently from 083a55b to 8337818 Compare October 29, 2024 05:05
@kentwelcome kentwelcome force-pushed the fix/customer-lifetime-value branch 6 times, most recently from 58feedb to 5ec10dd Compare November 5, 2024 07:48
@even-wei even-wei mentioned this pull request Jan 10, 2025
@wcchang1115 wcchang1115 force-pushed the fix/customer-lifetime-value branch from 5ec10dd to 637a046 Compare February 20, 2025 03:52
@recce-cloud

recce-cloud bot commented Feb 28, 2026

Summary

PR #1 modifies the customers model to fix the customer lifetime value (CLV) calculation to include only completed orders. The change is isolated to a single file (models/customers.sql) with minimal syntax changes, but results in a 32.1% reduction in average customer lifetime values across the dataset. Lineage analysis reveals 2 downstream models (customer_order_pattern and customer_segments) will be impacted by the recalculated values.


Key Changes

  • Model Modified: customers table

    • File: models/customers.sql (1 addition, 0 deletions)
    • Change: Fixed CLV calculation to exclude non-completed orders
    • Row count impact: Stable at 1,856 records (no additions/deletions)
  • Customer Lifetime Value Recalculation:

    • Profile diff shows average CLV reduced from 2,758.60 → 1,871.77 (-32.1%)
    • Max CLV dropped from 10,092 → 6,852 (-32.1%)
    • Median CLV decreased from 2,126.50 → 1,451.00 (-31.8%)
    • Value diff indicates 98.81% mismatch (1,834 out of 1,856 records changed)
    • 5 new NULL values introduced in CLV column (0.27% of dataset)
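
The percentages in this list can be rechecked from the raw figures it quotes. The short sketch below does only that arithmetic; the helper name `pct_change` is an illustration, and every number is taken from the bullets above.

```python
def pct_change(base, current):
    """Relative change from base to current, in percent."""
    return (current - base) / base * 100


# Profile diff figures quoted above.
avg_change = round(pct_change(2758.60, 1871.77), 1)
max_change = round(pct_change(10092, 6852), 1)
median_change = round(pct_change(2126.50, 1451.00), 1)

# Value diff figures: 1,834 of 1,856 records changed, 5 new NULLs.
total_rows = 1856
mismatch_rate = round(1834 / total_rows * 100, 2)
match_rate = round((total_rows - 1834) / total_rows * 100, 2)
null_share = round(5 / total_rows * 100, 2)

print(avg_change, max_change, median_change)
print(mismatch_rate, match_rate, null_share)
```

The resulting match rate of 1.19% agrees with the Value Diff figure in the PR description.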

Impact Analysis

Lineage Analysis reveals the customers model is the change epicenter with 2 downstream dependencies:

graph LR
    stg_customers["stg_customers<br/>(view)"]:::unchanged
    stg_orders["stg_orders<br/>(view)"]:::unchanged
    stg_payments["stg_payments<br/>(view)"]:::unchanged
    customers["customers<br/>(table)"]:::modified
    customer_order_pattern["customer_order_pattern<br/>(table)"]:::impacted
    customer_segments["customer_segments<br/>(table)"]:::impacted

    stg_customers --> customers
    stg_orders --> customers
    stg_payments --> customers
    customers --> customer_order_pattern
    customers --> customer_segments

    classDef added fill:#d4edda,stroke:#28a745,color:#000000
    classDef removed fill:#f8d7da,stroke:#dc3545,color:#000000
    classDef modified fill:#fff3cd,stroke:#ffc107,color:#000000
    classDef impacted fill:#ffffff,stroke:#ffc107,color:#000000
    classDef unchanged fill:#ffffff,stroke:#d3d3d3,color:#999999
  • 🟨 Modified Model: customers - core change point for CLV logic
  • 🟨 Impacted Downstream: customer_order_pattern and customer_segments inherit the recalculated CLV values
  • Upstream Dependencies Unchanged: stg_customers, stg_orders, stg_payments remain stable

☑️ Checklist

| Name | Run Status | Impact Analysis |
|---|---|---|
| Row Count Diff | PASSED | ✅ Row counts stable at 1,856 (customers) and 280,844 (orders) - no data loss detected |
| Schema Diff | PASSED | ✅ All 7 columns in customers unchanged - no new/removed/modified columns |
| Query Diff - Customers AVG Lifetime Value | ⚠️ FAILED | ⚠️ 30-33% reduction across all weekly cohorts - expected due to filtering completed orders only |
| Value Diff - Customer Lifetime Value | 🔴 CRITICAL | 🔴 98.81% mismatch rate (1,834 of 1,856 records differ) - fundamental recalculation confirmed; 5 new NULL values introduced |

🔍 Suggested Actions

  • Validate CLV calculation logic: Review the SQL modification in models/customers.sql to confirm the "completed orders only" filter is correctly implemented and matches business requirements.
  • Investigate NULL values in customer_lifetime_value: Run query_diff on the 5 customers with new NULL CLV values to understand why the calculation failed for these specific records.
  • Verify downstream impacts: Confirm that customer_order_pattern and customer_segments produce expected results with the ~32% reduction in CLV values, particularly for business metrics and reports that depend on these models.
  • Review data quality in source: Run profile_diff on stg_orders to validate that the "completed" order status filtering logic correctly reflects the order data quality and completeness.
  • Test affected reports: Check dependent dashboards and reports using customer_order_pattern and customer_segments to ensure they display correct insights with the recalculated lifetime values.

Please use the link below to launch your Recce Cloud session.

Launch Recce Cloud Session
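
As a starting point for the NULL investigation suggested above, one possible approach is to list customers whose CLV is NULL in the current environment but was populated in the base environment. This is a sketch under stated assumptions: the table names `base_customers` and `current_customers` and the sample rows are hypothetical stand-ins for the two environments' customers tables, and `sqlite3` is used only so the example runs on its own.

```python
import sqlite3

# Customers whose CLV became NULL between the base and current builds.
NEW_NULLS_SQL = """
    SELECT cur.customer_id
    FROM current_customers AS cur
    JOIN base_customers AS base USING (customer_id)
    WHERE cur.customer_lifetime_value IS NULL
      AND base.customer_lifetime_value IS NOT NULL
    ORDER BY cur.customer_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_customers (customer_id INTEGER, customer_lifetime_value INTEGER);
    CREATE TABLE current_customers (customer_id INTEGER, customer_lifetime_value INTEGER);
    -- Customer 3 was already NULL in base, so it is not a new NULL.
    INSERT INTO base_customers VALUES (1, 150), (2, 40), (3, NULL);
    INSERT INTO current_customers VALUES (1, 100), (2, NULL), (3, NULL);
""")

new_null_ids = [row[0] for row in conn.execute(NEW_NULLS_SQL)]
print(new_null_ids)
```

A customer who has orders but no completed ones is one plausible source of such rows, though that would need to be confirmed against the real data.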


