feat: focused profiling — only profile changed columns#1

Open
affsantos wants to merge 1 commit into main from feat/focused-profiling

@affsantos (Owner)

Syncs focused profiling from the internal dbt repo. For wide models (100+ columns), profiles only columns that actually changed — ~90% reduction in BQ compute.

See finn-auto/dbt#7445 for the full context and test results.

For wide models (100+ columns), data-diff now profiles only the columns
that actually changed, using sqlglot AST analysis. This reduces BQ
compute by ~90% for typical changes to wide models.

How it works:
- parse_columns.py identifies added_columns + expression_changes
- is_cte_change_additive() classifies CTE modifications:
  - Additive (new LEFT JOINs, new columns) → safe for focused profiling
  - Structural (WHERE/filter/JOIN changes) → falls back to full profiling
- EXCEPT DISTINCT row comparison always covers ALL columns as a safety net
- --full flag to override and profile everything

Also syncs upstream improvements:
- data-diff.sh: get_affected_columns() helper, column filter in
  build_profile_query(), profiling mode tracking per model
- template.html: focused profiling banner, skip-row styling for
  non-profiled columns (shown as — instead of misleading zeros)
- SKILL.md: document --full flag and focused profiling behavior
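
To make the column filter and mode selection concrete, here is a rough Python sketch. The real logic lives in data-diff.sh and is not shown in this PR description, so everything here is an assumption: the names `sketch_profile_query` and `profiling_mode`, the specific profile metrics (null count, distinct count), and the exact 100-column cutoff.

```python
from typing import Iterable, Optional

# Taken from the "100+ columns" wording above; the exact cutoff is an assumption.
WIDE_MODEL_THRESHOLD = 100


def profiling_mode(n_columns: int, structural_change: bool, full_flag: bool) -> str:
    """Pick "focused" or "full" profiling for one model (hypothetical helper)."""
    if full_flag or structural_change or n_columns < WIDE_MODEL_THRESHOLD:
        return "full"
    return "focused"


def sketch_profile_query(
    table: str,
    columns: Iterable[str],
    affected: Optional[Iterable[str]] = None,
) -> str:
    """Build a BigQuery profile query, optionally limited to affected columns."""
    cols = list(columns)
    if affected is not None:
        wanted = set(affected)
        # Keep the model's column order, drop everything that did not change.
        cols = [c for c in cols if c in wanted]
    parts = ["COUNT(*) AS row_count"]
    for c in cols:
        parts.append(f"COUNTIF(`{c}` IS NULL) AS `{c}__nulls`")
        parts.append(f"COUNT(DISTINCT `{c}`) AS `{c}__distinct`")
    return "SELECT\n  " + ",\n  ".join(parts) + f"\nFROM `{table}`"


if __name__ == "__main__":
    print(sketch_profile_query("proj.ds.orders", ["id", "amount", "status"], ["status"]))
```

Columns left out of the query have no profile values at all, which is why the report shows them as skipped rather than as zeros.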
affsantos force-pushed the feat/focused-profiling branch from e434d44 to ad34060 on March 18, 2026 10:35