DataRecce/bauplan-recce-demo

Bauplan + Recce Demo: Instagram Engagement Segmentation

This repo demonstrates building a data pipeline with Bauplan and using Recce to compare pipeline changes across branches.

The demo walks through two stages of building an Instagram user engagement segmentation pipeline. Each stage lives on its own branch, and main holds the shared setup:

Branch    Description
main      Setup: data ingestion script and lineage tooling
stage-1   Naive 4-segment engagement pipeline
stage-2   Adds bot detection as a 5th segment

Prerequisites

  • Bauplan CLI installed and configured
  • Python 3.11+
  • A Bauplan account with access to the shared lakehouse

Setup

1. Ingest the source data

The ingestion script imports Instagram engagement data from S3 into the Bauplan lakehouse using the Write-Audit-Publish (WAP) pattern:

python ingest_instagram_data.py

This creates the bauplan.instagram_engagement_data table (~1.5M rows, 58 columns) on an isolated branch, validates it's non-empty, and leaves the branch open for inspection. Follow the printed instructions to merge to main.
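The Write-Audit-Publish flow described above can be sketched without any lakehouse at all. The snippet below is only an illustration: InMemoryCatalog and write_audit_publish are hypothetical stand-ins, not part of the Bauplan SDK or of ingest_instagram_data.py, and the "table" is a plain Python list.

```python
class InMemoryCatalog:
    """Hypothetical stand-in for a branch-aware catalog (not the Bauplan API)."""

    def __init__(self):
        self.branches = {"main": {}}

    def create_branch(self, name, from_ref="main"):
        # Copy the table mapping so writes stay isolated from the source branch.
        self.branches[name] = dict(self.branches[from_ref])

    def write_table(self, branch, table, rows):
        self.branches[branch][table] = list(rows)

    def row_count(self, branch, table):
        return len(self.branches[branch].get(table, []))

    def merge(self, source, into="main"):
        self.branches[into].update(self.branches[source])


def write_audit_publish(catalog, branch, table, rows, publish=False):
    # Write: load the data on an isolated branch.
    catalog.create_branch(branch)
    catalog.write_table(branch, table, rows)
    # Audit: fail before anything can reach main.
    if catalog.row_count(branch, table) == 0:
        raise ValueError(f"{table} is empty on branch {branch}")
    # Publish: optional, so the branch can stay open for inspection.
    if publish:
        catalog.merge(branch)
    return branch
```

With publish=False the data stays on the ingestion branch until catalog.merge is called explicitly, which mirrors the manual merge step the script asks for.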

2. Build the pipeline

Use the prompt below (or check out stage-1) to build the first version of the pipeline:


Prompt: Stage 1

Let's build a pipeline that calculates different user segments by engagement level. We want to segment users into tiers based on their engagement metrics from the instagram_engagement_data table. Materialize all models.
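As a rough picture of what a naive four-tier segmentation could look like, here is a minimal sketch in plain Python. The engagement_rate input, the tier names, and the cut-offs are all assumptions made for illustration; they are not taken from the stage-1 branch or from the columns of instagram_engagement_data.

```python
from collections import Counter

# Hypothetical cut-offs, ordered from highest tier to lowest.
TIERS = [
    (0.10, "high"),
    (0.05, "medium"),
    (0.01, "low"),
]


def segment_user(engagement_rate):
    """Return the first tier whose threshold the rate meets, else 'dormant'."""
    for threshold, name in TIERS:
        if engagement_rate >= threshold:
            return name
    return "dormant"


def segment_counts(rates):
    """Count how many users fall into each of the four tiers."""
    return Counter(segment_user(rate) for rate in rates)
```

A fifth segment such as the stage-2 bot tier would need a check that runs before the threshold loop, since a fixed engagement cut-off alone cannot tell heavy users from automated ones.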

Lineage Tools

The scripts/ directory contains tools for generating and validating column-level lineage metadata:

  • scripts/generate_lineage.py — Reads pipeline source and calls Claude to extract lineage JSON
  • scripts/validate_lineage.py — Validates lineage JSON structure and checks against live Bauplan schemas
  • prompts/lineage_prompt.md — The LLM prompt template used for lineage extraction
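The two kinds of check that validation involves (structure first, then column existence) can be sketched as follows. The lineage shape used here, a mapping of model to column to a list of {"table", "column"} sources, is an assumption for illustration and may differ from what scripts/validate_lineage.py expects. The schemas argument is a plain dict standing in for schemas fetched from Bauplan.

```python
def validate_lineage(lineage, schemas):
    """Return a list of error strings; an empty list means the lineage passed.

    lineage: {model: {column: [{"table": str, "column": str}, ...]}}  (assumed shape)
    schemas: {table: set of column names}  (stand-in for live schemas)
    """
    if not isinstance(lineage, dict):
        return ["lineage must be a JSON object"]
    errors = []
    for model, columns in lineage.items():
        if not isinstance(columns, dict):
            errors.append(f"{model}: expected an object of columns")
            continue
        for column, sources in columns.items():
            if not isinstance(sources, list):
                errors.append(f"{model}.{column}: expected a list of sources")
                continue
            for src in sources:
                # Structural check: each source names a table and a column.
                table = src.get("table") if isinstance(src, dict) else None
                col = src.get("column") if isinstance(src, dict) else None
                if table is None or col is None:
                    errors.append(f"{model}.{column}: source needs 'table' and 'column'")
                # Schema check: the referenced column must exist upstream.
                elif col not in schemas.get(table, ()):
                    errors.append(f"{model}.{column}: unknown source {table}.{col}")
    return errors
```

Returning a list of errors rather than raising on the first one makes it easier to review everything an LLM-generated lineage file got wrong in a single pass.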
