selimsevim/SFMC_NBA

Building a Next Best Action Engine for SFMC Using DigitalOcean Droplets (via SFTP Integration)

A production-ready, open-source Next Best Action (NBA) engine for Salesforce Marketing Cloud (SFMC).
It ingests daily CSV exports (via SFTP file drop), learns/updates lightweight models, and returns per-contact action scores + recommendations—no APIs, no middleware, no black-box magic.

  • Stack: Python, LightGBM, Pandas, Paramiko

  • Infra: DigitalOcean Droplet (or any VM), SFMC SFTP

  • I/O: CSV in/out (in: Export/sfmc_nba, out: Import/sfmc_nba)

  • Mode: Auto-train on first run, then score daily (cron)

Why this exists

Years of “manual uploads + static journeys” inspired a simple, transparent NBA loop:

  1. SFMC exports 3 DEs (CustomerActivity, EmailEngagement, ProductCatalog)

  2. Engine trains/scores on a droplet (auto-train if no model yet)

  3. Engine returns 4 files to SFMC (affinity, engagement, action scores, recommendations)

  4. Journeys/Decision Splits use the latest, data-driven “next best” for each contact

File drop > APIs for this use case: predictable, auditable, cost-efficient at high volume.

Key features

  • Auto-train on first run (bootstraps a send log from the 3 DEs if you don’t have one)

  • Per-family models (PRODUCT / CATEGORY / etc.) with fallback to calibrated priors

  • “StrengthScore” blends predicted CTR with uncertainty and your Aggressive/Chill tone

  • Self-healing loop: fetch → (train if needed) → score → upload → repeat

  • Fully transparent Python code—tune weights, rules, thresholds anytime
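The fetch → (train if needed) → score → upload loop can be sketched as below. This is a minimal illustration, not the code in run_job.py: `has_models`, `run_once`, and the four callables are hypothetical names, and the only behavior taken from this README is that training runs when artifacts/ is empty.

```python
from pathlib import Path


def has_models(artifacts_dir):
    """True if the artifacts directory exists and contains at least one entry."""
    path = Path(artifacts_dir)
    return path.is_dir() and any(path.iterdir())


def run_once(fetch, train, score, upload, artifacts_dir):
    """Run one cycle; returns True if training happened in this cycle.

    fetch/train/score/upload are caller-supplied callables (hypothetical
    stand-ins for the SFTP download, model training, scoring, and SFTP upload).
    """
    inputs = fetch()
    trained = False
    if not has_models(artifacts_dir):
        # First run (or artifacts were wiped): train before scoring.
        train(inputs)
        trained = True
    outputs = score(inputs)
    upload(outputs)
    return trained
```

Because the training decision depends only on the state of artifacts/, deleting that folder is enough to force a retrain on the next scheduled run under this sketch.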

Files

├─ run_job.py              # Orchestrator: SFTP in/out, auto-train, scoring, uploads
├─ nba_pipeline.py         # Data prep, candidate gen, modeling (LightGBM), outputs
├─ actions_catalog.csv     # (sample) Action inventory: ActionID, Channel, ActionType, TopicType
├─ in/                     # Local scratch for downloaded CSVs
├─ out/                    # Engine outputs written here before SFTP upload
├─ artifacts/              # Saved models / priors (auto-created)
└─ README.md               # This file

Expected SFMC data extensions (input)

CustomerActivity_DE.csv

SubscriberKey, ClickDate, PurchaseDate, ProductID, ProductName, Category, Quantity, PurchaseAmount, SpentTotal, Source

Source: Imported from Google Analytics, Adobe Analytics, or any custom web tracking export via SFTP.

EmailEngagement_DE.csv

SubscriberKey, EmailName, SubjectLine, SendDate, OpenDate, ClickDate, Opened, Clicked, URLClicked, CampaignName, Device, Source

Source: Pulled directly from SFMC tracking data using Data Views or Query Activities.

ProductCatalog_DE.csv

ProductID, ProductName, Category, Brand, Description, Price, Currency, ImageURL, ProductURL, InStock, Tags, LastUpdated, Source

Source: Synchronized from your website’s API, e-commerce product feed, or imported manually into SFMC on a scheduled basis.
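Before training or scoring, it can help to confirm that each export carries the columns listed above. The sketch below checks only the header row using the standard library; the column lists are copied from this README, while `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names rather than part of nba_pipeline.py.

```python
import csv
import io

# Column lists copied from the input DE descriptions in this README.
REQUIRED_COLUMNS = {
    "CustomerActivity": (
        "SubscriberKey, ClickDate, PurchaseDate, ProductID, ProductName, "
        "Category, Quantity, PurchaseAmount, SpentTotal, Source"
    ).split(", "),
    "EmailEngagement": (
        "SubscriberKey, EmailName, SubjectLine, SendDate, OpenDate, ClickDate, "
        "Opened, Clicked, URLClicked, CampaignName, Device, Source"
    ).split(", "),
    "ProductCatalog": (
        "ProductID, ProductName, Category, Brand, Description, Price, Currency, "
        "ImageURL, ProductURL, InStock, Tags, LastUpdated, Source"
    ).split(", "),
}


def missing_columns(stem, csv_text):
    """Return the required columns that are absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    # Strip a UTF-8 byte-order mark and stray whitespace from header names.
    present = {name.lstrip("\ufeff").strip() for name in header}
    return [col for col in REQUIRED_COLUMNS[stem] if col not in present]
```

An empty result means the header is complete; anything else is a list of column names to fix in the SFMC export before the job runs.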

Expected SFMC data extensions (output)

ContactAffinity_DE.csv

SubscriberKey, Category, ProductID, AffinityScore, IntentType, LastIntentAt, PriceBand, Confidence

Source: Generated by the NBA engine using time-decayed signals from CustomerActivity_DE joined with ProductCatalog_DE.
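One common way to implement time-decayed signals is an exponential half-life, sketched below. The parameter names half_life_days, w_click, and w_purchase match the configuration knobs listed later in this README, but the default values, the function names, and the exact formula are assumptions for illustration; the engine's own computation may differ.

```python
from datetime import datetime, timezone


def decayed_weight(event_time, now, half_life_days):
    """Weight that halves every `half_life_days`; future events count as age 0."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    age_days = max((now - event_time).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def affinity_score(events, now, half_life_days=14.0, w_click=1.0, w_purchase=3.0):
    """Sum decayed weights over (kind, timestamp) pairs, kind in {'click', 'purchase'}.

    The default values here are illustrative assumptions, not the repo's settings.
    """
    base = {"click": w_click, "purchase": w_purchase}
    return sum(base[kind] * decayed_weight(ts, now, half_life_days) for kind, ts in events)


now = datetime(2024, 1, 15, tzinfo=timezone.utc)
events = [
    ("click", datetime(2024, 1, 1, tzinfo=timezone.utc)),  # 14 days old: weight 0.5
    ("purchase", now),                                      # fresh: weight 1.0
]
score = affinity_score(events, now)
```

With a 14-day half-life, a click from two weeks ago contributes half as much as a click made today, so recent intent dominates the score.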

ContactEngagement_DE.csv

SubscriberKey, LastOpenAt, LastClickAt, EmailsSent30d, Opens30d, Clicks30d, OpenRate30d, ClickRate30d, EngagementStage, SendCapToday, PreferredDevice

Source: Generated by the NBA engine from EmailEngagement_DE (30-day rollups + stage logic).
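The 30-day rollups and stage logic might look like the sketch below. The stage names (New, Active, Cooling, Dormant) and the knob names welcome_new_days, cooling_no_click_days, and dormant_no_engage_days come from this README; the default thresholds, the order of the checks, and the function names are assumptions.

```python
def rate(numerator, denominator):
    """Safe ratio for OpenRate30d / ClickRate30d; 0.0 when nothing was sent."""
    return numerator / denominator if denominator else 0.0


def engagement_stage(
    days_since_first_send,
    days_since_last_click,
    days_since_last_engage,
    welcome_new_days=14,
    cooling_no_click_days=30,
    dormant_no_engage_days=90,
):
    """Classify a contact; None means the event never happened.

    Threshold defaults are illustrative assumptions, not the repo's values.
    """
    if days_since_first_send is not None and days_since_first_send <= welcome_new_days:
        return "New"
    if days_since_last_engage is None or days_since_last_engage >= dormant_no_engage_days:
        return "Dormant"
    if days_since_last_click is None or days_since_last_click >= cooling_no_click_days:
        return "Cooling"
    return "Active"
```

In this sketch a contact who still opens but has not clicked for 30 days is Cooling, while one with no opens or clicks for 90 days is Dormant.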

NBA_ActionScores_DE.csv

SubscriberKey, ActionID, ActionType, Channel, TopicType, TopicKey, ProductID, Category, PredictedProb, Lower, Upper, StrengthScore, StrengthBucket, ComputedAtUTC, ExpiryAtUTC

Source: Generated by the NBA engine (LightGBM ensemble or priors fallback) scoring per-subscriber action candidates.
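For the priors fallback, one plausible approach is a Beta-Binomial estimate: historical clicks and sends update a prior click rate, and the posterior spread supplies Lower and Upper. The sketch below uses a normal approximation to the Beta posterior to stay within the standard library. The prior (1 click in 20 sends), the function name, and the interval method are all assumptions, not the engine's documented behavior.

```python
import math


def prior_estimate(clicks, sends, prior_clicks=1.0, prior_non_clicks=19.0, z=1.96):
    """Return (PredictedProb, Lower, Upper) from a Beta posterior.

    The interval is a normal approximation (mean +/- z * sd), clipped to [0, 1].
    Prior values are illustrative assumptions.
    """
    if clicks < 0 or sends < clicks:
        raise ValueError("need 0 <= clicks <= sends")
    alpha = prior_clicks + clicks
    beta = prior_non_clicks + (sends - clicks)
    total = alpha + beta
    mean = alpha / total
    sd = math.sqrt(alpha * beta / (total * total * (total + 1.0)))
    lower = max(0.0, mean - z * sd)
    upper = min(1.0, mean + z * sd)
    return mean, lower, upper


cold_start = prior_estimate(0, 0)       # no history: falls back to the prior rate
with_history = prior_estimate(10, 100)  # more data narrows the interval
```

The useful property is that a subscriber or action with no history still gets a sensible probability, and the interval tightens as evidence accumulates.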

NBA_Recommendations_DE.csv

SubscriberKey, ActionType, PrimaryProductID, BackupProductID, PrimaryCategory, CreativeTheme, ReasonCode, StrengthScore, PredictedProb, Lower, Upper, ExpiryAt, LastUpdatedAt

Source: Generated by the NBA engine from NBA_ActionScores_DE (top-N selection + human-friendly fields for journeys).
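The top-N selection step can be sketched with plain dictionaries as below. Ranking by StrengthScore and the name nba_top_n follow this README; the function name, the default of 3, and the tie handling (input order is preserved) are assumptions. The same logic ports directly to a Pandas sort plus groupby.

```python
from collections import defaultdict


def top_n_per_subscriber(score_rows, nba_top_n=3):
    """Keep the nba_top_n highest-StrengthScore rows for each SubscriberKey."""
    by_subscriber = defaultdict(list)
    for row in score_rows:
        by_subscriber[row["SubscriberKey"]].append(row)

    selected = []
    for key in sorted(by_subscriber):
        # sorted() is stable, so equal scores keep their input order.
        ranked = sorted(
            by_subscriber[key],
            key=lambda r: float(r["StrengthScore"]),
            reverse=True,
        )
        selected.extend(ranked[:nba_top_n])
    return selected


rows = [
    {"SubscriberKey": "A", "ActionID": "cart", "StrengthScore": "81.0"},
    {"SubscriberKey": "A", "ActionID": "PRODUCT", "StrengthScore": "64.5"},
    {"SubscriberKey": "A", "ActionID": "reengage", "StrengthScore": "12.0"},
    {"SubscriberKey": "B", "ActionID": "CATEGORY", "StrengthScore": "40.0"},
]
best = top_n_per_subscriber(rows, nba_top_n=2)
```

Scores are cast with float() because values read from CSV arrive as strings.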

Actions catalog (reference)

actions_catalog.csv (minimum required):

| Column | Example | Notes |
| --- | --- | --- |
| ActionID | PRODUCT, CATEGORY, cart, reengage | Used as “family” for modeling/fallback |
| Channel | Email | Free text (Email/SMS/etc.) |
| ActionType | Journey, Campaign | Your taxonomy (Email/SMS/etc.) |
| TopicType | Product / Category / Product_or_Category / NONE | Drives candidate generation |

Special ActionID handling: cart, price_drop, back_in, reengage, loyalty, holdout (graceful no-op if needed inputs aren’t present).
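To illustrate how TopicType can drive candidate generation, the sketch below expands each catalog row into per-subscriber candidates. The four TopicType values come from the table above; everything else is an assumption, including the function name, the rule that Product_or_Category prefers products and falls back to categories, and the use of an empty TopicKey for NONE.

```python
def build_candidates(actions, top_products, top_categories):
    """Expand catalog actions into candidate rows for one subscriber.

    actions: dicts with at least ActionID and TopicType (as in actions_catalog.csv).
    top_products / top_categories: the subscriber's highest-affinity keys.
    """
    candidates = []
    for action in actions:
        topic = action["TopicType"].strip().lower()
        if topic == "product":
            keys = [("Product", p) for p in top_products]
        elif topic == "category":
            keys = [("Category", c) for c in top_categories]
        elif topic == "product_or_category":
            # Assumed rule: prefer products, fall back to categories.
            keys = [("Product", p) for p in top_products] or [
                ("Category", c) for c in top_categories
            ]
        else:
            # NONE (or unknown): a single candidate with no topic key.
            keys = [("NONE", "")]
        for topic_type, topic_key in keys:
            candidates.append(
                {"ActionID": action["ActionID"], "TopicType": topic_type, "TopicKey": topic_key}
            )
    return candidates


catalog = [
    {"ActionID": "PRODUCT", "TopicType": "Product"},
    {"ActionID": "cart", "TopicType": "Product_or_Category"},
    {"ActionID": "reengage", "TopicType": "NONE"},
]
cands = build_candidates(catalog, top_products=[], top_categories=["Shoes"])
```

Note that a Product action yields no candidates when the subscriber has no product affinity, which is one way an "empty candidates" result can arise.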

Quick start

1) Install (Python 3.10+ recommended)

pip install -r requirements.txt
# If the repo has no requirements.txt, install the essentials:
pip install pandas numpy lightgbm scikit-learn paramiko

2) Configure environment

Set SFTP + paths (env vars or a .env you export before running):

SFTP_HOST=*****.ftp.marketingcloudops.com
SFTP_PORT=22
SFTP_USER=*****
SFTP_PASSWORD=*****
SFTP_INBOUND_PATH=Export/sfmc_nba
SFTP_OUTBOUND_PATH=Import/sfmc_nba
EMAIL_STEM=EmailEngagement
CUST_STEM=CustomerActivity
PROD_STEM=ProductCatalog
IN_DIR=/opt/nba/in
OUT_DIR=/opt/nba/out
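A small loader makes missing settings fail fast instead of surfacing later as an SFTP error. The sketch below reads the variables shown above; `JobConfig` and `load_config` are hypothetical names, and treating the host, user, and password as required while defaulting the rest to the example values above is an assumption about how run_job.py behaves.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class JobConfig:
    sftp_host: str
    sftp_port: int
    sftp_user: str
    sftp_password: str
    inbound_path: str
    outbound_path: str
    email_stem: str
    cust_stem: str
    prod_stem: str
    in_dir: str
    out_dir: str


def load_config(env=None):
    """Build a JobConfig from a mapping (defaults to os.environ).

    Raises KeyError if SFTP_HOST, SFTP_USER, or SFTP_PASSWORD is missing.
    """
    env = os.environ if env is None else env
    return JobConfig(
        sftp_host=env["SFTP_HOST"],
        sftp_port=int(env.get("SFTP_PORT", "22")),
        sftp_user=env["SFTP_USER"],
        sftp_password=env["SFTP_PASSWORD"],
        inbound_path=env.get("SFTP_INBOUND_PATH", "Export/sfmc_nba"),
        outbound_path=env.get("SFTP_OUTBOUND_PATH", "Import/sfmc_nba"),
        email_stem=env.get("EMAIL_STEM", "EmailEngagement"),
        cust_stem=env.get("CUST_STEM", "CustomerActivity"),
        prod_stem=env.get("PROD_STEM", "ProductCatalog"),
        in_dir=env.get("IN_DIR", "/opt/nba/in"),
        out_dir=env.get("OUT_DIR", "/opt/nba/out"),
    )
```

Passing a plain dict instead of os.environ keeps the loader easy to test without touching the real environment.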

3) First run (auto-train if no models)

`python run_job.py`

What happens:

  • Downloads the latest YYYY-MM-DD-<Stem>.csv for each input stem (EmailEngagement, CustomerActivity, ProductCatalog)

  • Pulls actions_catalog.csv (local or remote)

  • Auto-trains models if artifacts/ is empty (bootstraps a send log when needed)

  • Scores and writes in out/:

    • ContactAffinity_DE.csv

    • ContactEngagement_DE.csv

    • NBA_ActionScores_DE.csv

    • NBA_Recommendations_DE.csv

  • Uploads all outputs to Import/sfmc_nba on SFMC SFTP
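Picking "the latest" file for a stem can be done by parsing the date prefix from each filename, as in the sketch below. The YYYY-MM-DD-<Stem>.csv pattern is the one this README names; `latest_file_for_stem` is a hypothetical helper, and it works on a plain list of names such as the result of an SFTP directory listing.

```python
import re
from datetime import date


def latest_file_for_stem(filenames, stem):
    """Return the filename with the newest YYYY-MM-DD prefix for the given stem."""
    pattern = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-" + re.escape(stem) + r"\.csv$")
    best = None
    for name in filenames:
        match = pattern.match(name)
        if not match:
            continue
        try:
            file_date = date(*(int(part) for part in match.groups()))
        except ValueError:
            # Looks like a date but is not a real one (e.g. 2024-13-40).
            continue
        if best is None or file_date > best[0]:
            best = (file_date, name)
    if best is None:
        raise FileNotFoundError(f"No files matching '{stem}'")
    return best[1]


listing = [
    "2024-05-01-EmailEngagement.csv",
    "2024-05-03-EmailEngagement.csv",
    "2024-05-02-CustomerActivity.csv",
    "2024-13-40-EmailEngagement.csv",
    "notes.txt",
]
newest = latest_file_for_stem(listing, "EmailEngagement")
```

Comparing parsed dates rather than raw strings means malformed prefixes are skipped instead of silently winning the sort.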

Automation (cron on the droplet)

Run daily at 06:00 UTC (cron uses the droplet's system time zone, so this assumes the droplet is set to UTC):

crontab -e
# ─────────────────────────────────────────
0 6 * * * cd /opt/sfmc-nba && /usr/bin/env -S bash -lc 'source ~/.profile && python run_job.py >> logs/cron.log 2>&1'

How the model chooses

  • Affinity: time-decayed clicks/purchases → category/product confidence

  • Engagement: recent open/click rates + stage (New, Active, Cooling, Dormant)

  • Candidates: built from your actions_catalog per TopicType

  • Score: LightGBM ensemble per ActionID (or Bayesian priors fallback)

  • StrengthScore (0–100): blends predicted prob + uncertainty + tone

  • Top-N per subscriber returned for easy plugging into Journeys
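As a rough illustration of how a 0–100 StrengthScore could blend probability, uncertainty, and tone, the sketch below interpolates between the lower and upper bounds according to tone_alpha and averages that with the point prediction. Only the name tone_alpha and the 0–100 range come from this README. The formula, the `max_prob` scaling cap, the bucket labels, and the bucket thresholds are assumptions and should not be read as the engine's actual logic.

```python
def strength_score(predicted_prob, lower, upper, tone_alpha=0.5, max_prob=0.2):
    """Blend prediction and interval into a 0-100 score.

    tone_alpha = 0.0 is the most "Chill" (leans on the lower bound),
    tone_alpha = 1.0 is the most "Aggressive" (leans on the upper bound).
    max_prob is a hypothetical probability that maps to a score of 100.
    """
    if not 0.0 <= tone_alpha <= 1.0:
        raise ValueError("tone_alpha must be in [0, 1]")
    if max_prob <= 0:
        raise ValueError("max_prob must be positive")
    anchor = lower + tone_alpha * (upper - lower)
    blended = 0.5 * predicted_prob + 0.5 * anchor
    scaled = min(max(blended / max_prob, 0.0), 1.0)
    return round(100.0 * scaled, 1)


def strength_bucket(score, high=70.0, medium=40.0):
    """Map a score to a coarse bucket; thresholds are illustrative."""
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    return "Low"


chill = strength_score(0.10, 0.05, 0.15, tone_alpha=0.0)
aggressive = strength_score(0.10, 0.05, 0.15, tone_alpha=1.0)
```

Under this sketch the same prediction scores lower with a Chill tone than with an Aggressive one, and a wider interval makes the tone setting matter more.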

SFMC wiring tips

  • Point an Automation Studio workflow to export the 3 DEs daily to Export/sfmc_nba

  • Create 4 Import activities for the returning CSVs into:

    • ContactAffinity_DE

    • ContactEngagement_DE

    • NBA_ActionScores_DE

    • NBA_Recommendations_DE

  • Use Decision Splits / Data Views / AMPscript to pick PrimaryProductID/PrimaryCategory and creative theme


Configuration knobs (in nba_pipeline.py)

  • Decay/weights: half_life_days, w_click, w_purchase

  • Engagement stages: welcome_new_days, cooling_no_click_days, dormant_no_engage_days

  • Output: nba_top_n, nba_expiry_hours

  • Tone handling: tone_alpha and strength thresholds


Security & governance

  • SFTP key auth preferred; restrict to required folders

  • No PII beyond keys needed for joins (SubscriberKey)

  • Align with your internal data retention and DPA requirements


Troubleshooting

  • “No files matching ‘EmailEngagement’”: ensure filenames follow YYYY-MM-DD-<Stem>.csv and env stems match (EMAIL_STEM, CUST_STEM, PROD_STEM).

  • Auto-train failed: product URLs missing or don’t map to ProductID—add ProductURL or tweak bootstrap logic.

  • Empty candidates: check actions_catalog.csv TopicType values and that affinity isn’t empty.


License

MIT (feel free to use/extend internally or with clients).


Citation / blog

This repo accompanies the LinkedIn article “Building a Next Best Action Engine for SFMC Using DigitalOcean Droplets (via SFTP Integration)”. https://www.linkedin.com/pulse/building-next-best-action-engine-sfmc-using-droplets-via-selim-sevim-pr0gf/

About

AI-driven Next Best Action engine for SFMC — simple, transparent, and SFTP-based.
