A production-ready, open-source Next Best Action (NBA) engine for Salesforce Marketing Cloud (SFMC).
It ingests daily CSV exports (via SFTP file drop), learns/updates lightweight models, and returns per-contact action scores + recommendations—no APIs, no middleware, no black-box magic.
- Stack: Python, LightGBM, Pandas, Paramiko
- Infra: DigitalOcean Droplet (or any VM), SFMC SFTP
- I/O: CSV in/out (`Export/sfmc_nba` → `Import/sfmc_nba`)
- Mode: Auto-train on first run, then score daily (cron)
Years of “manual uploads + static journeys” inspired a simple, transparent NBA loop:
- SFMC exports 3 DEs (CustomerActivity, EmailEngagement, ProductCatalog)
- Engine trains/scores on a droplet (auto-train if no model yet)
- Engine returns 4 files to SFMC (affinity, engagement, action scores, recommendations)
- Journeys/Decision Splits use the latest, data-driven “next best” for each contact
File drop > APIs for this use case: predictable, auditable, cost-efficient at high volume.
- Auto-train on first run (bootstraps a send log from the 3 DEs if you don’t have one)
- Per-family models (PRODUCT / CATEGORY / etc.) with fallback to calibrated priors
- “StrengthScore” blends predicted CTR with uncertainty and your Aggressive/Chill tone
- Self-healing loop: fetch → (train if needed) → score → upload → repeat
- Fully transparent Python code: tune weights, rules, thresholds anytime
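The fetch → (train if needed) → score → upload loop can be sketched as below. This is a minimal illustration, not the code in `run_job.py`: the function name `run_once` and the four injected callables are hypothetical, and "no model yet" is assumed to mean a missing or empty `artifacts/` directory.

```python
from pathlib import Path
from typing import Callable, List


def run_once(
    artifacts_dir: Path,
    fetch: Callable[[], None],
    train: Callable[[], None],
    score: Callable[[], None],
    upload: Callable[[], None],
) -> List[str]:
    """Run one cycle and return the names of the steps that executed."""
    steps: List[str] = []

    fetch()
    steps.append("fetch")

    # Assumption: a missing or empty artifacts directory means "no model yet".
    has_model = artifacts_dir.is_dir() and any(artifacts_dir.iterdir())
    if not has_model:
        train()
        steps.append("train")

    score()
    steps.append("score")

    upload()
    steps.append("upload")
    return steps
```

Passing the steps in as callables keeps the control flow testable without an SFTP server; cron then supplies the "repeat".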
├─ run_job.py # Orchestrator: SFTP in/out, auto-train, scoring, uploads
├─ nba_pipeline.py # Data prep, candidate gen, modeling (LightGBM), outputs
├─ actions_catalog.csv # (sample) Action inventory: ActionID, Channel, ActionType, TopicType
├─ in/ # Local scratch for downloaded CSVs
├─ out/ # Engine outputs written here before SFTP upload
├─ artifacts/ # Saved models / priors (auto-created)
└─ README.md # This file
CustomerActivity_DE (input)
Columns: SubscriberKey, ClickDate, PurchaseDate, ProductID, ProductName, Category, Quantity, PurchaseAmount, SpentTotal, Source
Source: Imported from Google Analytics, Adobe Analytics, or any custom web tracking export via SFTP.
EmailEngagement_DE (input)
Columns: SubscriberKey, EmailName, SubjectLine, SendDate, OpenDate, ClickDate, Opened, Clicked, URLClicked, CampaignName, Device, Source
Source: Pulled directly from SFMC tracking data using Data Views or Query Activities.
ProductCatalog_DE (input)
Columns: ProductID, ProductName, Category, Brand, Description, Price, Currency, ImageURL, ProductURL, InStock, Tags, LastUpdated, Source
Source: Synchronized from your website’s API or e-commerce product feed, or imported manually into SFMC on a schedule.
ContactAffinity_DE (output)
Columns: SubscriberKey, Category, ProductID, AffinityScore, IntentType, LastIntentAt, PriceBand, Confidence
Source: Generated by the NBA engine using time-decayed signals from CustomerActivity_DE joined with ProductCatalog_DE.
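As a rough illustration of "time-decayed signals", the sketch below applies exponential half-life decay to click and purchase events. The parameter names `half_life_days`, `w_click` and `w_purchase` come from this README's tuning list; the default values, the event tuple layout and the function name `affinity_scores` are assumptions, and the real `nba_pipeline.py` may compute affinity differently.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

Event = Tuple[str, str, str, datetime]  # (SubscriberKey, Category, kind, timestamp)


def affinity_scores(
    events: Iterable[Event],
    now: datetime,
    half_life_days: float = 14.0,  # assumed default
    w_click: float = 1.0,          # assumed default
    w_purchase: float = 3.0,       # assumed default
) -> Dict[Tuple[str, str], float]:
    """Sum decayed event weights per (SubscriberKey, Category)."""
    weights = {"click": w_click, "purchase": w_purchase}
    scores: Dict[Tuple[str, str], float] = defaultdict(float)
    for key, category, kind, ts in events:
        age_days = max((now - ts).total_seconds() / 86400.0, 0.0)
        # An event loses half of its weight every `half_life_days`.
        decay = 0.5 ** (age_days / half_life_days)
        scores[(key, category)] += weights[kind] * decay
    return dict(scores)
```

With these defaults, a click today contributes 1.0 and a purchase 14 days ago contributes 1.5, so the pair scores 2.5 for that subscriber and category.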
ContactEngagement_DE (output)
Columns: SubscriberKey, LastOpenAt, LastClickAt, EmailsSent30d, Opens30d, Clicks30d, OpenRate30d, ClickRate30d, EngagementStage, SendCapToday, PreferredDevice
Source: Generated by the NBA engine from EmailEngagement_DE (30-day rollups + stage logic).
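One possible reading of the "stage logic" is sketched below. The stage names (New, Active, Cooling, Dormant) and the threshold names `welcome_new_days`, `cooling_no_click_days` and `dormant_no_engage_days` appear in this README; the default values, the order in which the rules are checked, and the `first_seen` input are assumptions made for the example.

```python
from datetime import datetime
from typing import Optional


def rate(numerator: int, denominator: int) -> float:
    """Safe ratio for OpenRate30d / ClickRate30d; 0.0 when nothing was sent."""
    return numerator / denominator if denominator else 0.0


def engagement_stage(
    first_seen: datetime,
    last_open: Optional[datetime],
    last_click: Optional[datetime],
    now: datetime,
    welcome_new_days: int = 14,        # assumed default
    cooling_no_click_days: int = 30,   # assumed default
    dormant_no_engage_days: int = 90,  # assumed default
) -> str:
    # Recently acquired contacts stay "New" regardless of activity.
    if (now - first_seen).days <= welcome_new_days:
        return "New"
    # No open or click at all, or none for a long time: "Dormant".
    touches = [ts for ts in (last_open, last_click) if ts is not None]
    if not touches or (now - max(touches)).days >= dormant_no_engage_days:
        return "Dormant"
    # Still opening but not clicking recently: "Cooling".
    if last_click is None or (now - last_click).days >= cooling_no_click_days:
        return "Cooling"
    return "Active"
```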
NBA_ActionScores_DE (output)
Columns: SubscriberKey, ActionID, ActionType, Channel, TopicType, TopicKey, ProductID, Category, PredictedProb, Lower, Upper, StrengthScore, StrengthBucket, ComputedAtUTC, ExpiryAtUTC
Source: Generated by the NBA engine (LightGBM ensemble or priors fallback) scoring per-subscriber action candidates.
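The README does not spell out how the priors fallback produces `PredictedProb`, `Lower` and `Upper`. One common approach, shown here purely as an assumption, is a Beta-Binomial estimate: a prior click rate is blended with observed clicks and sends, and the interval comes from a normal approximation to the Beta posterior. The names `prior_rate` and `prior_strength` and their defaults are hypothetical.

```python
import math
from typing import Tuple


def prior_fallback(
    clicks: int,
    sends: int,
    prior_rate: float = 0.02,      # assumed baseline click rate
    prior_strength: float = 50.0,  # assumed weight of the prior, in pseudo-sends
    z: float = 1.96,               # roughly a 95% interval
) -> Tuple[float, float, float]:
    """Return (PredictedProb, Lower, Upper) from a Beta posterior."""
    alpha = prior_rate * prior_strength + clicks
    beta = (1.0 - prior_rate) * prior_strength + (sends - clicks)
    total = alpha + beta
    mean = alpha / total
    # Variance of a Beta(alpha, beta) distribution.
    variance = alpha * beta / (total ** 2 * (total + 1.0))
    half_width = z * math.sqrt(variance)
    return mean, max(0.0, mean - half_width), min(1.0, mean + half_width)
```

With no history the estimate equals the prior rate and the interval is wide; as sends accumulate, the interval narrows and the observed rate dominates.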
NBA_Recommendations_DE (output)
Columns: SubscriberKey, ActionType, PrimaryProductID, BackupProductID, PrimaryCategory, CreativeTheme, ReasonCode, StrengthScore, PredictedProb, Lower, Upper, ExpiryAt, LastUpdatedAt
Source: Generated by the NBA engine from NBA_ActionScores_DE (top-N selection + human-friendly fields for journeys).
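Top-N selection from the action scores could look like the sketch below. It assumes each score row is a dict carrying at least `SubscriberKey` and `StrengthScore`, and that ranking is by `StrengthScore` descending; the function name is hypothetical, while `nba_top_n` is the setting named in this README.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def top_n_per_subscriber(
    score_rows: Iterable[dict], nba_top_n: int = 3
) -> Dict[str, List[dict]]:
    """Keep the `nba_top_n` highest-StrengthScore rows for each subscriber."""
    by_key: Dict[str, List[dict]] = defaultdict(list)
    for row in score_rows:
        by_key[row["SubscriberKey"]].append(row)
    return {
        key: sorted(rows, key=lambda r: r["StrengthScore"], reverse=True)[:nba_top_n]
        for key, rows in by_key.items()
    }
```

The first row per subscriber would then feed the primary fields (for example `PrimaryProductID`) and the second row the backup.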
actions_catalog.csv (minimum required):
| Column | Example | Notes |
|---|---|---|
| ActionID | PRODUCT, CATEGORY, cart, reengage | Used as “family” for modeling/fallback |
| Channel | Email, SMS | Free text |
| ActionType | Journey, Campaign | Your own taxonomy |
| TopicType | Product / Category / Product_or_Category / NONE | Drives candidate generation |
Special ActionID handling: cart, price_drop, back_in, reengage, loyalty, holdout (graceful no-op if needed inputs aren’t present).
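To show how `TopicType` might drive candidate generation, here is a small sketch. It assumes that `Product` expands to a subscriber's top products, `Category` to their top categories, `Product_or_Category` prefers products and falls back to categories, and `NONE` yields a single topic-less candidate. That reading, and the function name `build_candidates`, are assumptions rather than the engine's documented behavior.

```python
from typing import Dict, List, Sequence, Tuple


def build_candidates(
    subscriber_key: str,
    actions: Sequence[Dict[str, str]],
    top_products: Sequence[str],
    top_categories: Sequence[str],
) -> List[Dict[str, str]]:
    """Expand each catalog action into per-topic candidates for one subscriber."""
    candidates: List[Dict[str, str]] = []
    for action in actions:
        topic_type = action["TopicType"].strip().lower()
        topics: List[Tuple[str, str]]
        if topic_type == "none":
            topics = [("NONE", "")]
        elif topic_type == "product":
            topics = [("Product", p) for p in top_products]
        elif topic_type == "category":
            topics = [("Category", c) for c in top_categories]
        elif topic_type == "product_or_category":
            # Assumed rule: use products when available, else categories.
            topics = [("Product", p) for p in top_products] or [
                ("Category", c) for c in top_categories
            ]
        else:
            topics = []  # unknown TopicType produces no candidates
        for resolved_type, topic_key in topics:
            candidates.append(
                {
                    "SubscriberKey": subscriber_key,
                    "ActionID": action["ActionID"],
                    "TopicType": resolved_type,
                    "TopicKey": topic_key,
                }
            )
    return candidates
```

Under this reading, a misspelled `TopicType` or an empty affinity list silently yields zero candidates, which is worth checking first when the output is empty.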
pip install -r requirements.txt
# If you don't ship a requirements.txt, install the essentials:
pip install pandas numpy lightgbm scikit-learn paramiko
Set SFTP + paths (env vars or a .env you export before running):
SFTP_HOST=*****.ftp.marketingcloudops.com
SFTP_PORT=22
SFTP_USER=*****
SFTP_PASSWORD=*****
SFTP_INBOUND_PATH=Export/sfmc_nba
SFTP_OUTBOUND_PATH=Import/sfmc_nba
EMAIL_STEM=EmailEngagement
CUST_STEM=CustomerActivity
PROD_STEM=ProductCatalog
IN_DIR=/opt/nba/in
OUT_DIR=/opt/nba/out
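For illustration, these variables could be read and validated at startup roughly as follows. `JobConfig` and `load_config` are hypothetical names, treating only `SFTP_HOST` and `SFTP_USER` as required is an assumption, and the fallback values simply mirror the examples above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class JobConfig:
    sftp_host: str
    sftp_user: str
    sftp_port: int
    inbound_path: str
    outbound_path: str
    email_stem: str
    cust_stem: str
    prod_stem: str
    in_dir: str
    out_dir: str


def load_config(env: Optional[Mapping[str, str]] = None) -> JobConfig:
    """Build a JobConfig from environment variables, failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in ("SFTP_HOST", "SFTP_USER") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    # The secret (password or key path) is deliberately not stored here;
    # it would be read only where the SFTP connection is opened.
    return JobConfig(
        sftp_host=env["SFTP_HOST"],
        sftp_user=env["SFTP_USER"],
        sftp_port=int(env.get("SFTP_PORT", "22")),
        inbound_path=env.get("SFTP_INBOUND_PATH", "Export/sfmc_nba"),
        outbound_path=env.get("SFTP_OUTBOUND_PATH", "Import/sfmc_nba"),
        email_stem=env.get("EMAIL_STEM", "EmailEngagement"),
        cust_stem=env.get("CUST_STEM", "CustomerActivity"),
        prod_stem=env.get("PROD_STEM", "ProductCatalog"),
        in_dir=env.get("IN_DIR", "/opt/nba/in"),
        out_dir=env.get("OUT_DIR", "/opt/nba/out"),
    )
```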
`python run_job.py`
What happens:
- Downloads the latest `YYYY-MM-DD-EmailEngagement.csv`, CustomerActivity, ProductCatalog
- Pulls `actions_catalog.csv` (local or remote)
- Auto-trains models if `artifacts/` is empty (bootstraps a send log when needed)
- Scores and writes in `out/`:
  - `ContactAffinity_DE.csv`
  - `ContactEngagement_DE.csv`
  - `NBA_ActionScores_DE.csv`
  - `NBA_Recommendations_DE.csv`
- Uploads all outputs to `Import/sfmc_nba` on SFMC SFTP
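Picking "the latest" export for a stem can be sketched as below, assuming filenames follow `YYYY-MM-DD-<Stem>.csv` as described in this README. The function name `latest_file` and the exact error text are illustrative, not taken from `run_job.py`.

```python
import re
from datetime import date
from typing import Iterable, Optional, Tuple


def latest_file(filenames: Iterable[str], stem: str) -> str:
    """Return the newest `YYYY-MM-DD-<stem>.csv` name, judged by its date prefix."""
    pattern = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-" + re.escape(stem) + r"\.csv$")
    best: Optional[Tuple[date, str]] = None
    for name in filenames:
        match = pattern.match(name)
        if not match:
            continue
        try:
            file_date = date(int(match[1]), int(match[2]), int(match[3]))
        except ValueError:
            continue  # skip impossible dates such as 2024-13-40
        if best is None or file_date > best[0]:
            best = (file_date, name)
    if best is None:
        raise FileNotFoundError(f"No files matching '{stem}'")
    return best[1]
```

The listing itself would come from the SFTP directory (for example Paramiko's `SFTPClient.listdir`); keeping the selection logic separate makes it easy to test offline.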
Run daily at 06:00 UTC (cron uses the server's time zone, so this assumes the droplet clock is set to UTC). Open the crontab with `crontab -e` and add:
`0 6 * * * cd /opt/sfmc-nba && /usr/bin/env -S bash -lc 'source ~/.profile && python run_job.py >> logs/cron.log 2>&1'`
- Affinity: time-decayed clicks/purchases → category/product confidence
- Engagement: recent open/click rates + stage (New, Active, Cooling, Dormant)
- Candidates: built from your `actions_catalog` per TopicType
- Score: LightGBM ensemble per ActionID (or Bayesian priors fallback)
- StrengthScore (0–100): blends predicted prob + uncertainty + tone
- Top-N per subscriber returned for easy plugging into Journeys
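The exact StrengthScore formula is not given here, so the following is only one plausible blend. It assumes `tone_alpha` runs from 0 (Aggressive: trust the point estimate) to 1 (Chill: use the lower bound), and that a hypothetical `reference_prob` maps to a score of 100. The bucket thresholds and bucket labels are likewise invented for the example.

```python
def strength_score(
    predicted: float,
    lower: float,
    tone_alpha: float = 0.5,       # 0 = Aggressive, 1 = Chill (assumed scale)
    reference_prob: float = 0.10,  # assumed probability that maps to 100
) -> float:
    """Blend the point estimate with its lower bound and scale to 0-100."""
    if not 0.0 <= tone_alpha <= 1.0:
        raise ValueError("tone_alpha must be between 0 and 1")
    # More uncertainty (a lower bound far below the estimate) costs more
    # points the "chiller" the tone is.
    adjusted = (1.0 - tone_alpha) * predicted + tone_alpha * lower
    scaled = min(max(adjusted / reference_prob, 0.0), 1.0)
    return round(100.0 * scaled, 1)


def strength_bucket(score: float, high: float = 70.0, medium: float = 40.0) -> str:
    """Map a 0-100 score to a coarse bucket (thresholds are assumptions)."""
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    return "Low"
```

For a candidate with `PredictedProb` 0.08 and `Lower` 0.04, an Aggressive tone (`tone_alpha=0`) scores it near 80, while a Chill tone (`tone_alpha=1`) scores it near 40.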
- Point an Automation Studio workflow to export the 3 DEs daily to `Export/sfmc_nba`
- Create 4 Import activities for the returning CSVs into:
  - `ContactAffinity_DE`
  - `ContactEngagement_DE`
  - `NBA_ActionScores_DE`
  - `NBA_Recommendations_DE`
- Use Decision Splits / Data Views / AMPscript to pick `PrimaryProductID` / `PrimaryCategory` and creative theme
- Decay/weights: `half_life_days`, `w_click`, `w_purchase`
- Engagement stages: `welcome_new_days`, `cooling_no_click_days`, `dormant_no_engage_days`
- Output: `nba_top_n`, `nba_expiry_hours`
- Tone handling: `tone_alpha` and strength thresholds
- SFTP key auth preferred; restrict to required folders
- No PII beyond keys needed for joins (`SubscriberKey`)
- Align with your internal data retention and DPA requirements
- “No files matching ‘EmailEngagement’”: ensure filenames follow `YYYY-MM-DD-<Stem>.csv` and the env stems match (`EMAIL_STEM`, `CUST_STEM`, `PROD_STEM`).
- Auto-train failed: product URLs are missing or don’t map to `ProductID`. Add `ProductURL` or tweak the bootstrap logic.
- Empty candidates: check the TopicType values in `actions_catalog.csv` and make sure affinity isn’t empty.
MIT (feel free to use/extend internally or with clients).
This repo accompanies the LinkedIn article “Building a Next Best Action Engine for SFMC Using DigitalOcean Droplets (via SFTP Integration)”. https://www.linkedin.com/pulse/building-next-best-action-engine-sfmc-using-droplets-via-selim-sevim-pr0gf/