A robust ETL pipeline using n8n and Python to sanitize raw JSON data, demonstrating hybrid low-code orchestration with custom script-based data transformation.

MAhsaanUllah/n8n-python-data-etl

🐍 Automated Data Cleaning Pipeline (n8n + Python)

📋 Project Overview

This project demonstrates a Hybrid Automation Workflow that combines the orchestration capabilities of n8n with the data processing power of Python. It models a real-world ETL (Extract, Transform, Load) process in which unstructured or "dirty" data is ingested, sanitized by a custom Python script, and prepared for downstream analytics.

🛠️ The Problem

Raw data from webhooks or APIs often contains inconsistencies:

  • Inconsistent casing (e.g., "ahsaan", "AHSAAN").
  • Formatting issues (e.g., currency symbols in numeric fields).
  • Data type mismatches (e.g., strings where integers are expected).

💡 The Solution

A custom Python Code Node within n8n handles the transformation logic:

  1. Ingestion: The workflow receives a JSON payload.
  2. Transformation (Python):
    • Standardizes names to Title Case.
    • Normalizes emails to lowercase.
    • Parses currency strings ("$500") into integers (500).
  3. Output: Returns clean, structured JSON ready for database insertion.
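
The three transformation steps above can be sketched as a single function. This is a minimal illustration, not the exact code that runs in the node: the field names `name` and `email` and the helper `clean_record` are assumptions made for the example, while `price` and `numeric_price` follow the sample in this README.

```python
def clean_record(record: dict) -> dict:
    """Return a cleaned copy of one record; the input dict is not modified."""
    cleaned = dict(record)

    # Standardize names to Title Case ("name" is an assumed field name).
    name = cleaned.get("name")
    if isinstance(name, str):
        cleaned["name"] = name.strip().title()

    # Normalize emails to lowercase ("email" is an assumed field name).
    email = cleaned.get("email")
    if isinstance(email, str):
        cleaned["email"] = email.strip().lower()

    # Parse currency strings such as "$500" into integers.
    price = cleaned.get("price")
    if isinstance(price, str):
        digits = price.replace("$", "").replace(",", "").strip()
        cleaned["numeric_price"] = int(digits)

    return cleaned


raw = {"name": "AHSAAN", "email": "Ahsaan@Example.COM", "price": "$500"}
print(clean_record(raw))
```

Returning a copy rather than mutating the input keeps the raw payload available for logging or comparison later in the workflow.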

💻 The Code Logic

# Sample of the transformation logic used in the node
if 'price' in json_data:
    # Strip the currency symbol, thousands separators, and surrounding
    # whitespace, then convert to an integer for calculation
    clean_price = str(json_data['price']).replace('$', '').replace(',', '').strip()
    json_data['numeric_price'] = int(clean_price)
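
Real payloads can contain prices that do not parse cleanly (for example "N/A" or "$19.99"), and `int()` raises `ValueError` on those. One possible way to keep a single bad record from failing the whole run is a small helper that returns `None` instead. The helper name `parse_price` and the choice to return `None` are assumptions for this sketch, not part of the original node.

```python
from typing import Optional


def parse_price(value) -> Optional[int]:
    """Convert a price such as "$500" to an int, or return None if it cannot be parsed."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        return None
    # Already-numeric values pass through unchanged.
    if isinstance(value, int):
        return value
    if not isinstance(value, str):
        return None

    digits = value.replace("$", "").replace(",", "").strip()
    try:
        return int(digits)
    except ValueError:
        # Covers empty strings, decimals like "19.99", and text like "N/A".
        return None


for sample in ["$500", "$1,250", 500, "$19.99", "N/A"]:
    print(repr(sample), "->", parse_price(sample))
```

Whether an unparseable price should become `None`, be dropped, or stop the workflow depends on the downstream database; this sketch only shows the non-failing option.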
