Monk-RET (Monk Retail Insights Engine) is an intelligent retail analytics platform powered by MonkDB, MonkDB's MCP, LangChain, Streamlit, and modern AI/ML pipelines.
It helps businesses gain actionable insights from large-scale retail data by orchestrating data ingestion, processing, and visualization seamlessly.
- 📊 Retail Analytics Engine – Ingests and processes large-scale retail datasets into MonkDB.
- 🧩 LangChain Orchestrator – Modular orchestration of tasks with LLMs
- ⚡ Batch Data Processing – Automated CSV ingestion & database syncing
- 📈 Interactive Dashboards – Streamlit-based UI for analytics & insights
- 🔄 Automation – Watchdog-powered auto-refresh for new datasets
```bash
git clone https://github.com/monkdbofficial/demo.retailagent.git
cd demo.retailagent
pip install -r requirements.txt
python3 watchdog_.py
```

This triggers the watcher and the downstream agent orchestration logic, which does the following in phases:
- Chunk the data into batches of 5,000 records and leverage Dask to process the records before publishing them to MonkDB tables (see the sketch after this list).
- Generate a Streamlit dashboard app with insights based on MonkDB SQL queries.
- Deploy the Streamlit dashboard to its destination, where it serves the charts and their metrics.
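A minimal sketch of the chunked ingestion phase. The `insert_batch` helper is a hypothetical stand-in for the repo's MonkDB Python SDK calls, and the cleanup step is illustrative:

```python
import dask.dataframe as dd

CHUNK_SIZE = 5_000  # records per batch, per the pipeline description

def insert_batch(batch) -> None:
    """Hypothetical helper: bulk-insert a pandas DataFrame into MonkDB via its SDK."""
    raise NotImplementedError

def ingest_csv(path: str) -> None:
    # Read lazily; Dask splits the file into partitions for parallel processing.
    ddf = dd.read_csv(path, blocksize="16MB")
    ddf = ddf.dropna()  # example cleanup step before publishing

    # Stream partition by partition, publishing fixed-size batches.
    for delayed_part in ddf.to_delayed():
        part = delayed_part.compute()
        for start in range(0, len(part), CHUNK_SIZE):
            insert_batch(part.iloc[start : start + CHUNK_SIZE])
```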
As highlighted in the data flow diagram, watchdog triggers agent execution.
- Upload agent – uploads the processed data to MonkDB.
- Generate Insights agent – generates insights by querying the database with MonkDB SQL via MCP, exposed in the agent's tool interface, and uses them to build the dashboard pack (a sketch follows this list).
- Deploy agent – deploys the pack to its destination.
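For illustration, the insights agent's tool interface could be wired up roughly as follows with LangChain and Mistral via Ollama. The tool name, the SELECT guard, and the `execute_monkdb_query` helper are assumptions, not the repo's actual MCP implementation:

```python
from langchain_core.tools import tool
from langchain_ollama import ChatOllama  # assumes the langchain-ollama package

def execute_monkdb_query(sql: str) -> str:
    """Placeholder for the MonkDB SDK / MCP call the real agent uses."""
    raise NotImplementedError

@tool
def run_select(sql: str) -> str:
    """Run a read-only SELECT against MonkDB and return the rows as text."""
    # Guard mirroring MonkDB's SELECT-only MCP interface.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("Only SELECT statements are allowed")
    return execute_monkdb_query(sql)

# The Mistral model provisioned via `ollama run mistral`.
llm = ChatOllama(model="mistral")
agent_llm = llm.bind_tools([run_select])  # expose the SQL tool to the model

response = agent_llm.invoke(
    "Which brands in trent.products have the highest average discount?"
)
```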
- Languages: Python
- Database: MonkDB and its Python SDK
- Frameworks: MonkDB's MCP, LangChain
- Data: Dask
- DevOps: Watchdog
- Visualization: Plotly, Streamlit
- Install and provision MonkDB as per its documentation.
- Provision a MonkDB user via PSQL, as highlighted in MonkDB's documentation.
- Update the `config.ini` file located in the `config` folder of this repo. Ensure the IP address in the `DB_HOST` variable is updated; it denotes the instance where MonkDB is installed.
- Provision an LLM model. We are using Mistral via Ollama (`ollama run mistral`).
```ini
[database]
DB_HOST = xx.xx.xx.xxx
DB_PORT = 4200
DB_USER = testuser
DB_PASSWORD = testpassword
DB_SCHEMA = trent
TABLE_NAME = products
```
- Also, ensure `.env` in the root of this repo is updated with the correct IP address of MonkDB's host.
```env
MONKDB_HOST=xx.xx.xx.xxx
MONKDB_PORT=4200
MONKDB_USER=testuser
MONKDB_PASSWORD=testpassword
MONKDB_SCHEMA=trent
MONKDB_API_PORT=4200
# Optional OTEL configuration, which can be enabled or disabled.
MONKDB_OTEL_ENABLED=false
```
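For reference, a minimal sketch of how these settings could be read at runtime, assuming the `python-dotenv` package (the repo's actual loading code may differ):

```python
import configparser
import os

from dotenv import load_dotenv  # assumes the python-dotenv package

# Read the [database] section from config/config.ini.
config = configparser.ConfigParser()
config.read("config/config.ini")
db = config["database"]
print(db["DB_HOST"], db.getint("DB_PORT"), db["DB_SCHEMA"])

# Load the root .env so the MONKDB_* variables reach the process environment.
load_dotenv()
print(os.environ["MONKDB_HOST"], os.environ["MONKDB_PORT"])
```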
- As highlighted before, please create a virtual environment and activate it before installing the requirements with pip.
- Drop a new retail CSV into the `/csv_folder` folder.
- `watchdog_.py` detects it and inserts the data into the DB (see the watcher sketch below).
- `langchain_orch.py` & `gen_insights_force.py` generate AI-powered insights.
- Open `streamlit_app.py` → interactive analytics dashboard.
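The detection step follows the standard `watchdog` observer pattern. A minimal sketch of what `watchdog_.py` could look like, with the handler and `run_pipeline` hook as illustrative stand-ins for the repo's orchestration:

```python
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

def run_pipeline(path: str) -> None:
    """Hypothetical hook into the ingestion + insights orchestration."""
    print(f"New dataset detected: {path}")

class CsvHandler(FileSystemEventHandler):
    def on_created(self, event):
        # React only to new CSV files dropped into the watched folder.
        if not event.is_directory and event.src_path.endswith(".csv"):
            run_pipeline(event.src_path)

observer = Observer()
observer.schedule(CsvHandler(), path="csv_folder", recursive=False)
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()
```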
Run the below command to execute performance testing:

```bash
python3 monkdb_pipeline_testrunner.py --csv datasets/_sample_products.csv --table trent.products --where "1=1" --parity-sample 200 --perf-repeats 20 --out-json reports/report.json --out-md reports/report.md
```

| Argument | Purpose | Example Value |
|---|---|---|
| --csv | Path to the source CSV file used as the “gold” reference for accuracy checks. The script computes KPIs (row counts, averages, discount bands, etc.) on this file and compares them to MonkDB query results. | datasets/_sample_products.csv |
| --table | Fully-qualified database table name to query inside MonkDB. The test runner runs SELECTs on this table and compares them to the CSV-derived KPIs. | trent.products |
| --where | Optional SQL WHERE clause applied to every database query. Lets you scope tests to a subset of rows (e.g., a specific brand or time range). "1=1" is the neutral default (no filtering). | "1=1" |
| --parity-sample | Number of rows to sample for row-parity testing. The script randomly picks up to this many primary-key combinations from the CSV and checks if they exist in the DB. Skipped if no primary key. | 200 |
| --perf-repeats | Number of times to repeat each key query (KPIs, discount bands, brand share) for latency measurement. The script records P50, P95, and P99 response times across these runs. | 20 |
| --out-json | Output path for the machine-readable report (JSON). Contains accuracy metrics, latency percentiles, and row-parity results. | reports/report.json |
| --out-md | Output path for the human-readable Markdown report. Summarises key accuracy and performance findings for easy sharing or inclusion in client documentation. | reports/report.md |
This executes the pipeline test-runner script and generates reports in the `reports/` folder.
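For context, the latency percentiles in the report correspond to the standard computation over repeated query runs; a sketch of that computation (not the runner's exact code):

```python
import statistics
import time

def measure_latency(run_query, repeats: int = 20) -> dict:
    """Time one query `repeats` times and return latency percentiles in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # e.g. one KPI, discount-band, or brand-share SELECT
        samples.append(time.perf_counter() - start)
    # quantiles(n=100) yields the 1st..99th percentile cut points.
    q = statistics.quantiles(samples, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}
```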
We executed the performance tests on the below DigitalOcean instance:
- OS: Ubuntu 25.04 x64
- vCPUs: 4
- RAM / SSD: 8 GB / 240 GB
- Family: General compute
Due to cost considerations, testing was conducted on a modest DigitalOcean droplet (4 vCPU / 8 GB RAM / 240 GB SSD). Consequently, the KPI, discount-band, and brand-share queries measured around 0.8–1.1 s P95.
In production we recommend AWS m6in (or equivalent) instances. These are powered by 3rd-Gen Intel Xeon Scalable (“Ice Lake”) CPUs up to 3.5 GHz, with 200 Gbps networking and 80–100 Gbps EBS throughput, and scale to 128 vCPUs / 512 GiB RAM. Our enterprise MonkDB customers running similar analytics consistently achieve sub-300 ms P95 latencies on such hardware.
This means the latencies observed on DigitalOcean should be viewed as conservative; significantly lower numbers are expected on production-grade instances.
- Public Data Source: The dataset (Myntra Product Listings) is released under a CC0 Public Domain license and contains only publicly available product catalogue information—no personal or sensitive consumer data.
- Secure Credentials: Database and API credentials are stored in `.env` and `config.ini`, never hard-coded or exposed in logs or the repository. This can be secured even further in production by bringing in tools such as HashiCorp Vault or similar alternatives.
- Read-only Analytics: All analytics operations use MonkDB's MCP SELECT-only interface; no user-identifiable or private data is written or modified.
- Logging Hygiene: Application logs exclude secrets and comply with data minimisation best practices.
You may swap:
- LangChain with another agentic framework.
- Streamlit with another frontend framework.
- The Mistral model with another LLM (an LLM is a prerequisite for the agentic framework).
This repo is licensed under the permissive Apache 2.0 license.

