Lead Data Engineer with 10+ years across data engineering, machine learning, and software engineering. I build data platforms from the ground up and scale them.
Currently exploring RAG pipelines and LLM-powered data systems. Open to Lead/Senior Data Engineering roles in Toronto.
- Design and build end-to-end data platforms (batch + streaming, medallion architecture, facts/dimensions)
- Lead data engineering teams — hiring, mentoring, and shipping
- Bridge data engineering and ML — from pipelines to production models
Data Engineering: Apache Spark, PySpark, Airflow (MWAA), AWS Glue, Databricks, Snowflake, Delta Lake, Azure Data Factory, Synapse
Cloud: AWS (S3, Redshift, RDS, DynamoDB, Lambda) · Azure (Data Lake Gen2, SQL Server, Cosmos DB)
ML & AI: PyTorch, Keras, scikit-learn, Spark MLlib, NLP
Languages: Python, SQL, JavaScript
Lead Data Engineer at TalkShopLive — Oct 2023 – Mar 2026
- Founded the data engineering function from zero — built the entire data platform, pipelines, and analytics infrastructure as the first data hire.
- Architected a real-time video analytics system processing millions of events across iOS, Android, Web, and SDK platforms.
- Designed and implemented a custom clickstream collection system (build vs. buy decision that saved significant vendor costs).
Lead Data Engineer at Best Buy Canada — Sep 2020 – Sep 2023
- Led a team of 14 data engineers (onshore + offshore) delivering data products across the organization.
- Built an ML-driven Margin Based Bidding system that generated $1M+ in incremental revenue.
- Established data governance practices and engineering standards across the data platform.
Machine Learning Intern at Flipboard — Sep 2019 – Dec 2019
- Developed and shipped AutoGen — a recommendation engine generating top articles monthly for each topic in Flipboard's database.
- Built and productionized a listicle classifier for news articles using Python/PyTorch/AWS, integrated into the news feed pipeline.
Software Engineer II at Oracle — Jul 2014 – Aug 2017
- Served as interop specialist for ZFS Storage products, qualifying OS/protocol/kernel combinations across Linux, Solaris, AIX, Windows, and macOS.
- Built API modules and automated testing for ZFS test infrastructure.
- Led an agile project simulating high-priority customer-facing failure scenarios.
MSc Computer Science — Simon Fraser University (Big Data & Machine Learning, GPA 3.74)
BEng Computer Science — National Institute of Engineering, India (9.2/10)
- CryptoIntel — Crypto analytics dashboard with sentiment analysis, LSTM predictions, and topic modeling
- Semantic Search for Audio — NLP-powered semantic search across speech audio
- Defence Against One-Pixel Attack — Neural network defence mechanisms against adversarial attacks
- Yelp Dataset Analysis — Recommendation engine built with PySpark MLlib

