sourav200199/README.md

👋 Hi, I’m Sourav Chakraborty

🚀 Data Engineer
💡 Strong foundation in Data Science and Gen AI; building scalable data & AI systems
📍 Hyderabad, India


🧠 About Me

I’m a Data Engineer at Accordion India with ~2 years of hands-on experience working across data engineering, analytics, and Generative AI.

I enjoy building end-to-end systems, from ingesting and transforming large-scale data to applying ML & LLM-based intelligence and delivering insights through dashboards and APIs.

I’ve worked on production-grade ETL pipelines, BI systems, and GenAI-powered applications involving structured, unstructured, and multi-modal data.


🛠️ Tech Stack

👨‍💻 Languages

  • Python, SQL (T-SQL, PL/SQL)

🔄 Data Engineering

  • ETL / ELT Pipelines
  • DBT, Snowflake
  • SQL Optimization, Data Modeling
  • Git, CI/CD

☁️ Cloud & Platforms

  • Azure (Fundamentals)
  • Azure Data Factory (Basics)

📊 Analytics & Visualization

  • Tableau
  • Power BI (Basics)
  • Streamlit

🤖 AI / Machine Learning

  • Exploratory Data Analysis (EDA)
  • Feature Engineering
  • Supervised ML (Scikit-learn)
  • Generative AI (LLMs, RAG Systems)
  • LangChain

🚀 Featured Projects

🏪 Intelligent Store Assistant

Multi-modal AI-powered retail analytics platform

  • Integrated 2+ GB of structured & unstructured data, including Google Reviews, CCTV images/footage, and 1-year transactional store data
  • Applied LLMs (GPT-4o mini) for cleanliness scoring, counter utilization insights, and AI-generated feedback
  • Used YOLOv8 for footfall & people detection
  • Built an interactive Streamlit dashboard with a FastAPI backend
  • Implemented a LangChain-powered chatbot for querying store performance

🔗 Tech: LangChain, GPT-4o mini, YOLOv8, FastAPI, Streamlit


📦 Dim-Predictor

End-to-end data & ML pipeline for predicting product dimensions

  • Processed ~2 GB of real-time product data (structured + unstructured)
  • Performed data cleaning, feature engineering, and supervised ML modeling
  • Built a robust pipeline ensuring consistent handling of missing and noisy data

🔗 Tech: Python, Scikit-learn, Random Forest


💬 Whats-Insight

Interactive WhatsApp chat analytics application

  • Generated 10+ insights on communication patterns for personal & group chats
  • Built visualizations for activity heatmaps, emoji usage, URL/file sharing, and sentiment analysis

🔗 Tech: Python, Streamlit, Plotly


🏅 Statolympics

Data-driven Olympics analytics dashboard

  • Visualized historical insights across all Summer Olympics (through 2016)
  • Implemented country-wise, athlete-level, sport-wise, and gender-based analysis

🔗 Tech: Pandas, NumPy, Streamlit, Plotly


📫 Let’s Connect


📈 GitHub Stats

Top Languages


⭐ If you like my work, feel free to star ⭐ the repositories or reach out for collaboration!

Popular repositories

  1. Networks-and-Communication (Public)

    This repo contains important Computer Networks problems solved in Python and C++. Some of the solutions use Python socket programming for a better understanding of the problems.

    Python 1

  2. Heart_Disease_Classification (Public)

    A typical binary classification problem: predicting whether a patient has heart disease

    Jupyter Notebook

  3. Bulldozer-price-prediction (Public)

    A typical regression problem: predicting bulldozer prices using time series data

    Jupyter Notebook

  4. PHP_Basics (Public)

    This repository contains all the basics of PHP

    PHP

  5. IEEE-Web-Dev-Task1 (Public)

    HTML

  6. VIT-Student-Program-Managent (Public)

    A basic application for registering students for an event in a database and editing those records

    PHP