Data Engineer
Strong foundation in Data Science and Gen AI, building scalable data & AI systems
Hyderabad, India
I'm a Data Engineer at Accordion India with ~2 years of hands-on experience across data engineering, analytics, and Generative AI.
I enjoy building end-to-end systems, from ingesting and transforming large-scale data to applying ML & LLM-based intelligence and delivering insights through dashboards and APIs.
I've worked on production-grade ETL pipelines, BI systems, and GenAI-powered applications involving structured, unstructured, and multi-modal data.
- Python, SQL (T-SQL, PL/SQL)
- ETL / ELT Pipelines
- DBT, Snowflake
- SQL Optimization, Data Modeling
- Git, CI/CD
- Azure (Fundamentals)
- Azure Data Factory (Basics)
- Tableau
- Power BI (Basics)
- Streamlit
- Exploratory Data Analysis (EDA)
- Feature Engineering
- Supervised ML (Scikit-learn)
- Generative AI (LLMs, RAG Systems)
- LangChain
Multi-modal AI-powered retail analytics platform
- Integrated 2+ GB of structured & unstructured data, including Google Reviews, CCTV images/footage, and one year of transactional store data
- Applied LLMs (GPT-4o mini) for cleanliness scoring, counter utilization insights, and AI-generated feedback
- Used YOLOv8 for footfall & people detection
- Built an interactive Streamlit dashboard with a FastAPI backend
- Implemented a LangChain-powered chatbot for querying store performance
Tech: LangChain, GPT-4o mini, YOLOv8, FastAPI, Streamlit
End-to-end data & ML pipeline for predicting product dimensions
- Processed ~2 GB of real-time product data (structured + unstructured)
- Performed data cleaning, feature engineering, and supervised ML modeling
- Built a robust pipeline ensuring consistent handling of missing and noisy data
Tech: Python, Scikit-learn, Random Forest
Interactive WhatsApp chat analytics application
- Generated 10+ insights on communication patterns for personal & group chats
- Built visualizations for activity heatmaps, emoji usage, URL/file sharing, and sentiment analysis
Tech: Python, Streamlit, Plotly
Data-driven Olympics analytics dashboard
- Visualized historical insights across all Summer Olympics (through 2016)
- Implemented country-wise, athlete-level, sport-wise, and gender-based analysis
Tech: Pandas, NumPy, Streamlit, Plotly
- LinkedIn: https://www.linkedin.com/in/souravchakraborty99
- Portfolio: https://iamsourav.pythonanywhere.com/
- Email: sourav.992001@gmail.com
If you like my work, feel free to star the repositories or reach out for collaboration!