Anoushka210/README.md

Hey, I'm Anoushka 👋

BE Information Technology · 2nd Year · Passionate about Data

LinkedIn · GitHub · Resume


About Me

I'm a second-year IT student who loves turning raw data into meaningful stories. Currently diving deep into data analysis, visualisation & ML, making numbers make sense.

  • 🎓 BE in Information Technology (2nd Year)
  • 📊 Interested in Data Analysis, Visualisation & AI/ML
  • 🌱 Currently learning: PySpark, NLP, and data pipeline engineering
  • πŸ” Always curious, always building

🛠 Tech Stack

Languages

Python · Java · C++ · JavaScript

Data & ML

Pandas · Matplotlib · Seaborn · PySpark · scikit-learn · Jupyter

Tools

Git · VS Code


🚀 Projects

End-to-end PySpark ETL pipeline on the Olist Brazilian E-Commerce dataset, covering 100K+ orders across multiple relational tables. Features a modular architecture, feature engineering for delivery performance, late-order flagging, and a Parquet-format master dataset for optimized downstream reads.
Tech: PySpark · ETL · Parquet · Big Data
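
To give a feel for the late-order flagging mentioned above, here is a minimal plain-Python sketch. It is an illustration only, not the pipeline's PySpark code: the rows, the field names (`estimated`, `delivered`), and the helper `add_delivery_features` are all hypothetical, and the actual project may define "late" differently.

```python
from datetime import datetime

# Hypothetical order rows; field names are illustrative, not the dataset's schema.
ORDERS = [
    {"order_id": "a1", "estimated": "2018-01-10", "delivered": "2018-01-08"},
    {"order_id": "b2", "estimated": "2018-01-10", "delivered": "2018-01-13"},
    {"order_id": "c3", "estimated": "2018-01-10", "delivered": None},
]

DATE_FORMAT = "%Y-%m-%d"


def add_delivery_features(order):
    """Return a copy of the order with delay_days and is_late added."""
    row = dict(order)
    if order["delivered"] is None:
        # Not delivered yet: leave both features unknown rather than guessing.
        row["delay_days"] = None
        row["is_late"] = None
        return row
    estimated = datetime.strptime(order["estimated"], DATE_FORMAT)
    delivered = datetime.strptime(order["delivered"], DATE_FORMAT)
    # Positive delay means the order arrived after the estimated date.
    row["delay_days"] = (delivered - estimated).days
    row["is_late"] = row["delay_days"] > 0
    return row


FLAGGED = [add_delivery_features(order) for order in ORDERS]
```

In a Spark job the same idea would typically be expressed as column expressions over a DataFrame; the per-row function here just keeps the logic easy to read.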


Retrieval-based AI chatbot that uses TF-IDF vectorization and cosine similarity to map user queries to a predefined knowledge base. Includes a hybrid engine for FAQs and small talk, confidence-threshold gating, and performance analytics visualized with Matplotlib & Seaborn.
Tech: Python · NLP · TF-IDF · scikit-learn · Matplotlib · Seaborn
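
The retrieval idea above (TF-IDF vectors, cosine similarity, and a confidence threshold) can be sketched with the standard library alone. This is an assumption-heavy illustration rather than the project's code: the project lists scikit-learn, while this sketch hand-rolls a smoothed TF-IDF, and the `FAQ` entries, the `answer` helper, and the `0.3` threshold are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical knowledge base: known question -> canned answer.
FAQ = {
    "how do i reset my password": "Use the 'Forgot password' link on the login page.",
    "what are your opening hours": "We are open 9am to 5pm, Monday to Friday.",
    "how can i contact support": "Send a message to the support desk.",
}


def tokenize(text):
    """Lowercase, drop question marks, and split on whitespace."""
    return text.lower().replace("?", "").split()


def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


QUESTIONS = list(FAQ)
IDF = build_idf([tokenize(q) for q in QUESTIONS])
VECTORS = [tfidf(tokenize(q), IDF) for q in QUESTIONS]


def answer(query, threshold=0.3):
    """Return (reply, score); reply is None when the best match is below threshold."""
    query_vec = tfidf(tokenize(query), IDF)
    scores = [cosine(query_vec, vec) for vec in VECTORS]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores[best]
    return FAQ[QUESTIONS[best]], scores[best]
```

Calling `answer("How do I reset my password?")` returns the stored reply for that question, while an unrelated query such as `answer("tell me a joke")` falls below the threshold and returns `None`, which is where a small-talk or fallback handler could take over.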


Exploratory data analysis of 20,000+ global air quality records, identifying pollution trends across 50+ cities and visualizing seasonal PM2.5 patterns.
Tech: Python · Pandas · Matplotlib


OOP-based inventory system in Java that uses inheritance and polymorphism for product categorization, file-based persistence via serialization, and automated daily stock monitoring and report generation.
Tech: Java · OOP · Serialization


⭐ Feel free to explore my repositories and drop a star if something interests you!

Pinned

  1. inventory-management-java (Public)

    A simple Java-based smart inventory management system demonstrating core object-oriented programming concepts and basic stock tracking logic.

    Java

  2. air-quality-analysis (Public)

    An Exploratory Data Analysis and Visualization (EDAV) project examining global air pollution patterns. Features rigorous data cleaning, advanced 3D visualizations, K-Means clustering for city profi…

    Jupyter Notebook

  3. Smart-FAQ-Chatbot-AI-Agent (Public)

    An intelligent FAQ chatbot agent built with Python using TF-IDF Vectorization and Cosine Similarity for natural language query matching. Features include small-talk handling, confidence scoring, an…

    Python

  4. pyspark-ecommerce-etl-pipeline (Public)

    Production-style PySpark ETL pipeline processing 100K+ e-commerce records with optimized joins, feature engineering, and scalable Parquet outputs.

    Jupyter Notebook

  5. Srividhyambika/Question-paper-analyzer (Public)

    JavaScript