This repository contains the tasks I completed during my internship at Maincraft Technology.
Each task is organized in a separate folder and documents the work, learning, and outcomes of the internship.
Overview
This task focuses on analyzing student performance using real educational datasets. The goal was to understand how different factors such as study time, gender, and background affect students’ final grades.
Technologies Used
Python, Pandas, NumPy, Matplotlib, Seaborn, Jupyter Notebook
Work Done
- Loaded and explored the Math and Portuguese student datasets
- Performed data cleaning and validation
- Analyzed student performance and influencing factors
- Created visualizations to represent insights
- Merged both datasets using the exact logic provided in the reference R file
- Identified 382 students common to both datasets
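The merge step can be sketched in pandas as follows. This is a minimal illustration, not the notebook's actual code: the key columns are an assumption taken from the `student-merge.R` script distributed with the UCI student performance data, and the `_math`/`_por` suffixes and the toy frames are hypothetical.

```python
import pandas as pd

# Assumed key columns (from the UCI student-merge.R script); adjust them if
# the reference R file used for this task lists different ones.
MERGE_KEYS = [
    "school", "sex", "age", "address", "famsize", "Pstatus",
    "Medu", "Fedu", "Mjob", "Fjob", "reason", "nursery", "internet",
]


def merge_students(math_df, por_df, keys=MERGE_KEYS):
    """Inner-join the Math and Portuguese datasets on shared attributes."""
    # R's merge() defaults to an inner join and renames clashing columns
    # with .x/.y; the suffixes here play the same role.
    return math_df.merge(
        por_df, on=list(keys), how="inner", suffixes=("_math", "_por")
    )


# Toy data with a reduced key set, only to show the mechanics.
math_demo = pd.DataFrame({
    "school": ["GP", "GP", "MS"],
    "sex": ["F", "M", "F"],
    "age": [17, 16, 18],
    "G3": [10, 12, 15],
})
por_demo = pd.DataFrame({
    "school": ["GP", "MS", "MS"],
    "sex": ["F", "F", "M"],
    "age": [17, 18, 17],
    "G3": [11, 14, 9],
})
demo = merge_students(math_demo, por_demo, keys=["school", "sex", "age"])
print(len(demo))  # prints 2
```

On the real files, the row count of the merged frame can be compared against the 382 students reported above as a check that the Python merge matches the R reference.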
Learning Outcomes
- Learned how to work with real-world datasets
- Improved skills in data cleaning and exploratory data analysis
- Understood how to validate results using a reference implementation
- Gained experience in visualizing data for better interpretation
Notes
This task was completed following standard data science practices and reflects internship-level analytical work.
Overview
This task focuses on analyzing the Titanic dataset to understand survival patterns among passengers. The goal was to determine how factors such as gender, passenger class, and age affected survival chances.
Technologies Used
Python, Pandas, Matplotlib, Seaborn, Jupyter Notebook
Work Done
- Loaded and explored the Titanic dataset
- Performed data cleaning by handling irrelevant columns and missing values
- Conducted exploratory data analysis (EDA) on passenger survival
- Analyzed survival rates based on gender, passenger class, and age groups
- Created bar charts and histograms to visualize survival patterns
- Used numerical analysis along with visualizations to support conclusions
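The survival-rate analysis above can be sketched as a grouped mean. This is a minimal example that assumes the common Kaggle column names (`Survived`, `Sex`, `Pclass`, `Age`); the age bands and the toy rows are hypothetical, since the notebook's actual cut points are not stated here.

```python
import pandas as pd

# Hypothetical age bands for illustration only.
AGE_BINS = [0, 12, 18, 60, 120]
AGE_LABELS = ["Child", "Teen", "Adult", "Senior"]


def survival_rate_by(df, column):
    """Share of survivors per group: the mean of the 0/1 Survived column."""
    # observed=True keeps only age bands that actually occur in the data.
    return df.groupby(column, observed=True)["Survived"].mean()


# Toy rows standing in for the real dataset.
demo = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1, 0],
    "Sex": ["female", "male", "female", "male", "female", "male"],
    "Pclass": [1, 3, 2, 3, 1, 2],
    "Age": [29.0, 40.0, 8.0, 22.0, 35.0, 70.0],
})
# Bin the continuous Age column into labeled groups.
demo["AgeGroup"] = pd.cut(demo["Age"], bins=AGE_BINS, labels=AGE_LABELS)

by_sex = survival_rate_by(demo, "Sex")
by_class = survival_rate_by(demo, "Pclass")
by_age = survival_rate_by(demo, "AgeGroup")
print(by_sex, by_class, by_age, sep="\n\n")
```

Each result is a pandas Series, so it can be passed directly to a bar chart to pair the numbers with a visualization.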
Learning Outcomes
- Learned how to handle real-world messy datasets
- Improved skills in exploratory data analysis and data visualization
- Understood how to answer business questions using data
- Gained experience in interpreting numerical and graphical results
Notes
This task was completed following standard data science and EDA practices and reflects internship-level analytical work.
Overview
This task focuses on advanced exploratory data analysis (EDA) of the Titanic dataset to uncover deeper survival patterns. The objective was to move beyond basic analysis and extract meaningful insights using feature engineering and structured visual storytelling.
Technologies Used
Python, Pandas, Matplotlib, Seaborn, Jupyter Notebook
Work Done
- Loaded and analyzed the Titanic training dataset
- Handled missing values using appropriate imputation techniques
- Removed irrelevant columns to improve dataset quality
- Created new features such as Age Group and Family Size (SibSp + Parch + 1)
- Analyzed survival rates based on:
  - Age Groups
  - Embarkation Ports
  - Family Size
- Generated advanced visualizations including:
  - Age distribution histogram
  - Correlation heatmap
  - Survival rate bar plots
- Interpreted patterns using both statistical summaries and graphical analysis
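The imputation and feature engineering steps above can be sketched as follows. This is a minimal example under stated assumptions: median imputation for `Age` and most-frequent-value imputation for `Embarked` are common choices but not confirmed as the ones used in the notebook, the column names follow the Kaggle training file, and the toy rows are hypothetical. Only the Family Size formula (SibSp + Parch + 1) comes from the task description.

```python
import pandas as pd


def engineer_features(df):
    """Fill missing values and add the FamilySize feature."""
    out = df.copy()
    # Assumed imputation: median for numeric Age, mode for Embarked.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Siblings/spouses + parents/children + the passenger themselves.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    return out


# Toy rows standing in for the real training data.
demo = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1],
    "Age": [22.0, None, 30.0, 40.0, None],
    "SibSp": [1, 0, 0, 3, 1],
    "Parch": [0, 0, 2, 1, 1],
    "Embarked": ["S", "C", None, "S", "Q"],
})
features = engineer_features(demo)

# Survival rate per engineered or categorical feature.
rate_by_family = features.groupby("FamilySize")["Survived"].mean()
rate_by_port = features.groupby("Embarked")["Survived"].mean()

# Pairwise Pearson correlations; this matrix is what a heatmap would show.
corr = features[["Survived", "Age", "FamilySize"]].corr()
print(rate_by_family, rate_by_port, corr, sep="\n\n")
```

The `corr` DataFrame can be handed to a plotting call such as Seaborn's `heatmap` to produce the correlation heatmap listed above.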
Learning Outcomes
- Strengthened understanding of feature engineering in data analysis
- Improved ability to clean and transform real-world datasets
- Gained deeper experience in correlation analysis and visualization techniques
- Learned how to derive structured insights from raw data
- Enhanced skills in presenting data findings in a professional format
Notes
This task demonstrates advanced EDA skills and reflects structured analytical thinking applied to real-world data as part of internship-level data science work.
- Each task folder is self-contained and includes all relevant files.
- Code is written with clarity and proper comments.
- This repository represents my learning and growth during the internship.
Anmol Patel
Intern – Maincraft Technology