Projects related to data visualisation, cleaning, preprocessing, machine learning, and deep learning (ANN and CNN), including model training and model evaluation

WhereisHussain/Data-Science

🧠 Data Science Projects Repository

This repository contains a collection of practical data science projects covering the complete machine learning workflow, including:

  • 🧹 Data Cleaning
  • 🧪 Data Preprocessing
  • 📊 Data Visualization
  • 🏗️ Model Building & Compilation
  • 🏃 Model Training & Evaluation

These projects aim to help learners and practitioners understand each phase of working with data and machine learning models.


📁 Project Structure

1. 🧼 Data Cleaning Project

Objective:
Clean and standardize a raw dataset with missing values, duplicates, incorrect data types, and inconsistent formatting.

Techniques Used:

  • Handling missing data (mean, median, drop)
  • Removing duplicates
  • Converting data types
  • String formatting and trimming
  • Date and time conversion

Tools: pandas, numpy

📄 File: data_cleaning.py
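
The techniques listed above can be sketched roughly as follows. This is a minimal illustration and not the contents of data_cleaning.py: the `raw` DataFrame, its column names, and the `clean` helper are all hypothetical, and the choice of median for `age` and mean for `income` is just one reasonable option.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data, invented for illustration only.
raw = pd.DataFrame({
    "name": ["  Alice ", "bob", "bob", "CAROL", None],
    "age": ["34", "29", "29", None, "41"],
    "income": [52000.0, np.nan, np.nan, 61000.0, 48000.0],
    "signup": ["2023-01-15", "2023-02-30", "2023-02-30",
               "2023-03-10", "2023-04-01"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of df (assumed columns: name, age, income, signup)."""
    df = df.copy()

    # String formatting and trimming: strip whitespace, normalise case.
    df["name"] = df["name"].str.strip().str.title()

    # Remove exact duplicate rows, then drop rows with no name at all.
    df = df.drop_duplicates().dropna(subset=["name"]).copy()

    # Convert data types: age arrives as text, so parse it as a number.
    df["age"] = pd.to_numeric(df["age"], errors="coerce")

    # Handle missing data: median for age, mean for income.
    df["age"] = df["age"].fillna(df["age"].median())
    df["income"] = df["income"].fillna(df["income"].mean())

    # Date conversion: impossible dates such as 2023-02-30 become NaT.
    df["signup"] = pd.to_datetime(df["signup"], errors="coerce")

    return df.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
print(cleaned.dtypes)
```

Duplicates are removed before imputing so that repeated rows do not skew the mean and median used as fill values.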


2. ⚙️ Data Preprocessing Project

Objective:
Prepare clean data for machine learning algorithms by transforming features and labels.

Techniques Used:

  • Feature scaling (StandardScaler, MinMaxScaler)
  • Encoding categorical variables (OneHotEncoder, LabelEncoder)
  • Train-test split
  • Data balancing (optional: SMOTE)

Tools: pandas, scikit-learn, numpy
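
As a rough sketch of how these steps fit together with scikit-learn, the example below encodes the label, splits the data, and then fits the scaler and encoder on the training portion only. The synthetic `data` frame and its column names are assumptions made for illustration; SMOTE is not shown because it lives in the separate imbalanced-learn package.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Hypothetical dataset: two numeric features, one categorical, one label.
rng = np.random.default_rng(0)
n = 100
data = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "income": rng.normal(50000, 12000, size=n),
    "region": np.array(["north", "south", "east"])[np.arange(n) % 3],
    "label": np.where(np.arange(n) % 2 == 0, "yes", "no"),
})

X = data.drop(columns="label")
# LabelEncoder is meant for the target: "no" -> 0, "yes" -> 1.
y = LabelEncoder().fit_transform(data["label"])

# Split first so that scaling statistics come from the training set only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

preprocess = ColumnTransformer(
    transformers=[
        # Swap in MinMaxScaler() here to scale into [0, 1] instead.
        ("num", StandardScaler(), ["age", "income"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

X_train_t = preprocess.fit_transform(X_train)  # learn parameters on train
X_test_t = preprocess.transform(X_test)        # reuse them on test

print(X_train_t.shape, X_test_t.shape)
```

Fitting the transformers after the split avoids leaking test-set statistics into the features the model is trained on.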


3. 📈 Data Visualization Project

Objective:
Understand the dataset using visual exploration techniques and identify patterns or anomalies.

Techniques Used:

  • Histograms, box plots, scatter plots
  • Correlation heatmaps
  • Pair plots
  • Class distribution graphs

Tools: matplotlib, seaborn, pandas
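
One possible way to produce most of these plots in a single figure is sketched below. The synthetic `df`, its columns, and the `plot_overview` helper are hypothetical, not taken from the repository.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical dataset, generated only to have something to plot.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(18, 70, size=n)
df = pd.DataFrame({
    "age": age,
    "income": 20000 + 800 * age + rng.normal(0, 8000, size=n),
    "label": rng.choice(["no", "yes"], size=n, p=[0.7, 0.3]),
})


def plot_overview(data: pd.DataFrame):
    """Draw five exploratory plots on one figure and return the figure."""
    fig, axes = plt.subplots(2, 3, figsize=(15, 8))

    # Histogram: distribution of a single numeric feature.
    axes[0, 0].hist(data["income"], bins=20)
    axes[0, 0].set_title("Income histogram")

    # Box plot: spread and outliers per class.
    sns.boxplot(data=data, x="label", y="income", ax=axes[0, 1])
    axes[0, 1].set_title("Income by class")

    # Scatter plot: relationship between two features.
    sns.scatterplot(data=data, x="age", y="income", hue="label", ax=axes[0, 2])
    axes[0, 2].set_title("Age vs income")

    # Correlation heatmap over the numeric columns only.
    corr = data.select_dtypes("number").corr()
    sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1,
                ax=axes[1, 0])
    axes[1, 0].set_title("Correlation heatmap")

    # Class distribution: reveals imbalance at a glance.
    sns.countplot(data=data, x="label", ax=axes[1, 1])
    axes[1, 1].set_title("Class distribution")

    axes[1, 2].axis("off")  # unused panel
    fig.tight_layout()
    return fig


fig = plot_overview(df)
```

A pair plot is left out because `sns.pairplot(df, hue="label")` builds its own figure rather than drawing on an existing axis. To keep the result, call `fig.savefig(...)` with a path of your choice.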


4. 🧠 Model Compilation & Training Project

Objective:
Build a machine learning or deep learning model, compile it with appropriate configurations, and train it on prepared data.

Steps Covered:

  • Defining a model (ML or DL)
  • Choosing loss function, optimizer, metrics
  • Model training with validation
  • Accuracy and loss plots

Tools: scikit-learn, keras / tensorflow, matplotlib
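
The steps above might look like the following with scikit-learn's `MLPClassifier`, which is used here as a lightweight stand-in for a Keras model. In Keras the optimizer, loss, and metrics would be passed to `model.compile`; with `MLPClassifier` the optimizer is chosen through `solver` and the classification loss is fixed to log-loss. The synthetic dataset and all hyperparameter values are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic binary classification data (illustrative only).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Define the model and its training configuration.
model = MLPClassifier(
    hidden_layer_sizes=(32, 16),  # two hidden layers
    solver="adam",                # optimizer
    learning_rate_init=1e-3,
    early_stopping=True,          # hold out part of the training data
    validation_fraction=0.2,      # size of that validation split
    n_iter_no_change=10,
    max_iter=200,
    random_state=0,
)

# Train with validation; scores are recorded once per epoch.
model.fit(X_train, y_train)

# Plot training loss and validation accuracy per epoch.
fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
ax_loss.plot(model.loss_curve_)
ax_loss.set_xlabel("epoch")
ax_loss.set_ylabel("training loss")
ax_acc.plot(model.validation_scores_)
ax_acc.set_xlabel("epoch")
ax_acc.set_ylabel("validation accuracy")
fig.tight_layout()

test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

A validation curve that flattens or falls while the training loss keeps dropping is the usual sign of overfitting, which is what `early_stopping` guards against here.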


5. 🔍 Evaluation & Testing

Objective:
Evaluate model performance using appropriate metrics and visualize the results.

Evaluation Metrics:

  • Accuracy, precision, recall, F1-score
  • Confusion matrix
  • ROC-AUC (for classification)
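
A small worked example of these metrics using `sklearn.metrics` is sketched below. The labels and scores are made up, and the 0.5 decision threshold is an assumption; in practice they would come from a trained model's predictions on the test set.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Hypothetical ground truth and predicted probabilities for class 1.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1, 0.55, 0.45])

# Threshold-based metrics need hard predictions.
y_pred = (y_score >= 0.5).astype(int)

accuracy = accuracy_score(y_true, y_pred)    # 8 of 10 correct -> 0.8
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 4 / 5
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 4 / 5
f1 = f1_score(y_true, y_pred)                # harmonic mean -> 0.8

# Rows are true classes, columns are predicted: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)

# ROC-AUC uses the raw scores, not the thresholded predictions.
auc = roc_auc_score(y_true, y_score)         # 22 of 25 pairs ranked -> 0.88

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f} auc={auc:.2f}")
print(cm)
```

Accuracy, precision, recall, and F1 all depend on the chosen threshold, whereas ROC-AUC summarises ranking quality across every threshold.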
