AvinashBisram/Data-Cleaning

Data Cleaning Projects

This repository contains Python and SQL projects relating to data cleaning and feature engineering.

Language(s): Python, SQL
Package(s): (Python) Pandas, Seaborn
Software: Jupyter Notebooks

Current Projects:

  • Pandas 2021 Nutrition Data Cleaning: Cleans a 2021 nutrition dataset and engineers 11 new columns with Python's pandas library to prepare the data for later visualization. The cleaning process covers exploratory data analysis, column removal, duplicate handling, data type recasting, and filling missing values. The feature engineering process creates columns that meet predefined goals using grouping, aggregation, merging, and more.
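
As a rough illustration of the kind of steps listed above, here is a minimal pandas sketch. It is not taken from the project's notebook: the sample data, the column names (`restaurant`, `item`, `calories`, `notes`), the helper names (`clean`, `add_restaurant_mean`), and the choice of a per-group median for filling gaps are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; these columns are illustrative only and are
# not the schema of the actual 2021 nutrition dataset.
raw = pd.DataFrame({
    "restaurant": ["A", "A", "A", "B", "B"],
    "item": ["Burger", "Burger", "Salad", "Wrap", "Fries"],
    "calories": ["550", "550", "320", None, "410"],
    "notes": ["", "", "", "", ""],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps: drop a column, dedupe, recast, fill."""
    out = df.drop(columns=["notes"])                    # remove an unneeded column
    out = out.drop_duplicates().reset_index(drop=True)  # handle duplicate rows
    # Recast the text column to numeric; unparseable values become NaN.
    out["calories"] = pd.to_numeric(out["calories"], errors="coerce")
    # Fill missing values with the median of the same restaurant
    # (an assumed strategy, chosen only for this sketch).
    group_median = out.groupby("restaurant")["calories"].transform("median")
    out["calories"] = out["calories"].fillna(group_median)
    return out


def add_restaurant_mean(df: pd.DataFrame) -> pd.DataFrame:
    """Engineer one feature via grouping, aggregation, and merging."""
    means = df.groupby("restaurant", as_index=False).agg(
        restaurant_mean_calories=("calories", "mean")
    )
    return df.merge(means, on="restaurant", how="left")


cleaned = clean(raw)
featured = add_restaurant_mean(cleaned)
print(featured)
```

Computing the aggregate in a separate frame and merging it back keeps the grouped summary available on its own, for example to export it as a separate CSV file.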

Projects Coming Soon:

  • SQL 2021 Nutrition Data Cleaning: Will use MySQL queries to accomplish the same data cleaning and feature engineering process followed in Pandas 2021 Nutrition Data Cleaning, producing the same final CSV files of transformed data.
