In this project I focused on cleaning the raw data in the Nashville Housing Excel file. I show, step by step, my approach to making sure the data is properly cleaned and ready to be used in a market analysis.
I used Microsoft SQL Server because the data was too large to clean comfortably in Excel. It is not impossible in Excel, but it is not the best option: the bigger the file, the longer Excel takes to apply filters, drop duplicates, and so on. Excel also caps a worksheet at 1,048,576 rows, and depending on the computer's performance it can take a good amount of time to display the data as you scroll through it. For tasks like cleaning or analyzing large amounts of data, SQL is a better option.