This project involves analyzing the Nashville housing market using SQL. The dataset, sourced from Kaggle, contains various attributes related to properties sold in Nashville.
The primary objective of this project is to clean and transform raw housing data to extract meaningful insights into Nashville's real estate trends. The process includes:
- Data Extraction: Importing the raw dataset into a SQL environment.
- Data Cleaning and Transformation: Addressing inconsistencies, handling missing values, and standardizing data formats.
- Data Loading: Storing the cleaned data in a structured SQL database for analysis.
The data cleaning process involves several key steps:
- Standardizing Date Formats: Converting sale dates into a consistent format.
- Handling Missing Values: Filling in missing property addresses by cross-referencing ParcelID.
- Splitting Address Components: Separating full addresses into distinct columns for street address, city, and state.
- Normalizing Categorical Data: Ensuring consistency in categorical fields, such as converting 'Y'/'N' to 'Yes'/'No' in the SoldAsVacant column.
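
The four cleaning steps above could be sketched in T-SQL roughly as follows. This is an illustration only, not the contents of Data_Cleaning.sql. The temporary table #NashvilleHousing, the UniqueID key, the derived columns, and every sample row are invented for the example, and the column names PropertyAddress and SaleDate are assumed. Only ParcelID and SoldAsVacant are named in the description above. The syntax assumes Microsoft SQL Server.

```sql
-- Hypothetical sample table. On a real table the derived columns would be
-- added with ALTER TABLE ... ADD instead of being declared up front.
CREATE TABLE #NashvilleHousing (
    UniqueID             INT          NOT NULL,  -- assumed row identifier
    ParcelID             VARCHAR(50)  NOT NULL,
    PropertyAddress      VARCHAR(255) NULL,      -- assumed column name
    SaleDate             DATETIME     NOT NULL,  -- assumed column name
    SoldAsVacant         VARCHAR(10)  NOT NULL,
    SaleDateConverted    DATE         NULL,      -- derived (hypothetical)
    PropertySplitAddress VARCHAR(255) NULL,      -- derived (hypothetical)
    PropertySplitCity    VARCHAR(255) NULL,      -- derived (hypothetical)
    PropertySplitState   VARCHAR(255) NULL       -- derived (hypothetical)
);

-- Made-up rows: row 2 has no address but shares a ParcelID with row 1.
INSERT INTO #NashvilleHousing
    (UniqueID, ParcelID, PropertyAddress, SaleDate, SoldAsVacant)
VALUES
    (1, 'P-001', '100 EXAMPLE ST, NASHVILLE, TN', '2015-03-01 00:00:00', 'N'),
    (2, 'P-001', NULL,                            '2016-07-15 00:00:00', 'Y'),
    (3, 'P-002', '200 SAMPLE AVE, NASHVILLE, TN', '2016-09-30 00:00:00', 'No');

-- 1. Standardize dates: keep only the date part of the timestamp.
UPDATE #NashvilleHousing
SET SaleDateConverted = CONVERT(DATE, SaleDate);

-- 2. Fill missing addresses from another row with the same ParcelID.
UPDATE a
SET PropertyAddress = b.PropertyAddress
FROM #NashvilleHousing AS a
JOIN #NashvilleHousing AS b
    ON  a.ParcelID = b.ParcelID
    AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL
  AND b.PropertyAddress IS NOT NULL;

-- 3. Split 'street, city, state' into three columns.
--    PARSENAME splits on periods and numbers the parts from the right,
--    so commas are swapped for periods first. It returns NULL when the
--    address itself contains a period or has more than four parts.
UPDATE #NashvilleHousing
SET PropertySplitAddress = LTRIM(PARSENAME(REPLACE(PropertyAddress, ',', '.'), 3)),
    PropertySplitCity    = LTRIM(PARSENAME(REPLACE(PropertyAddress, ',', '.'), 2)),
    PropertySplitState   = LTRIM(PARSENAME(REPLACE(PropertyAddress, ',', '.'), 1));

-- 4. Normalize the flag: 'Y'/'N' become 'Yes'/'No'; other values are kept.
UPDATE #NashvilleHousing
SET SoldAsVacant = CASE SoldAsVacant
                       WHEN 'Y' THEN 'Yes'
                       WHEN 'N' THEN 'No'
                       ELSE SoldAsVacant
                   END;

-- Inspect the cleaned rows.
SELECT * FROM #NashvilleHousing ORDER BY UniqueID;
```

Writing the converted date and the address parts to new columns, rather than overwriting the originals, leaves the raw values available for checking the result before anything is dropped.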
For detailed SQL queries and procedures used in the data cleaning process, refer to the Data_Cleaning.sql file in this repository.
- Nashville Housing Data (RAW DATA).xlsx: The original dataset containing raw housing data.
- Data_Cleaning.sql: SQL script detailing the data cleaning and transformation steps applied to the dataset.
To replicate or build upon this analysis:
- Set Up Your Environment: Ensure you have a SQL database management system installed (e.g., Microsoft SQL Server).
- Import the Dataset: Load the Nashville Housing Data (RAW DATA).xlsx file into your SQL environment.
- Execute Data Cleaning Scripts: Run the SQL commands provided in Data_Cleaning.sql to clean and transform the data.
- Analyze the Data: Perform queries to extract insights, such as trends in property sales, average prices, and distribution of property types.
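
As a starting point for the analysis step, queries along these lines could summarize sales by year and by property type. This is a minimal sketch under assumptions: the temporary table #HousingSample, the SalePrice and LandUse column names, and all sample rows are made up for illustration, and the syntax again assumes Microsoft SQL Server.

```sql
-- Hypothetical cleaned sample; replace with the real cleaned table.
CREATE TABLE #HousingSample (
    UniqueID  INT           NOT NULL,
    SaleDate  DATE          NOT NULL,
    SalePrice DECIMAL(12,2) NOT NULL,  -- assumed column name
    LandUse   VARCHAR(50)   NOT NULL   -- assumed property-type column
);

-- Made-up rows for illustration only.
INSERT INTO #HousingSample (UniqueID, SaleDate, SalePrice, LandUse)
VALUES
    (1, '2015-03-01', 200000.00, 'SINGLE FAMILY'),
    (2, '2015-08-20', 300000.00, 'SINGLE FAMILY'),
    (3, '2016-07-15', 150000.00, 'VACANT LAND'),
    (4, '2016-09-30', 250000.00, 'DUPLEX');

-- Sales volume and average price per year.
-- SalePrice is DECIMAL here; on an INT column AVG would truncate.
SELECT YEAR(SaleDate)  AS SaleYear,
       COUNT(*)        AS Sales,
       AVG(SalePrice)  AS AvgSalePrice
FROM #HousingSample
GROUP BY YEAR(SaleDate)
ORDER BY SaleYear;

-- Distribution of property types, most common first.
SELECT LandUse,
       COUNT(*) AS Properties
FROM #HousingSample
GROUP BY LandUse
ORDER BY Properties DESC;
```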
- Data Source: Nashville Housing Data on Kaggle
- Inspiration: This project was inspired by various data cleaning and analysis examples, including Nashville Housing Data Analysis by Esther Abel.
For any questions or further information, please refer to the issues section of this repository.