# ๐๏ธ SQL Retail Analysis - P1
## ๐ Project Overview
This project performs an **exploratory and analytical study of retail sales data** using **PostgreSQL**.
It involves creating a retail sales table, cleaning the data, and running a series of SQL queries to uncover meaningful business insights such as top customers, best-selling months, and sales patterns across categories and time shifts.
---
## ๐งฑ Table Schema
**Table Name:** `retail_sales`
```sql
CREATE TABLE retail_sales(
transactions_id INT PRIMARY KEY,
sale_date DATE,
sale_time TIME,
customer_id INT,
gender VARCHAR(10),
age INT,
category VARCHAR(15),
quantiy INT,
price_per_unit FLOAT,
cogs FLOAT,
total_sale FLOAT
);
๐ Data Cleaning
โ Checking for NULL values
```sql
SELECT * FROM retail_sales
WHERE
transactions_id IS NULL OR
sale_date IS NULL OR
sale_time IS NULL OR
customer_id IS NULL OR
gender IS NULL OR
age IS NULL OR
category IS NULL OR
quantiy IS NULL OR
price_per_unit IS NULL OR
cogs IS NULL OR
total_sale IS NULL;
๐งน Deleting NULL records
```sql
DELETE FROM retail_sales
WHERE
transactions_id IS NULL OR
sale_date IS NULL OR
sale_time IS NULL OR
customer_id IS NULL OR
gender IS NULL OR
age IS NULL OR
category IS NULL OR
quantiy IS NULL OR
price_per_unit IS NULL OR
cogs IS NULL OR
total_sale IS NULL;
๐ Data Exploration
1๏ธโฃ Total number of sales
```sql
SELECT COUNT(*) AS total_sale FROM retail_sales;
2๏ธโฃ Unique product categories
```sql
SELECT DISTINCT category FROM retail_sales;
๐ Data Analysis & Findings
Q1. Retrieve all sales made on 2022-11-05
```sql
SELECT * FROM retail_sales WHERE sale_date='2022-11-05';
Q2. Find transactions where category is 'Clothing', quantity < 10, and sale month is Nov 2022
```sql
SELECT * FROM retail_sales
WHERE category='Clothing'
AND quantiy<10
AND TO_CHAR(sale_date,'YYYY-MM')='2022-11';
Q3. Total sales by category
```sql
SELECT category, SUM(total_sale) AS total_sales
FROM retail_sales
GROUP BY category;
Q4. Average age of customers from Beauty category
```sql
SELECT AVG(age) AS average_age_of_customers_from_beauty_category
FROM retail_sales
WHERE category='Beauty';
Q5. Transactions where total sale > 1000
```sql
SELECT * FROM retail_sales WHERE total_sale > 1000;
Q6. Total number of transactions by gender and category
```sql
SELECT category, gender, COUNT(transactions_id)
FROM retail_sales
GROUP BY category, gender;
Q7. Find best-selling month (highest average sales) in each year
```sql
SELECT * FROM (
SELECT
EXTRACT(YEAR FROM sale_date) AS yer,
EXTRACT(MONTH FROM sale_date) AS mnth,
AVG(total_sale),
RANK() OVER(PARTITION BY EXTRACT(YEAR FROM sale_date) ORDER BY AVG(total_sale) DESC) AS rnk
FROM retail_sales
GROUP BY yer, mnth
) AS t
WHERE rnk=1;
Q8. Top 5 customers by highest total sales
```sql
SELECT customer_id, SUM(total_sale)
FROM retail_sales
GROUP BY customer_id
ORDER BY SUM(total_sale) DESC
LIMIT 5;
Q9. Unique customers per category
```sql
SELECT category, COUNT(DISTINCT(transactions_id))
FROM retail_sales
GROUP BY category;
Q10. Sales by time shift (Morning / Afternoon / Evening)
```sql
WITH new AS (
SELECT *,
CASE
WHEN EXTRACT(HOUR FROM sale_time) < 12 THEN 'Morning'
WHEN EXTRACT(HOUR FROM sale_time) BETWEEN 12 AND 17 THEN 'Afternoon'
ELSE 'Evening'
END AS shift
FROM retail_sales
)
SELECT shift, COUNT(*)
FROM new
GROUP BY shift;
๐ง Insights & Observations
- The Clothing category contributes significant sales volume but lower average sale per transaction.
- The Beauty category attracts younger customers on average.
- Evening shifts generate more transactions compared to other time slots.
- The top 5 customers account for a major portion of total revenue.
- Certain months outperform others consistently, indicating seasonal sales trends.
๐งฉ Tech Stack
- Database: PostgreSQL
- Tool: pgAdmin 4
- Language: SQL
๐ How to Run
- Open pgAdmin 4 or your preferred SQL client.
- Create the retail_sales table using the provided schema.
- Import or insert data into the table.
- Run the SQL queries step by step for analysis.