SQL Challenge - EmployeeSQL

Module 9
EdX(2U) & UT Data Analytics and Visualization Bootcamp
Cohort UTA-VIRT-DATA-PT-11-2024-U-LOLC
By Neel Kumar Agarwal

Introduction

It’s been two weeks since I was hired as a new data engineer at Pewlett Hackard (a fictional company).
My first major task is to do a research project about people whom the company employed during the 1980s
and 1990s. All that remains of the employee database from that period are six CSV files.

For this project, I have to design the tables to hold the data from the CSV files, import the CSV files
into a SQL database, and then answer questions about the data. That is, I'll perform data modeling, data
engineering, and data analysis, respectively.

Challenge Overview

This challenge is interesting in that the most difficult part was the overall Schema creation and data
modeling. While the querying was interesting, it didn't challenge my thinking the same was the schema
creation did, and likely because by doing that work I saved myself a lot of trouble when querying.

Through the completion of this challenge, I learned how strongly good data modeling depends on context.
Understanding how the data was decided to be inputted, whether more data will ever come in, and how
the relationships were envision when the data was originally written. Deciding whether those factors
are important and should be retained, whether they are erroneous/irrelevant, and what can be changed
to enhance database efficiency is what was truly challenging for me.

⬆️ Return to TOC

Entity Relationship Diagram (ERD)

sql-challenge_db DataBase Relationships

⬆️ Return to TOC

Summary Breakdowns:

- Repository Creation and Directory Setup  
- CSV/Pandas DataFrame Analysis of Data  
- Entity Relationship Diagram  
   - Conceptual Relationships  
   - Logical Relationships  
   - Physical Relationships  
- Schema Setup/Initialization  
- CSV File Imports (6) Into New Tables  
- SQL Queries

⬆️ Return to TOC

Setup and Usage

Prerequisites & Instructions

Written in | SQL:2023 (ISO/IEC 9075:2023)
DBMS used | PostgreSQL 17.2

Install PostgreSQL if not already done (ensure version is compatible with 17.2)
Clone this repository
Create DataBase sql-challenge_db if it doesn't exist
Connection to DataBase should be automatic, create a new query tool window or session
Run the schema.sql file in a query window to setup/restart tables
Import CSV Files from sql-challenge_db/EmployeeSQL/data directory into the corresponding tables
Open the queries.sql file in a query window and run block-by-block (be sure to start with the SET datestyle = 'Postgres, MDY'; line)

⬆️ Return to TOC

Directory Structure

sql-challenge/  
|  
|—- EmployeeSQL/  
|   |—- data/  
|   |   —- departments.csv  
|   |   -- dept_emp.csv  
|   |   -- dept_manager.csv  
|   |   -- employees.csv  
|   |   -- salaries.csv  
|   |   -- titles.csv  
|   |  
|   |—- queries.sql  
|   |-- schema.sql

This structure ensures all inputs are organized within their respective folders.

⬆️ Return to TOC

Schema

Main tables are: departments, titles, and employees

departments
- dept_no CHAR(4)
- dept_name VARCHAR(18)
titles
- title_id CHAR(5)
- title VARCHAR(18)
employees
- emp_no INT
- emp_title_id CHAR(5)
- birth_date DATE
- first_name VARCHAR(14)
- last_name VARCHAR(16)
- sex CHAR(1)
- hire_date DATE
dept_emp
- emp_no INT
- dept_no CHAR(4)
dept_manager
- dept_no CHAR(4)
- emp_no INT
salaries
- emp_no INT
- salary INT

⬆️ Return to TOC

Queries

Query 1

-- List the employee number, last name, first name, sex, and salary of each employee
SELECT
   e.emp_no
   ,e.last_name
   ,e.first_name
   ,e.sex
   ,s.salary
FROM
   employees e
INNER JOIN
   salaries s
ON
   e.emp_no = s.emp_no
;

Query 2

-- List the first name, last name, and hire date for the employees who were hired in 1986.
SELECT
   first_name
   ,last_name
   ,hire_date
FROM
   employees
WHERE
   EXTRACT(YEAR FROM hire_date) = 1986
;

Query 3

-- List the manager of each department along with their department number, department name, 
-- employee number, last name, and first name.
SELECT
   m.dept_no
   ,d.dept_name
   ,m.emp_no
   ,e.last_name
   ,e.first_name
FROM
   dept_manager m
INNER JOIN
   departments d
ON
   m.dept_no = d.dept_no
INNER JOIN
   employees e
ON
   m.emp_no = e.emp_no
;

Query 4

-- List the department number for each employee along with that employee’s employee number, 
-- last name, first name, and department name.
SELECT
   de.dept_no
   ,de.emp_no
   ,e.last_name
   ,e.first_name
   ,d.dept_name
FROM
   dept_emp de
INNER JOIN
   employees e
ON
   de.emp_no = e.emp_no
INNER JOIN
   departments d
ON
   de.dept_no = d.dept_no
;

Query 5

-- List first name, last name, and sex of each employee whose first name is Hercules and whose 
-- last name begins with the letter B.
SELECT
   first_name
   ,last_name
   ,sex
FROM
   employees
WHERE
   first_name = 'Hercules' AND
   last_name LIKE 'B%'
;

Query 6

-- List each employee in the Sales department, including their employee number, last name, and 
-- first name.
SELECT
   d.emp_no
   ,e.last_name
   ,e.first_name
FROM
   dept_emp d
INNER JOIN
   employees e
ON
   d.emp_no = e.emp_no
WHERE 
   d.dept_no = (
   SELECT dept_no
   FROM departments
   WHERE dept_name = 'Sales')
ORDER BY d.emp_no ASC
;

Query 7

-- List each employee in the Sales and Development departments, including their employee number, 
-- last name, first name, and department name.
SELECT
   de.emp_no
   ,e.last_name
   ,e.first_name
   ,d.dept_name
FROM
   dept_emp de
INNER JOIN
   employees e
ON
   de.emp_no = e.emp_no
INNER JOIN
   departments d
ON
   de.dept_no = d.dept_no
WHERE
   d.dept_name = 'Sales' OR
   d.dept_name = 'Development'
;

Query 8

-- List the frequency counts, in descending order, of all the employee last names 
-- (that is, how many employees share each last name).
SELECT
   last_name
   ,COUNT(last_name) frequency
FROM
   employees
GROUP BY last_name
ORDER BY frequency DESC
;

⬆️ Return to TOC

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
EmployeeSQL		EmployeeSQL
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SQL Challenge - EmployeeSQL

Table of Contents

Introduction

Challenge Overview

Entity Relationship Diagram (ERD)

sql-challenge_db DataBase Relationships

Summary Breakdowns:

Setup and Usage

Prerequisites & Instructions

Directory Structure

Schema

Queries

Query 1

Query 2

Query 3

Query 4

Query 5

Query 6

Query 7

Query 8

About

Uh oh!

Releases

Packages

Neelka96/sql-challenge

Folders and files

Latest commit

History

Repository files navigation

SQL Challenge - EmployeeSQL

Table of Contents

Introduction

Challenge Overview

Entity Relationship Diagram (ERD)

sql-challenge_db DataBase Relationships

Summary Breakdowns:

Setup and Usage

Prerequisites & Instructions

Directory Structure

Schema

Queries

Query 1

Query 2

Query 3

Query 4

Query 5

Query 6

Query 7

Query 8

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages