Welcome to the Tickit Data Lake project! This project demonstrates the construction of a scalable, robust data pipeline, leveraging Apache Airflow for orchestration and automation. It provides a practical example of a modern pipeline that handles the extraction, loading, and transformation (ELT) of batch data, designed to support the analytical needs of a business.
**Automated Orchestration:** Airflow serves as the core orchestration engine, responsible for scheduling, monitoring, and managing the entire data pipeline. The workflow is defined as a Directed Acyclic Graph (DAG), so dependencies between tasks are handled correctly, while Airflow's task retries, logging, and alerting keep the pipeline reliable.
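As an illustration, here is a minimal sketch of what such a DAG could look like, assuming Airflow 2.4+ and the TaskFlow API; the DAG id, task names, schedule, and placeholder paths are hypothetical and not taken from this repository.

```python
# Hypothetical sketch of an ELT DAG; names and schedule are illustrative only.
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@dag(
    schedule="@daily",                          # run one batch per day
    start_date=datetime(2024, 1, 1),
    catchup=False,                              # skip backfilling past runs
    default_args={
        "retries": 2,                           # retry a failed task twice
        "retry_delay": timedelta(minutes=5),    # wait 5 minutes between retries
    },
)
def tickit_elt():
    @task
    def extract() -> str:
        # Pull the day's batch from the source systems (placeholder).
        return "s3://tickit/raw/latest"

    @task
    def load(raw_path: str) -> str:
        # Stage the raw batch in the warehouse (placeholder).
        return "staging.sales"

    @task
    def transform(staged_table: str) -> None:
        # Apply SQL transformations to the staged data (placeholder).
        pass

    # Extract -> Load -> Transform; Airflow infers dependencies from the calls.
    transform(load(extract()))


tickit_elt()
```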
**Integration of Multiple Data Sources:** The project integrates with a variety of data sources (a sample extraction sketch follows this list):
- On-premises SQL and NoSQL databases
- Cloud-hosted SQL and NoSQL databases
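For the SQL sources, extraction might use one of Airflow's database hooks so that credentials stay in the Airflow connection store rather than in code. The sketch below assumes the apache-airflow-providers-postgres package; the connection id, table, and query are hypothetical.

```python
# Hypothetical extraction helper; the connection id and query are placeholders.
from airflow.providers.postgres.hooks.postgres import PostgresHook


def extract_sales_records() -> list[tuple]:
    # The hook looks up host/user/password from the "tickit_postgres"
    # connection configured in Airflow, so no secrets live in the DAG code.
    hook = PostgresHook(postgres_conn_id="tickit_postgres")
    return hook.get_records(
        "SELECT * FROM sales WHERE saletime::date = CURRENT_DATE"
    )
```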
This project serves as a valuable example of building modern data pipelines with Airflow, showcasing best practices for data ingestion, processing, and transformation. It provides a solid foundation for a robust data platform that can support a wide range of analytical needs.
Clone the repository, then build, run, and clean up the test container:

```bash
# Clone the repository
git clone https://github.com/jibbs1703/Tickit-Data-Pipeline
cd Tickit-Data-Pipeline

# Build Tickit Test Container
docker build -t test-tickit .

# Run Tickit Test Container
docker run -it --name tickit-test-container -v .:/app test-tickit

# Cleanup Tickit Test Container
docker stop tickit-test-container
docker rm tickit-test-container

# Cleanup all containers
docker rm $(docker ps -a -q)
```