- Module 1: Docker, SQL, Terraform
  - Introduction to GCP
  - Docker and docker-compose
  - Running Postgres locally with Docker
  - Setting up infrastructure on GCP with Terraform
- Module 2: Orchestration with Mage AI (Airflow alternative - https://www.mage.ai)
  - Data Lake
  - Workflow orchestration
  - Workflow orchestration with Mage AI
- Workshop 1: Data Ingestion with dlt (https://dlthub.com)
  - Reading from APIs
  - Building scalable pipelines
  - Normalising data
  - Incremental loading
- Module 3: Data Warehouses with BigQuery
  - Data Warehouse
  - BigQuery
  - Partitioning and clustering
  - BigQuery best practices
  - BigQuery Machine Learning
- Module 4: Analytics Engineering with dbt (data build tool - https://getdbt.com)
  - Basics of analytics engineering
  - BigQuery and dbt | Postgres and dbt
  - dbt models
  - Testing and documenting
  - Deployment to the cloud and locally
  - Visualizing the data with Google Data Studio and Metabase
- Module 5: Batch processing with PySpark
  - Spark: DataFrames, SQL
  - Internals: GroupBy and joins
- Workshop 2: Stream processing (RisingWave - https://risingwave.com & Redpanda - https://redpanda.com)
zukui1984/data-engineer-zoomcamp_2024
