- Fundamentals of Data Engineering - Reis & Housley - 2022 - A comprehensive book on the data engineering lifecycle: pipelines, storage, and serving.
- Streaming Systems - Akidau, Chernyak, Lax - 2018 - An in-depth treatment of stream processing: windows, watermarks, and exactly-once semantics.
- Data Pipelines with Apache Airflow - Harenslak & de Ruiter - 2021 - Practical Airflow for orchestrating real-world data workflows.
- The Data Warehouse Toolkit, 3rd Ed. - Kimball & Ross - 2013 - The foundational guide to dimensional modeling, the bedrock of analytical data design.
- Spark: The Definitive Guide - Chambers & Zaharia - 2018 - Comprehensive guide to Apache Spark for large-scale data processing.