I am a data professional transitioning from Data Analysis to Data Engineering. My background in extracting insights gives me a unique perspective: I build pipelines not just to move data, but to ensure it arrives clean, reliable, and ready for business impact.
Currently, I am focused on mastering the Modern Data Stack (MDS), implementing Data Lakehouses, and adhering to software engineering best practices like CI/CD and containerization.
🔭 Current Focus: Architecting a scalable ETL pipeline for Brazilian corporate registry data (CNPJ), processing terabytes of public records with Spark, Docker, and MinIO.
I have publicly committed to coding every single day to accelerate my transition to Data Engineering.
| Current Project | Stack | Status |
|---|---|---|
| CNPJ Analytics Pipeline | Python, Docker, MinIO, PostgreSQL, dbt | 🚧 In Progress (Building Architecture) |
Check my daily progress and code commits in my repositories!
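
As a taste of the ingestion layer, here is a minimal sketch of loading the raw CNPJ files from MinIO with PySpark. The endpoint, credentials, bucket names, and file layout are illustrative assumptions, and the S3A connector (hadoop-aws) must be available on the Spark classpath.

```python
from pyspark.sql import SparkSession

# Endpoint, credentials, and bucket names below are placeholders for illustration.
spark = (
    SparkSession.builder
    .appName("cnpj-ingestion")
    .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")
    .config("spark.hadoop.fs.s3a.access.key", "minio")        # prefer env vars / secrets
    .config("spark.hadoop.fs.s3a.secret.key", "minio123")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")  # required for MinIO
    .getOrCreate()
)

# The public CNPJ dump is distributed as ';'-delimited, Latin-1 encoded CSVs without headers.
raw = (
    spark.read
    .option("sep", ";")
    .option("encoding", "ISO-8859-1")
    .option("header", "false")
    .csv("s3a://raw/cnpj/empresas/*.csv")  # hypothetical raw-zone path
)

# Land the data in the bronze layer as Parquet, ready for downstream dbt models.
raw.write.mode("overwrite").parquet("s3a://bronze/cnpj/empresas/")
```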
I organize my stack by function to show how each tool fits into the overall architecture.
Automation as a Baseline: Repetition is a signal. If I do it twice, I script it. If I do it a third time, I automate it end-to-end and turn it into a maintainable pipeline.
Data Quality by Design: Data quality is not optional. I enforce validation at ingestion using tools like Pydantic and Great Expectations so that every downstream system receives clean, trustworthy data (see the validation sketch below).
Documentation with Purpose: Code explains how things work; documentation explains why. I prioritize clear, human-friendly READMEs that accelerate onboarding and decision-making.
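
To make the data-quality principle concrete, here is a minimal Pydantic sketch of row-level validation at ingestion. The model fields mirror a few columns from the CNPJ "empresas" file, but the names and rules are illustrative assumptions rather than the project's actual schema.

```python
from pydantic import BaseModel, ValidationError, field_validator


class CompanyRecord(BaseModel):
    """Illustrative schema for one row of the CNPJ 'empresas' file."""
    cnpj_basico: str       # 8-digit root of the CNPJ identifier
    razao_social: str      # legal company name
    capital_social: float  # declared share capital

    @field_validator("cnpj_basico")
    @classmethod
    def cnpj_must_be_eight_digits(cls, value: str) -> str:
        if not (value.isdigit() and len(value) == 8):
            raise ValueError("cnpj_basico must be exactly 8 digits")
        return value


def validate_rows(rows: list[dict]) -> tuple[list[CompanyRecord], list[dict]]:
    """Split raw rows into validated records and quarantined rejects."""
    valid, rejected = [], []
    for row in rows:
        try:
            valid.append(CompanyRecord(**row))
        except ValidationError as err:
            rejected.append({"row": row, "errors": err.errors()})
    return valid, rejected
```

Rejected rows go to a quarantine area instead of silently disappearing, so data-quality issues stay visible and auditable.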

