🧭
Looking for opportunities
Gerardo1909/README.md

Hi there, I'm Gerardo Toboso 👋

Data & Analytics Specialist | Data Engineering & Science

LinkedIn Email Portfolio



I bridge the gap between raw data infrastructure and strategic business insights. Currently completing my BSc in Data Science (Top 10% - GPA 9.0) while building automated, scalable data solutions.


🚀 About Me

I define myself as a Data & Analytics Specialist because I don't just move data; I make it useful. My approach combines the rigor of Data Engineering (robust pipelines, data quality, CI/CD) with the exploratory nature of Data Science.

  • 🔭 Focus: Designing automated ETL/ELT pipelines, Data Warehousing, and decision-ready dashboards.
  • 💼 Experience: In my latest project, diagnosed a 23.2% server error rate across 1,000 requests and identified 9 critical endpoints with latency above 3,700 ms.
  • 🌱 Learning: Deepening my knowledge of Apache Airflow, GCP, and DuckDB for modern data stacks.

🛠️ Tech Stack & Tools

Focusing on modern Data Engineering and Analytics architecture.

| Domain | Tools |
| --- | --- |
| Languages | Python, SQL |
| Engineering & Cloud | Docker, GCP, Airflow, DuckDB |
| DevOps & CI/CD | Git, GitHub Actions |
| Analytics & BI | Pandas, PowerBI |

🏆 Featured Projects

Turning raw access logs into actionable infrastructure insights.

  • The Challenge: Production API showing degraded performance with no visibility into root causes.
  • The Solution: Built a SQL-based diagnostic pipeline with DuckDB + interactive Looker Studio dashboard.
  • Impact: 📉 Identified 9 of 11 endpoints with >20% error rate, pinpointing 3 critical services causing 35.78% of all 5xx errors.
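
As a rough illustration of the kind of query behind a per-endpoint error-rate diagnosis, here is a minimal sketch. It is not the project's actual code: it uses Python's built-in `sqlite3` as a stand-in for DuckDB so it runs without extra dependencies, and the table name `access_logs`, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (endpoint, HTTP status code). Not the project's data.
SAMPLE_LOGS = [
    ("/api/orders", 200),
    ("/api/orders", 500),
    ("/api/orders", 503),
    ("/api/orders", 200),
    ("/api/users", 200),
    ("/api/users", 200),
    ("/api/users", 404),
    ("/api/users", 200),
    ("/api/users", 500),
]

# Share of 5xx responses per endpoint, keeping only endpoints above a threshold.
QUERY = """
SELECT endpoint,
       COUNT(*) AS requests,
       SUM(CASE WHEN status >= 500 THEN 1 ELSE 0 END) AS server_errors,
       ROUND(100.0 * SUM(CASE WHEN status >= 500 THEN 1 ELSE 0 END) / COUNT(*), 2)
           AS error_rate_pct
FROM access_logs
GROUP BY endpoint
HAVING 100.0 * SUM(CASE WHEN status >= 500 THEN 1 ELSE 0 END) / COUNT(*) > ?
ORDER BY error_rate_pct DESC
"""


def endpoints_above_threshold(rows, threshold_pct=20.0):
    """Return (endpoint, requests, server_errors, error_rate_pct) tuples
    for endpoints whose 5xx rate is strictly above threshold_pct."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE access_logs (endpoint TEXT, status INTEGER)")
        conn.executemany("INSERT INTO access_logs VALUES (?, ?)", rows)
        return conn.execute(QUERY, (threshold_pct,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in endpoints_above_threshold(SAMPLE_LOGS):
        print(row)
```

The aggregation is plain SQL (`CASE`, `GROUP BY`, `HAVING`), so a similar statement should carry over to DuckDB with little change, though that has not been verified against the project's schema.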

Solving the "stale data" problem for business stakeholders.

  • The Challenge: Sales team spent 2 hours/day manually merging CSVs, leading to errors and delays.
  • The Solution: Built an end-to-end Python ETL pipeline with Parquet optimization and Data Quality checks.
  • Impact: 📉 Reduced reporting latency by 97% (Automated & Daily).
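
To make the "merge CSVs, then apply data quality checks" idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not the project's pipeline: the schema (`order_id`, `date`, `amount`), the validation rules, and the sample inputs are hypothetical, and the Parquet output step is left out because it would need a third-party library.

```python
import csv
import io

REQUIRED_FIELDS = ("order_id", "date", "amount")  # hypothetical schema

# Hypothetical inputs standing in for the CSV files merged by hand.
NORTH = "order_id,date,amount\n1,2024-01-01,10.5\n2,2024-01-01,abc\n"
SOUTH = "order_id,date,amount\n1,2024-01-01,10.5\n3,2024-01-02,7\n4,,3\n"


def extract(csv_texts):
    """Yield rows as dicts from several CSV sources (given here as strings)."""
    for text in csv_texts:
        yield from csv.DictReader(io.StringIO(text))


def is_valid(row):
    """Quality check: required fields are non-empty and amount is a number >= 0."""
    if any(not (row.get(field) or "").strip() for field in REQUIRED_FIELDS):
        return False
    try:
        return float(row["amount"]) >= 0
    except ValueError:
        return False


def transform(rows):
    """Split rows into clean and rejected; repeated order_ids are rejected."""
    clean, rejected, seen = [], [], set()
    for row in rows:
        if not is_valid(row) or row["order_id"] in seen:
            rejected.append(row)
            continue
        seen.add(row["order_id"])
        clean.append(
            {
                "order_id": row["order_id"],
                "date": row["date"],
                "amount": float(row["amount"]),
            }
        )
    return clean, rejected


if __name__ == "__main__":
    clean_rows, rejected_rows = transform(extract([NORTH, SOUTH]))
    print(len(clean_rows), "clean rows,", len(rejected_rows), "rejected rows")
```

Keeping rejected rows in a separate list, rather than dropping them silently, is one way to make the quality checks auditable.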

Collaboration with MIT researchers to analyze economic survival in Argentina.

  • The Tech: NLP for survey processing, Geographic segmentation, and Statistical Analysis.
  • Impact: Identified trends contributing to recommendations for a potential 18% sales improvement for local businesses.

Pinned

  1. server-logs-sql-analysis Public

    Complete analysis of logs generated by a web server to detect patterns in endpoint usage and suggest areas of improvement for the development team.

    Jupyter Notebook

  2. ecommerce-reporting-etl Public

    Automated ETL (Extract, Transform, Load) pipeline designed to process and analyze e-commerce transactional data, generating business metrics critical for decision-making…

    Python

  3. my_cv Public

    Version-controlled CV built with RenderCV, with English and Spanish versions and professionally typeset PDFs.

  4. mit_lift_lab_analisis_adelift Public

    Jupyter Notebook