marvinernst1020/README.md

Hi, I'm Marvin

I'm a Data Scientist with a strong mathematical foundation and a passion for building reliable and interpretable machine learning systems.
My experience spans deep learning (VAEs, Transformers), NLP, probabilistic modeling, and large-scale data processing (Spark, PySpark, Airflow).

Currently working on:

  • Variational Autoencoder architectures for spectral time series (astrophysical data)
  • Time series forecasting models using probabilistic & deep learning approaches
  • Data automation tools in Python (Poetry, pandas, openpyxl)

Interests: Mathematical modeling • Bayesian inference • NLP • Reinforcement Learning • Scientific computing

Tech Stack: Python • TensorFlow • PyTorch • Pandas • Spark • Airflow • AWS • Docker • GitHub Actions • R • MATLAB

Connect with me: LinkedIn • Email


Feel free to check out my repositories below; many include reproducible Jupyter notebooks and pipelines.

Pinned

  1. aumic-spectral-autoencoder (Public)

    Latent-space modeling of stellar magnetic variability using VAEs on high-resolution CARMENES VIS_A spectra of AU Microscopii.

    Jupyter Notebook 1

  2. bandits-for-algorithm-selection (Public)

    Bayesian and spatial bandit algorithms for adaptive model selection under non-stationary reward dynamics, featuring latent-state inference and Gaussian Process exploration.

    Jupyter Notebook 2

  3. spark-based-data-lake (Public)

    End-to-end Spark-based data lake and analysis pipelines for managing, transforming, and visualizing large-scale datasets using PySpark and Airflow.

    Jupyter Notebook 1

  4. graph-data-modeling-neo4j (Public)

    End-to-end workflow for modeling, building, and analyzing a large-scale academic citation graph using Neo4j. Includes data extraction from DBLP, graph modeling, import automation, Cypher querying, …

    Jupyter Notebook 1

  5. contextual-bandit-benchmark (Public)

    Forked from otausendschoen/contextual-bandit-benchmark

    Contextual bandits from real (OBP) and synthetic data with LinUCB, Thompson, and ε-Greedy. We benchmark online learning and perform rigorous offline policy evaluation via IPW, DM, and Doubly Robust…

    Jupyter Notebook

  6. mwc2025-hotel-price-analysis (Public)

    An empirical analysis of how large-scale events impact hotel prices - combining web scraping, text mining, and econometric modeling (Difference-in-Differences) to estimate the effect of the 2025 Mo…

    Jupyter Notebook 1