epaunova

Pinned

  1. CONSTRAINT-DECOMPOSITION-for-Multi-Objective-RLHF (Public)

    Early-stage research exploring decomposed reward modeling for complex instruction-following in large language models.

    Python 1

  2. ai-physicist-central-llm (Public)

    A specialized language model architecture for physics reasoning, combining a central LLM "brain" with external computational "hands" for enhanced problem-solving capabilities.

    Python 1

  3. RAG-Demo (Public)

    A minimal, end-to-end example of a Retrieval-Augmented Generation (RAG) pipeline using Python, LangChain, OpenAI, and pgvector. It shows how to ingest unstructured documents, index them for semanti…

    Jupyter Notebook 1

  4. LLM-Drift-Observatory (Public)

    A hands-on framework for detecting and visualizing **behavioral drift** in Large Language Models (LLMs) across versions and providers.

    Jupyter Notebook 1

  5. RL-for-LLM-training-Variance-Stabilized-Dropout-Implementation (Public)

    RL Stabilized Dropout

    Python 1

  6. Reinforcement-learning_ML_Profiler (Public)

    This repo implements a realistic ML engineering task: a small-scale version of the tooling an ML company would build to profile model behavior during fine-tuning experiments.

    Python 1