Pinned
-
CONSTRAINT-DECOMPOSITION-for-Multi-Objective-RLHF (Public)
Early-stage research exploring decomposed reward modeling for complex instruction-following in large language models.
Python 1
-
ai-physicist-central-llm (Public)
A specialized language model architecture for physics reasoning, combining a central LLM "brain" with external computational "hands" for enhanced problem-solving capabilities.
Python 1
-
LLM-Drift-Observatory (Public)
A hands-on framework for detecting and visualizing **behavioral drift** in Large Language Models (LLMs) across versions and providers.
Jupyter Notebook 1
-
RL-for-LLM-training-Variance-Stabilized-Dropout-Implementation (Public)
RL Stabilized Dropout
Python 1
-
Reinforcement-learning_ML_Profiler (Public)
This repo implements a realistic ML engineering task. Think of it as a mini version of what you'd build at an ML company to profile model behavior during fine-tuning experiments.
Python 1