Democratizing Reinforcement Learning for LLMs
Updated Apr 4, 2026 - Python
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with block diffusion, mixed-CoT, unified RL)
[NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
✨✨ Latest Papers and Benchmarks in Reasoning with Foundation Models
A collection of ICLR papers and open-source projects from past years, including ICLR 2021, ICLR 2022, ICLR 2023, ICLR 2024, and ICLR 2025.
[ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.
RLAnything & DemyAgent: General and scalable agentic RL algorithms across terminal, GUI, SWE, and tool-call settings
An awesome repository & A comprehensive survey on interpretability of LLM attention heads.
Proof of Thought: LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON DSL)
Ling is a MoE LLM provided and open-sourced by InclusionAI.
Official code for the NeurIPS 2025 paper "Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers"
Repo for "Large Language Model Reasoning Failures"
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
A Practitioner's Guide to M(eow)ti Turn Agentic ReinfOrcement learning
Official code repo for the CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
[arXiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
🔥🔥🔥 Latest Papers and Code on Uncertainty-based RL
Task Planning and Tracking toolset for Pydantic AI agents, enabling hierarchical task management with subtasks, PostgreSQL storage for multi-tenancy, and an event system for webhooks and callbacks.
[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents