Linear95/README.md

Hi there 👋

I am currently a researcher at Alibaba Group, leading the Quark Foundation LLM RL Team. Here are some facts about me:

  • My current focus is agentic and RL training of LLMs.
  • I was previously in the RL & Agent Team at Moonshot AI (Kimi), and the Hunyuan LLM Team at Tencent AI Lab.
  • I worked on LLM Self-play, Alignment (RLHF), Text Generation, and NLP Fairness.
  • I am also interested in probabilistic and information-theoretic machine learning methods.
  • I received my Ph.D. degree from Duke University in 2021, advised by Dr. Lawrence Carin.
  • I graduated from the Department of Mathematical Sciences at Tsinghua University in 2017, advised by Dr. Jiwen Lu.

Pinned

  1. CLUB (Public)

    Code for ICML2020 paper - CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information

    Jupyter Notebook · 351 stars · 42 forks

  2. SPAG (Public)

    Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024

    Python · 142 stars · 24 forks

  3. Alibaba-Quark/SSP (Public)

    Search Self-Play: Pushing the Frontier of Agent Capability without Supervision

    Python · 65 stars · 4 forks

  4. APO (Public)

    Code for ACL2024 paper - Adversarial Preference Optimization (APO).

    Python · 56 stars · 3 forks

  5. DSP (Public)

    Domain-specific preference (DSP) data and customized RM fine-tuning.

    Python · 25 stars · 3 forks

  6. bert-intent-slot-detector (Public)

    BERT-based intent and slots detector for chatbots.

    Python · 225 stars · 30 forks