Pengyu Cheng Linear95

I am currently a researcher at Alibaba Group, leading the Quark Foundation LLM RL Team. Here are some facts about me:

I am currently focusing on Agentic & RL training of LLMs.
I was previously on the RL & Agent Team at Moonshot AI (Kimi) and the Hunyuan LLM Team at Tencent AI Lab.
I have worked on LLM Self-play, Alignment (RLHF), Text Generation, and NLP Fairness.
I am also interested in probabilistic and information-theoretic machine learning methods.
I received my Ph.D. degree from Duke University in 2021, advised by Dr. Lawrence Carin.
I graduated from the Department of Mathematical Sciences at Tsinghua University in 2017, advised by Dr. Jiwen Lu.
