Huanxin Sheng

HuanxinSheng

https://brucesheng1202.github.io/index.html

AI & ML interests

None yet

Recent Activity

upvoted a collection 13 days ago

💻 Qwopus-Coder

upvoted a collection 19 days ago

upvoted a paper 20 days ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Organizations

commented on 5 papers 2 months ago

To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models

Paper • 2602.12566 • Published Feb 13 • 1

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published Apr 14 • 19
