Huanxin Sheng

HuanxinSheng

https://brucesheng1202.github.io/index.html

AI & ML interests

None yet

Recent Activity

upvoted a collection 11 days ago

💻 Qwopus-Coder

Collection

Reasoning-distilled coding models optimized for specialized domains like agentic workflows. • 7 items • Updated 13 days ago • 25

upvoted a collection 16 days ago

Qwen3

Collection

84 items • Updated Dec 31, 2025 • 1.82k

upvoted a paper 18 days ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

upvoted a paper about 1 month ago

iGRPO: Self-Feedback-Driven LLM Reasoning

Paper • 2602.09000 • Published Feb 9 • 19

commented on a paper about 2 months ago

To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models

Paper • 2602.12566 • Published Feb 13 • 1

upvoted 2 papers about 2 months ago

To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models

Paper • 2602.12566 • Published Feb 13 • 1

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 112

upvoted a collection 2 months ago

Nemotron-Post-Training-v3

Collection

Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 16 days ago • 168

commented on 4 papers 2 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published Apr 14 • 19

upvoted a paper 2 months ago

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 41

upvoted a paper 3 months ago

Self-Distilled RLVR

Paper • 2604.03128 • Published Apr 3 • 179

authored a paper 4 months ago

Video-Based Reward Modeling for Computer-Use Agents

Paper • 2603.10178 • Published Mar 10 • 43

upvoted a paper 4 months ago

Video-Based Reward Modeling for Computer-Use Agents

Paper • 2603.10178 • Published Mar 10 • 43

upvoted a paper 5 months ago

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 207

upvoted 2 papers 7 months ago

Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published Nov 25, 2025 • 128

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 242

upvoted a paper 8 months ago

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Paper • 2411.04923 • Published Nov 7, 2024 • 23
