Huanxin Sheng

HuanxinSheng

https://brucesheng1202.github.io/index.html

AI & ML interests

None yet

Recent Activity

upvoted a collection 16 days ago

💻 Qwopus-Coder

Reasoning-distilled coding models optimized for specialized domains like agentic workflows. • 10 items • Updated 3 days ago • 34

upvoted a collection 22 days ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.82k

upvoted a paper 24 days ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

upvoted a paper about 1 month ago

iGRPO: Self-Feedback-Driven LLM Reasoning

Paper • 2602.09000 • Published Feb 9 • 19

upvoted 2 papers 2 months ago

To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models

Paper • 2602.12566 • Published Feb 13 • 1

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 112

upvoted a collection 2 months ago

Nemotron-Post-Training-v3

Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 22 days ago • 168

upvoted a paper 2 months ago

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 41

upvoted a paper 3 months ago

Self-Distilled RLVR

Paper • 2604.03128 • Published Apr 3 • 179

upvoted a paper 4 months ago

Video-Based Reward Modeling for Computer-Use Agents

Paper • 2603.10178 • Published Mar 10 • 43

upvoted a paper 5 months ago

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 207

upvoted 2 papers 7 months ago

Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published Nov 25, 2025 • 129

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 242

upvoted 3 papers 9 months ago

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Paper • 2411.04923 • Published Nov 7, 2024 • 23

Analyzing Uncertainty of LLM-as-a-Judge: Interval Evaluations with Conformal Prediction

Paper • 2509.18658 • Published Sep 23, 2025 • 1

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

Paper • 2510.05034 • Published Oct 6, 2025 • 51