Bingxiang He

hbx

https://hbx-hbx.github.io/

AI & ML interests

NLP

Recent Activity

upvoted 2 papers 1 day ago

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Paper • 2606.24530 • Published 3 days ago • 53

Qwen-AgentWorld: Language World Models for General Agents

Paper • 2606.24597 • Published 3 days ago • 106

upvoted a paper 2 days ago

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

Paper • 2606.23654 • Published 4 days ago • 74

upvoted a paper 8 days ago

Rethinking the Role of Efficient Attention in Hybrid Architectures

Paper • 2606.15378 • Published 13 days ago • 17

upvoted a collection 23 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 5 items • Updated 22 days ago • 3

upvoted a paper 27 days ago

Advancing Creative Physical Intelligence in Large Multimodal Models

Paper • 2605.26396 • Published May 25 • 21

upvoted 2 papers about 1 month ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published May 18 • 30

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published May 13 • 165

upvoted a paper about 2 months ago

CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing

Paper • 2605.02910 • Published May 6 • 23

upvoted a paper 2 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

upvoted 2 papers 4 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

Paper • 2602.09443 • Published Feb 10 • 59

upvoted a collection 6 months ago

JustRL

Collection

3 items • Updated May 2 • 5

upvoted a paper 6 months ago

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Paper • 2512.16649 • Published Dec 18, 2025 • 31

upvoted a paper 7 months ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 135

upvoted a paper 8 months ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published Nov 4, 2025 • 23

upvoted 3 papers 9 months ago

UserRL: Training Interactive User-Centric Agent via Reinforcement Learning

Paper • 2509.19736 • Published Sep 24, 2025 • 12

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 62

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 119

upvoted a paper 10 months ago

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Paper • 2509.09674 • Published Sep 11, 2025 • 81
