Xingtai Lv

XingtaiHF

AI & ML interests

LLM

Recent Activity

upvoted 2 papers 8 days ago

Qwen-AgentWorld: Language World Models for General Agents

Paper • 2606.24597 • Published 9 days ago • 144

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Paper • 2606.24530 • Published 9 days ago • 62

upvoted a paper 9 days ago

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

Paper • 2606.23654 • Published 10 days ago • 79

upvoted a paper 30 days ago

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Paper • 2605.29343 • Published May 28 • 36

upvoted 3 papers about 1 month ago

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Paper • 2605.20025 • Published May 19 • 190

SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution

Paper • 2605.18401 • Published May 18 • 130

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published May 18 • 30

upvoted a collection about 1 month ago

ZEDA

Collection

4 items • Updated May 19 • 3

upvoted a paper about 2 months ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published May 13 • 165

upvoted a paper 3 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

upvoted a paper 4 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

upvoted 2 papers 5 months ago

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

Paper • 2602.09443 • Published Feb 10 • 59

Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

Paper • 2601.17367 • Published Jan 24 • 33

upvoted a paper 8 months ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 135

upvoted 2 papers 9 months ago

Cache-to-Cache: Direct Semantic Communication Between Large Language Models

Paper • 2510.03215 • Published Oct 3, 2025 • 99

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

Paper • 2509.25123 • Published Sep 29, 2025 • 22

upvoted 4 papers 10 months ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 119

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10, 2025 • 665

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Paper • 2509.09674 • Published Sep 11, 2025 • 81
