Yaorui SHI

yrshi · syr-cn

AI & ML interests

None yet

Recent Activity

upvoted a collection 1 day ago

OpenThinker-Agent2

OpenThinker-Agent2: agentic SFT/RL datasets and 8B/32B models (cold-start SFT, RL, and the OpenThinkerAgent-32B release). • 11 items • Updated 15 days ago • 7

upvoted a paper 1 day ago

DomainShuttle: Freeform Open Domain Subject-driven Text-to-video Generation

Paper • 2606.26058 • Published 2 days ago • 56

upvoted a paper 9 days ago

FastContext: Training Efficient Repository Explorer for Coding Agents

Paper • 2606.14066 • Published 14 days ago • 91

upvoted 2 papers 16 days ago

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published 21 days ago • 73

Agents' Last Exam

Paper • 2606.05405 • Published 23 days ago • 364

upvoted a paper 19 days ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published 23 days ago • 25

upvoted a paper 29 days ago

V_{0.5}: Generalist Value Model as a Prior for Sparse RL Rollouts

Paper • 2603.10848 • Published Mar 11 • 16

upvoted a paper 30 days ago

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Paper • 2605.27141 • Published May 26 • 19

upvoted 6 papers about 1 month ago

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Paper • 2605.25624 • Published May 25 • 34

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

Paper • 2506.03610 • Published Jun 4, 2025 • 10

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

Paper • 2605.21467 • Published May 20 • 207

SOD: Step-wise On-policy Distillation for Small Language Model Agents

Paper • 2605.07725 • Published May 8 • 25

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

Paper • 2605.19577 • Published May 19 • 59

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published Apr 9 • 295

upvoted 2 collections about 1 month ago

Agent

128 items • Updated 9 days ago • 13

Papers

1 item • Updated May 9 • 1

upvoted 4 papers about 1 month ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published May 14 • 115

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context

Paper • 2605.13831 • Published May 13 • 88

Auto-Rubric as Reward: From Implicit Preferences to Explicit Multimodal Generative Criteria

Paper • 2605.08354 • Published May 8 • 23

Rubric-based On-policy Distillation

Paper • 2605.07396 • Published May 8 • 41