LLM RL - a wenzel94 Collection

wenzel94 's Collections

深度搜索Agent

LLM 不需要训练

LLM RL

updated Mar 9

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 190
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 128
PretrainZero: Reinforcement Active Pretraining

Paper • 2512.03442 • Published Dec 3, 2025 • 50
DSGym: A Holistic Framework for Evaluating and Training Data Science Agents

Paper • 2601.16344 • Published Jan 22 • 12
SkillOrchestra: Learning to Route Agents via Skill Transfer

Paper • 2602.19672 • Published Feb 23 • 58
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Paper • 2602.23166 • Published Feb 26 • 45