ReasoningLens: Hierarchical Visualization and Diagnostic Auditing for Large Reasoning Models Paper • 2606.23404 • Published 9 days ago • 2
Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination Paper • 2605.31058 • Published May 29 • 2
LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents Paper • 2605.29559 • Published May 28 • 17
Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards Paper • 2605.14539 • Published May 14 • 8
Beyond Text-Dominance: Understanding Modality Preference of Omni-modal Large Language Models Paper • 2604.16902 • Published Apr 18 • 6
Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces Paper • 2604.08362 • Published Apr 9 • 16
Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards Paper • 2603.09117 • Published Mar 10 • 10