Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow Paper • 2601.14243 • Published 10 days ago • 19
Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification Paper • 2601.15808 • Published 8 days ago • 19
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents Paper • 2601.16746 • Published 7 days ago • 84
AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation Paper • 2601.17761 • Published 5 days ago • 13
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published 5 days ago • 21
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published 4 days ago • 35
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published 2 days ago • 16
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published 2 days ago • 109
FlowAct-R1: Towards Interactive Humanoid Video Generation Paper • 2601.10103 • Published 15 days ago • 70
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published 17 days ago • 143
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer Paper • 2601.14250 • Published 10 days ago • 45
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning Paper • 2601.11141 • Published 14 days ago • 20
Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning Paper • 2601.14750 • Published 9 days ago • 17
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model Paper • 2601.15892 • Published 8 days ago • 53