On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation Paper • 2603.22117 • Published 1 day ago • 16
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 1 day ago • 88
Alignment Makes Language Models Normative, Not Descriptive Paper • 2603.17218 • Published 7 days ago • 46
Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding Paper • 2603.13366 • Published 15 days ago • 93
WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation Paper • 2603.15132 • Published 9 days ago • 35
Unified Spatio-Temporal Token Scoring for Efficient Video VLMs Paper • 2603.18004 • Published 6 days ago • 12
Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context Paper • 2603.15653 • Published 18 days ago • 11
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 5 days ago • 56
Flash-KMeans: Fast and Memory-Efficient Exact K-Means Paper • 2603.09229 • Published 15 days ago • 79
Heterogeneous Agent Collaborative Reinforcement Learning Paper • 2603.02604 • Published 22 days ago • 188
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 18 days ago • 114
Reasoning Models Struggle to Control their Chains of Thought Paper • 2603.05706 • Published 19 days ago • 34
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory Paper • 2603.04257 • Published 20 days ago • 19
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 21 days ago • 100
huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated Image-Text-to-Text • 36B • Updated 23 days ago • 68.1k • 230
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use Paper • 2603.03205 • Published 21 days ago • 12