Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism Paper • 2606.00408 • Published 6 days ago • 53
CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation Paper • 2605.25378 • Published 10 days ago • 58
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention Paper • 2605.22791 • Published 14 days ago • 30
NSF-SciFy: Mining the NSF Awards Database for Scientific Claims Paper • 2503.08600 • Published 10 days ago • 4
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 22 days ago • 270
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published Apr 15 • 62
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 243
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published Apr 13 • 102