MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome Paper • 2603.28407 • Published 6 days ago • 62
Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells Paper • 2603.25240 • Published 10 days ago • 75
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 20 days ago • 184
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published 20 days ago • 148
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning Paper • 2603.04918 • Published Mar 5 • 56
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 72
R1-Fuzz: Specializing Language Models for Textual Fuzzing via Reinforcement Learning Paper • 2509.20384 • Published Sep 21, 2025 • 2
AgentFold: Long-Horizon Web Agents with Proactive Context Management Paper • 2510.24699 • Published Oct 28, 2025 • 72
AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis Paper • 2510.24695 • Published Oct 28, 2025 • 24
AgentFold: Long-Horizon Web Agents with Proactive Context Management Paper • 2510.24699 • Published Oct 28, 2025 • 72
AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis Paper • 2510.24695 • Published Oct 28, 2025 • 24
Sparser Block-Sparse Attention via Token Permutation Paper • 2510.21270 • Published Oct 24, 2025 • 25
BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions Paper • 2510.05318 • Published Oct 6, 2025 • 22
R1-Fuzz: Specializing Language Models for Textual Fuzzing via Reinforcement Learning Paper • 2509.20384 • Published Sep 21, 2025 • 2