MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents • Paper • 2601.12346 • Published 12 days ago • 49
MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era • Paper • 2601.07526 • Published 18 days ago • 23
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence • Paper • 2510.23538 • Published Oct 27, 2025 • 97
Scaling Language-Centric Omnimodal Representation Learning • Paper • 2510.11693 • Published Oct 13, 2025 • 102
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs • Paper • 2509.22220 • Published Sep 26, 2025 • 65
UltraIF series • Collection • Open-Sourced model and data for ULTRAIF: Advancing Instruction Following from the Wild. • 6 items • Updated Apr 3, 2025 • 3
Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding • Paper • 2506.07434 • Published Jun 9, 2025 • 7
TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios • Paper • 2505.12891 • Published May 19, 2025 • 10
FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation • Paper • 2503.06680 • Published Mar 9, 2025 • 20