TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization Paper • 2601.16480 • Published 4 days ago • 49
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 170
Patient-Similarity Cohort Reasoning in Clinical Text-to-SQL Paper • 2601.09876 • Published 12 days ago • 6
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published 5 days ago • 85
HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding Paper • 2601.14724 • Published 6 days ago • 71
FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs Paper • 2601.13836 • Published 7 days ago • 34
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development Paper • 2601.11077 • Published 11 days ago • 63
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization Paper • 2601.01554 • Published 23 days ago • 56
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Paper • 2512.07525 • Published Dec 8, 2025 • 59
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 213
GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning Paper • 2510.14942 • Published Oct 16, 2025 • 3
RoboOmni: Proactive Robot Manipulation in Omni-modal Context Paper • 2510.23763 • Published Oct 27, 2025 • 55
LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models Paper • 2510.13626 • Published Oct 15, 2025 • 46
AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research Paper • 2507.13300 • Published Jul 17, 2025 • 20
Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers Paper • 2507.06223 • Published Jul 8, 2025 • 14
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks Paper • 2507.01001 • Published Jul 1, 2025 • 46
SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification Paper • 2506.15569 • Published Jun 18, 2025 • 12