BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning Paper • 2603.04918 • Published 5 days ago • 54
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens Paper • 2603.02138 • Published 8 days ago • 137
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published 29 days ago • 261
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Paper • 2602.17684 • Published Feb 4 • 22
MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models Paper • 2602.10934 • Published 27 days ago • 49
Can Deep Research Agents Find and Organize? Evaluating the Synthesis Gap with Expert Taxonomies Paper • 2601.12369 • Published Jan 18 • 4
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 29 days ago • 156
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents Paper • 2602.02196 • Published Feb 2 • 35
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development Paper • 2601.11077 • Published Jan 16 • 65
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent Paper • 2601.07779 • Published Jan 12 • 28
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Paper • 2512.07525 • Published Dec 8, 2025 • 59
ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models Paper • 2510.06014 • Published Oct 7, 2025 • 10
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 241
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29, 2025 • 78
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 72
RoboOmni: Proactive Robot Manipulation in Omni-modal Context Paper • 2510.23763 • Published Oct 27, 2025 • 56