A^3-Bench: Benchmarking Memory-Driven Scientific Reasoning via Anchor and Attractor Activation Paper • 2601.09274 • Published 12 days ago • 83
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices Paper • 2512.14052 • Published Dec 16, 2025 • 42
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling Paper • 2512.04784 • Published Dec 2, 2025 • 25
From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks Paper • 2512.02580 • Published Dec 2, 2025 • 28
ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models Paper • 2510.06014 • Published Oct 7, 2025 • 10
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 72
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27, 2025 • 97
LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models Paper • 2510.13626 • Published Oct 15, 2025 • 46
PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning Paper • 2510.13809 • Published Oct 15, 2025 • 38
LightReasoner Models Collection https://arxiv.org/abs/2510.07962 • 3 items • Updated Oct 19, 2025 • 5
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth? Paper • 2510.08189 • Published Oct 9, 2025 • 27