DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning • Paper • 2511.22570 • Published Nov 27, 2025 • 90
BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains • Paper • 2510.25409 • Published Oct 29, 2025 • 4
ColorAgent: Building A Robust, Personalized, and Interactive OS Agent • Paper • 2510.19386 • Published Oct 22, 2025 • 9
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation • Paper • 2510.09116 • Published Oct 10, 2025 • 96
LongCodeZip: Compress Long Context for Code Language Models • Paper • 2510.00446 • Published Oct 1, 2025 • 107
MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors • Paper • 2409.15273 • Published Sep 23, 2024 • 12