MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents Paper • 2601.12346 • Published 9 days ago • 47
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models Paper • 2601.14004 • Published 6 days ago • 44
MMFormalizer: Multimodal Autoformalization in the Wild Paper • 2601.03017 • Published 20 days ago • 104 • 7
ATTS: Asynchronous Test-Time Scaling via Conformal Prediction Paper • 2509.15148 • Published Sep 18, 2025
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models Paper • 2310.10180 • Published Oct 16, 2023 • 1
D2O: Dynamic Discriminative Operations for Efficient Generative Inference of Large Language Models Paper • 2406.13035 • Published Jun 18, 2024 • 3
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation Paper • 2410.02719 • Published Oct 3, 2024 • 1
UNComp: Can Matrix Entropy Uncover Sparsity? -- A Compressor Design from an Uncertainty-Aware Perspective Paper • 2410.03090 • Published Oct 4, 2024 • 1
LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning Paper • 2412.13626 • Published Dec 18, 2024
AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations Paper • 2311.13538 • Published Nov 22, 2023