HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness • Paper • 2605.02396 • Published 2 days ago • 6
Hallucinations Undermine Trust; Metacognition is a Way Forward • Paper • 2605.01428 • Published 4 days ago • 12
LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models • Paper • 2310.02943 • Published Oct 4, 2023 • 1
GroundSet: A Cadastral-Grounded Dataset for Spatial Understanding with Vector Data • Paper • 2603.14609 • Published Mar 15 • 1
OpenTME: An Open Dataset of AI-powered H&E Tumor Microenvironment Profiles from TCGA • Paper • 2604.12075 • Published 23 days ago • 1
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos • Paper • 2512.16978 • Published Dec 18, 2025 • 7