Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification Paper • 2601.15808 • Published 4 days ago • 12
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems Paper • 2601.11004 • Published 10 days ago • 29
AutoGraph-R1 Collection Directly Optimizing Knowledge Graph Construction for RAG using Reinforcement Learning • 11 items • Updated Oct 24, 2025 • 2
NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents Paper • 2510.07172 • Published Oct 8, 2025 • 28
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published Aug 1, 2025 • 94
From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery Paper • 2505.13259 • Published May 19, 2025 • 1