Article • Argunauts Update: Learning Formal Argument Analysis with RLVF and HIRPO • Dec 2, 2025 • 1
JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation Paper • 2511.15958 • Published Nov 20, 2025 • 1
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks Paper • 2511.04662 • Published Nov 6, 2025 • 35
Article • Building the Open Agent Ecosystem Together: Introducing OpenEnv • Oct 23, 2025 • 147