Yuheng Wu

joooelw

authored 4 papers 7 months ago

CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Paper • 2503.23145 • Published Mar 29, 2025 • 35

SATBench: Benchmarking LLMs' Logical Reasoning via Automated Puzzle Generation from SAT Formulas

Paper • 2505.14615 • Published May 20, 2025 • 1

DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic

Paper • 2505.17348 • Published May 22, 2025 • 1

On the Role of Temperature Sampling in Test-Time Scaling

Paper • 2510.02611 • Published Oct 2, 2025 • 1