Proof of Time: A Benchmark for Evaluating Scientific Idea Judgments • shanchen • Jan 13 • 10
Budget Alignment: Making Models Reason in the User’s Language • shanchen • Nov 4, 2025 • 11