Think Just Enough: Sequence-Level Entropy as a Confidence Signal for LLM Reasoning Paper • 2510.08146 • Published Oct 9, 2025 • 1
The Sequential Edge: Inverse-Entropy Voting Beats Parallel Self-Consistency at Matched Compute Paper • 2511.02309 • Published Nov 4, 2025 • 4
EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages Paper • 2603.09678 • Published Mar 10, 2026 • 1
ISO-Bench: Can Coding Agents Optimize Real-World Inference Workloads? Paper • 2602.19594 • Published Feb 23, 2026 • 2
Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts Paper • 2601.03315 • Published Jan 6, 2026 • 6
Small-Gain Nash: Certified Contraction to Nash Equilibria in Differentiable Games Paper • 2512.06791 • Published Dec 7, 2025 • 4