How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark • Paper • 2505.18761 • Published May 24, 2025
GSM-DC • Collection • Investigate LLM reasoning robustness through a controlled benchmark. • 15 items • Updated Nov 13, 2025
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems • Paper • 2408.16293 • Published Aug 29, 2024