Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models Paper • 2505.23715 • Published May 29, 2025
StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following Paper • 2502.14494 • Published Feb 20, 2025