Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs Paper • 2604.10480 • Published 4 days ago • 17
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 8 days ago • 309
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods Paper • 2601.21821 • Published Jan 29 • 62
OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value Paper • 2512.14051 • Published Dec 16, 2025 • 47
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning Paper • 2510.04081 • Published Oct 5, 2025 • 23
GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models Paper • 2511.11134 • Published Nov 14, 2025 • 33