Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published 13 days ago • 223
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs 29 days ago • 61
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 353
Article Community Evals: Because we're done trusting black-box leaderboards over the community Feb 4 • 89
Article Introducing Daggr: Chain apps programmatically, inspect visually Jan 29 • 107
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs Paper • 2601.17058 • Published Jan 22 • 190
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents Paper • 2601.16746 • Published Jan 23 • 91
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published Jan 14 • 92
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 214
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 159
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 267