KletterMix: Climbing Toward High-Quality German Pretraining Data • Paper • 2606.03773 • Published 23 days ago • 21
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information • Paper • 2511.22176 • Published Nov 27, 2025 • 5
Finding Dori: Memorization in Text-to-Image Diffusion Models Is Less Local Than Assumed • Paper • 2507.16880 • Published Jul 22, 2025 • 7
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models • Paper • 2505.22232 • Published May 28, 2025 • 18