Running on CPU Upgrade Featured 2.69k The Smol Training Playbook 📚 2.69k The secrets to building world-class LLMs
GAS: Improving Discretization of Diffusion ODEs via Generalized Adversarial Solver Paper • 2510.17699 • Published Oct 20 • 24
Yambda-5B -- A Large-Scale Multi-modal Dataset for Ranking And Retrieval Paper • 2505.22238 • Published May 28 • 3
ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models Paper • 2505.22569 • Published May 28 • 56
Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts Paper • 2506.05229 • Published Jun 5 • 38