Essential-Web v1.0: 24T tokens of organized web data Paper • 2506.14111 • Published Jun 17, 2025 • 46
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • 3.67k
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition Paper • 2403.19822 • Published Mar 28, 2024
PEEKABOO: Interactive Video Generation via Masked-Diffusion Paper • 2312.07509 • Published Dec 12, 2023 • 11
ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition Paper • 2202.00758 • Published Feb 1, 2022