Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 389
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3, 2025 • 225
Training Sparse Mixture Of Experts Text Embedding Models Paper • 2502.07972 • Published Feb 11, 2025 • 10
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 6 items • Updated Mar 2 • 166
Running • The Ultra-Scale Playbook 🌌 • 3.85k • The ultimate guide to training LLMs on large GPU clusters
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 145
Article Training and Finetuning Embedding Models with Sentence Transformers tomaarsen • May 28, 2024 • 275
Running on CPU Upgrade • Agents • GAIA Leaderboard 🦾 • 606 • Submit and score your model on the GAIA benchmark