Article Reproducing and Validating Distributed Muon 🐢✨: A Practical Verification of Communication Efficiency Claims bird-of-paradise • Dec 12, 2025 • 2
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 6 days ago • 149
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 227
Leaderboards for Arabic Collection A collection for all leaderboards related to the Arabic Language. • 5 items • Updated Dec 9, 2025 • 8
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated Mar 12 • 218
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 17 items • Updated Jun 6, 2024 • 248
Qwen3 Collection Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 70 items • Updated Apr 22 • 272
EAGLE3 Collection The collection of eagle3 series models for Qwen3 and Hunyuan. • 15 items • Updated Jan 13 • 5
Unsloth Dynamic 2.0 Quants Collection New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 96 items • Updated 8 days ago • 636
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 38 items • Updated Mar 2 • 368
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 9 items • Updated Mar 12 • 493