hesamation/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF Text Generation • 35B • Updated Apr 19 • 110k • 273
Scalable Training of Mixture-of-Experts Models with Megatron Core Paper • 2603.07685 • Published Mar 8 • 3
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29, 2025 • 81