Nemotron v3 Pre-Training Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 3 days ago • 6
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano v3. • 7 items • Updated 3 days ago • 49
zai-org/AutoGLM-Phone-9B-Multilingual Image-Text-to-Text • 934k • Updated 17 days ago • 10.4k • • 206
view article Article Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time Feb 18 • 35
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16 • 273
DeepSeek R1 (All Versions) Collection DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 2 days ago • 260