Teacher Logits Collection Logits captured from large models to act as the teacher for distillation • 3 items • Updated 28 days ago • 7
nvidia/gpt-oss-120b-Eagle3-long-context Text Generation • 0.2B • Updated about 1 month ago • 2.82k • 54
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 Sep 18, 2024 • 272
Running 3.64k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context +6 Jul 23, 2024 • 241