The Ultra-Scale Playbook 🌌 • Running • 3.61k • The ultimate guide to training LLM on large GPU Clusters
Article • makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch • May 7, 2024 • 111
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated Jul 10 • 151
tpadhi1/llama-2-7b-chat-hf-finetuned-mental-health-reddit-trilok Text Generation • 7B • Updated Feb 28, 2024
openai/whisper-large-v3 Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 6.42M • 5.24k