Running • The Ultra-Scale Playbook 🌌 • 3.85k • The ultimate guide to training LLM on large GPU Clusters
HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5 • Sentence Similarity • Updated Mar 13, 2025 • 1.13k • 65
Paused • Agents • Transformer Calculator 📊 • 36 • Calculate memory, parameters, and FLOPs for transformer models