bharatgenai/Param2-17B-A2.4B-Thinking Text Generation • 17B • Updated about 22 hours ago • 2.12k • 53
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems 📝 Running • Featured • Who needs 1T parameters? Olympiad proofs with a 4B model • 66
GGML and llama.cpp join HF to ensure the long-term progress of Local AI Article • 23 days ago • 484
Ministral 3 Collection Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated 3 days ago • 30
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published Oct 17, 2025 • 49
The Ultra-Scale Playbook 🌌 Running • The ultimate guide to training LLMs on large GPU clusters • 3.74k