Article: Beyond LoRA: Can you beat the most popular fine-tuning technique? +2 BenjaminB, sayakpaul, hubnemo, kashif • 11 days ago • 64
Article: The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare +1 aaditya, pminervini, clefourrier • Apr 19, 2024 • 202
MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 Zero-Shot Classification • 0.3B • Updated Apr 11, 2024 • 312k • 376
Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data Paper • 2504.02268 • Published Apr 3, 2025 • 5
Reply: Good work. Can you share the following details regarding the pretraining of the Supra-50M base model? GPU(s) used for pretraining; total GPU hours and cost; cloud platform (GPU) used for pretraining.
Post 6603 We're happy to announce that we released a reasoning-tuned version of Supra-50M! SupraLabs/Supra-50M-Reasoning
Post 9225 Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs. Google's new model, Gemma 4 12B Unified, supports image, audio and 256K context. You can run and train the model via Unsloth Studio. GGUF: unsloth/gemma-4-12b-it-GGUF Guide: https://unsloth.ai/docs/models/gemma-4