Inference Optimized Checkpoints (with Model Optimizer) Collection A collection of generative models quantized and optimized for inference with Model Optimizer. • 77 items • Updated 1 day ago • 173
Article Cohere on Hugging Face Inference Providers 🔥 reach-vb, burtenshaw, merve, celinah, alexrs, julien-c, sbrandeis • Apr 16, 2025 • 129
Article Making LLMs Smaller Without Breaking Them: A GLU-Aware Pruning Approach oopere • Nov 24, 2024 • 20
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models, designed to be the ultimate general-purpose local model. • 9 items • Updated Feb 7, 2025 • 202
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using only <10% more VRAM than BnB 4-bit • 28 items • Updated 10 days ago • 97