APEX Quants (GGUF) Collection MoE models quantized with the APEX Quantization technique (https://github.com/mudler/apex-quant) • 28 items • Updated 1 day ago • 93
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, achieving SoTA benchmark scores and exceptional low-resource performance • 16 items • Updated Sep 9, 2025 • 53
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 33 items • Updated Mar 2 • 59
Granite 4.0 Language Models Collection Efficient language models for multilingual generation, coding, RAG, and AI assistant workflows. • 11 items • Updated 12 days ago • 221
Falcon Edge series Collection A series of powerful, universal, and fine-tunable small language models • 8 items • Updated 19 days ago • 25
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 192
Comparing sub 50GB Llama 4 Scout quants (KLD/Top P) Article • bartowski • Apr 9, 2025 • 45
Qwen3 Collection Qwen's Qwen3 models in Unsloth Dynamic 2.0, GGUF, 4-bit, and 16-bit Safetensor formats, including 128K context length variants. • 70 items • Updated 19 days ago • 272
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1, 2025 • 63
Granite Experiments Collection Experimental projects under consideration for the Granite family. • 11 items • Updated 10 days ago • 18
Granite 3.3 Collection Language models with improved reasoning and instruction-following capabilities. • 4 items • Updated 12 days ago • 46
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Paper • 2502.02631 • Published Feb 4, 2025 • 4
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 169