ParoQuant Collection Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • 20 items • Updated 3 days ago • 25
APEX Quants (GGUF) Collection MoE models quantized with the APEX Quantization technique (https://github.com/mudler/apex-quant) • 27 items • Updated 9 days ago • 87
Trinity Collection Collection of Arcee AI models in the Trinity family • 14 items • Updated Mar 25 • 30
✨SimpleChat Collection The SimpleChat series represents our new exploration into Non-Chain-of-Thought (Non-CoT) models. Designed to be concise, rational, and empathetic. • 4 items • Updated Mar 11 • 3
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1, 2025 • 349
Article From Zero to AI: Build Your First Language Model in 5 Minutes with Google's Gemma Jul 31, 2025 • 5
Article Automated Discovery of High-Performance GPU Kernels with OpenEvolve Jun 27, 2025 • 26
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published May 5, 2025 • 36
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated Mar 25, 2025 • 66
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated Dec 31, 2025 • 127