view article Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 4 days ago • 43
Distil Efficiency Benchmarks Collection Collection of models used in the blog post www.distillabs.ai/blog/the-10x-inference-tax-you-dont-have-to-pay • 9 items • Updated 12 days ago • 3
Quantized Qwen3.5 Collection Verified models. Compatible with Transformers v5.3 and vLLM v0.16.1rc1 (nightly). Under evaluation. • 9 items • Updated 1 day ago • 9
Qwen3-MoE Collection Compressed Qwen3 MoE models with a reduced number of experts. See additional models at https://huggingface.co/bknyaz. • 9 items • Updated about 1 month ago • 3
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated 17 days ago • 130
gliner2 family Collection GLiNER2 extends the original GLiNER architecture to support multi-task information extraction with a schema-driven interface. This base model provid • 4 items • Updated Feb 10 • 38
view article Article From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output Feb 7 • 22
view article Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand Dec 4, 2025 • 66
The Bestiary Collection Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 6 items • Updated Nov 16, 2025 • 102
GLiNER-PII Collection PII detection models developed in collaboration with Wordcab • 5 items • Updated Jan 29 • 22