Mistral Small 4 Collection A state-of-the-art open-weight model with a granular Mixture-of-Experts architecture that fuses instruct, reasoning, and agentic skills. • 3 items • Updated about 18 hours ago • 40
DroPE Collection Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding (https://www.arxiv.org/abs/2512.12167) • 1 item • Updated Jan 11 • 3
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 Text Generation • 124B • Updated 3 days ago • 210k • 158
ibm-granite/granite-4.0-1b-speech Automatic Speech Recognition • 2B • Updated about 12 hours ago • 20.3k • 135
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 19 days ago • 87
ColBERT-Zero 🐶 Collection First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT • 10 items • Updated 14 days ago • 18
knowledgator/gliclass-instruct-large-v1.0 Text Classification • 0.4B • Updated 28 days ago • 9.28k • 35