Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12, 2025 • 76
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation Paper • 2512.23705 • Published Dec 29, 2025 • 45
Olmo 3.1 Collection The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated Dec 23, 2025 • 48
Bolmo Collection Artifacts for the Bolmo release: https://allenai.org/papers/bolmo. • 4 items • Updated Dec 23, 2025 • 12
Jan-v2-VL Collection Jan-v2-VL: a family of VLM focused on reliable, many-step task execution. • 8 items • Updated Dec 30, 2025 • 40
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 94
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 158
Granite Quantized Models Collection Quantized versions of IBM Granite models. Licensed under the Apache 2.0 license. • 36 items • Updated 10 days ago • 32
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1, 2025 • 336