mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9, 2025 • 52
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR Paper • 2601.14251 • Published Jan 20 • 25
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 14 days ago • 87
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published about 1 month ago • 50
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv +8 Oct 23, 2025 • 150
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 20 days ago • 480
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation Paper • 2601.15369 • Published Jan 21 • 21
OpenVision 3 Collection A Family of Unified Visual Encoder with Unified Visual Representation. • 4 items • Updated Jan 27 • 2