view article Article We’re open-sourcing our text-to-image model and the process behind it Nov 12, 2025 • 81
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation Paper • 2411.19331 • Published Nov 28, 2024 • 5
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20, 2025 • 157
view article Article Welcome EmbeddingGemma, Google's new efficient embedding model +4 Sep 4, 2025 • 272
view article Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 776
view article Article Introducing EuroBERT: A High-Performance Multilingual Encoder Model Mar 10, 2025 • 146
distil-large-v3.5 Collection This collection contains the model repositories for distil-large-v3.5, which provides support for the most popular Whisper libraries. • 5 items • Updated Mar 25, 2025 • 9
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 9 items • Updated Dec 23, 2025 • 23
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20, 2025 • 96