DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 472
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20, 2025 • 157
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Paper • 2311.06242 • Published Nov 10, 2023 • 95
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models +1 Jun 24, 2024 • 205
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 105