FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 73
WeDetect: Fast Open-Vocabulary Object Detection as Retrieval Paper • 2512.12309 • Published Dec 13, 2025 • 3
Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family 6 days ago • 60
Article How We Built a Semantic Highlight Model To Save Token Cost for RAG 11 days ago • 59
Nemotron RAG Collection Set of tools to build retrieval-augmented generation (RAG) systems, improve search and ranking accuracy, and extract structured data from complex do • 11 items • Updated 5 days ago • 64
Perception Encoder Collection OpenCLIP (PE Core image + text) and timm PE Core, Spatial, Lang (ViT only) weights. NOTE: These weights do not work with original modeling code. • 19 items • Updated Sep 19, 2025 • 7
UM-Text: A Unified Multimodal Model for Image Understanding Paper • 2601.08321 • Published 12 days ago • 8
Health AI Developer Foundations (HAI-DEF) Collection Groups models released for use in health AI by Google. Read more about HAI-DEF at http://goo.gle/hai-def • 22 items • Updated 13 days ago • 173