CommonForms: A Large, Diverse Dataset for Form Field Detection Paper • 2509.16506 • Published Sep 20, 2025 • 22
GutenOCR: A Grounded Vision-Language Front-End for Documents Paper • 2601.14490 • Published 12 days ago • 36
RICO Collection A collection of RICO screenshot-based datasets for training and evaluation. We've attempted to compile all surrounding metadata for the relevant tasks. • 8 items • Updated 17 days ago • 5
Gemma 3 Collection All versions of Google's new multimodal models, including QAT, in 1B, 4B, 12B, and 27B sizes. Available in GGUF, dynamic 4-bit, and 16-bit formats. • 55 items • Updated 11 days ago • 104