Article Supercharge your OCR Pipelines with Open Models +5 merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 315
Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation Paper • 2605.00529 • Published May 1 • 6
Privacy-Preserving Tabular Synthetic Data Generation Using TabularARGN Paper • 2508.06647 • Published Aug 8, 2025 • 16
Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family lightonai • Jan 19 • 96
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 244
Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model Paper • 2104.09617 • Published Apr 19, 2021 • 2
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language +4 davidberenstein1957, sdiazlor, Leiyre, dvilasuero, Ameeeee, burtenshaw • Dec 16, 2024 • 163
BERT release Collection Regroups the original BERT models released by the Google team. Except for the models marked otherwise, the checkpoints support English. • 8 items • Updated Mar 12 • 44