Hunter Heidenreich

hheiden-roots · hunterheiden

AI & ML interests

None yet

Recent Activity

updated a dataset 4 days ago

rootsautomation/MUSTARD

published a dataset 4 days ago

rootsautomation/MUSTARD

updated a dataset 5 days ago

rootsautomation/MultiHiertt

upvoted a paper about 2 months ago

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published Mar 23 • 136

upvoted 3 papers 4 months ago

CommonForms: A Large, Diverse Dataset for Form Field Detection

Paper • 2509.16506 • Published Sep 20, 2025 • 22

Large Language Models for Page Stream Segmentation

Paper • 2408.11981 • Published Aug 21, 2024 • 3

GutenOCR: A Grounded Vision-Language Front-End for Documents

Paper • 2601.14490 • Published Jan 20 • 37

upvoted 3 collections 4 months ago

GutenOCR

3 items • Updated Jan 22 • 6

OCR

Data and models for optical character recognition • 6 items • Updated Jan 22 • 5

RICO

A collection of RICO screenshot-based datasets for training and evaluation. We've attempted to compile all surrounding metadata for the relevant tasks • 8 items • Updated Jan 16 • 5

upvoted a paper 4 months ago

PubMed-OCR: PMC Open Access OCR Annotations

Paper • 2601.11425 • Published Jan 16 • 12

upvoted a collection about 1 year ago

Gemma 3

All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 54 items • Updated 27 days ago • 115

upvoted a collection about 2 years ago

LLaVA-1.6

A collection of LLaVA-1.6 checkpoints • 4 items • Updated Jan 31, 2024 • 75