---
license: apache-2.0
tags:
- gguf
- document-understanding
- chart-qa
- pix2struct
- image-to-text
pipeline_tag: image-to-text
---
# Pix2Struct Base (GGUF)
Variable-resolution image-to-text model for documents, charts, and tables. 282M parameters (12-layer encoder + 12-layer decoder), Apache-2.0 licensed.
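
"Variable-resolution" refers to how Pix2Struct-style preprocessing rescales each image so that a grid of fixed-size patches fits within a patch budget while roughly preserving the aspect ratio. The sketch below illustrates that grid calculation, modeled on the rule used by the Hugging Face Pix2Struct image processor. The function name `patch_grid`, the 16x16 patch size, and the 2048-patch budget are assumptions for illustration and are not confirmed settings of this GGUF build.

```python
import math


def patch_grid(image_height, image_width,
               patch_height=16, patch_width=16, max_patches=2048):
    """Return (rows, cols) of the patch grid for one image.

    Hypothetical helper: the defaults are assumed values, not settings
    read from this model's files.
    """
    # Scale factor that makes rows * cols approach max_patches
    # while keeping the image's aspect ratio.
    scale = math.sqrt(
        max_patches * (patch_height / image_height) * (patch_width / image_width)
    )
    # Round down to whole patches and keep at least one row and one column.
    rows = max(min(math.floor(scale * image_height / patch_height), max_patches), 1)
    cols = max(min(math.floor(scale * image_width / patch_width), max_patches), 1)
    return rows, cols


rows, cols = patch_grid(image_height=600, image_width=800)
print(rows, cols, rows * cols)
```

Under these assumptions, a 600x800 (height x width) image maps to a 39x52 grid, or 2028 patches, which stays within the 2048-patch budget.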
Encoder parity: cosine similarity of 1.000000 against the Hugging Face implementation. Source checkpoint: google/pix2struct-base.
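
A parity figure like the one above can be computed by flattening the encoder output from each implementation and taking the cosine similarity of the two vectors. The sketch below shows only that metric in plain Python. The `reference` and `candidate` values are made-up placeholders, not real encoder outputs, and the way this model card's number was actually produced is not documented here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder data standing in for flattened encoder outputs (hypothetical).
reference = [0.12, -0.50, 0.33, 0.91]
candidate = [0.12, -0.50, 0.33, 0.91]
print(f"cos={cosine_similarity(reference, candidate):.6f}")
```

Cosine similarity ignores overall scale, so a check like this is often paired with a maximum absolute difference when magnitudes also matter.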