metadata
license: apache-2.0
tags:
  - gguf
  - document-understanding
  - chart-qa
  - pix2struct
  - image-to-text
pipeline_tag: image-to-text

Pix2Struct Base (GGUF)

A variable-resolution image-to-text model for documents, charts, and tables. 282M parameters (12-layer encoder + 12-layer decoder), licensed under Apache-2.0.
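To illustrate what "variable-resolution" can mean in practice, here is a minimal sketch of an aspect-ratio-preserving patch grid computation. It assumes a square patch size of 16 and a budget of 2048 patches; both values, and the function name `patch_grid`, are illustrative assumptions and are not taken from this repository.

```python
import math


def patch_grid(image_height, image_width, patch_size=16, max_patches=2048):
    """Return (rows, cols) of a patch grid that keeps the aspect ratio.

    patch_size and max_patches are assumed defaults for illustration only.
    """
    # Pick a scale so that rows * cols comes close to max_patches
    # without exceeding it.
    scale = math.sqrt(
        max_patches * (patch_size / image_height) * (patch_size / image_width)
    )
    rows = max(min(math.floor(scale * image_height / patch_size), max_patches), 1)
    cols = max(min(math.floor(scale * image_width / patch_size), max_patches), 1)
    return rows, cols


if __name__ == "__main__":
    for height, width in [(1000, 1000), (500, 2000)]:
        rows, cols = patch_grid(height, width)
        print(f"{height}x{width} -> {rows} rows x {cols} cols = {rows * cols} patches")
```

Under these assumptions a wide image gets more columns than rows, so the input is not squashed into a fixed square before it is split into patches.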

Encoder parity: cosine similarity of 1.000000 between this conversion's encoder output and the HuggingFace reference. Source: google/pix2struct-base.
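A parity figure like this can be reproduced by flattening both encoder outputs and computing their cosine similarity. The sketch below uses only the standard library; the two vectors are placeholder values, not real model outputs, and how the outputs are exported from either runtime is left as an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Placeholder values standing in for flattened encoder outputs.
    reference = [0.12, -0.50, 0.33, 0.90]
    converted = [0.12, -0.50, 0.33, 0.90]
    print(f"cos={cosine_similarity(reference, converted):.6f}")
```

Cosine similarity ignores overall scale, so pairing it with a maximum absolute difference check gives a stricter comparison.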