---
license: apache-2.0
tags:
- gguf
- ocr
- scene-text
- parseq
- crispembed
base_model: baudm/parseq
---

# PARSeq: Scene Text Recognition (GGUF)

GGUF conversions of [PARSeq](https://github.com/baudm/parseq) (ECCV 2022) for use with [CrispEmbed](https://github.com/CrispStrobe/CrispEmbed).

PARSeq is a scene text recognition model that reads text from natural images (signs, labels, documents). It recognizes 94 printable ASCII characters (digits, letters, punctuation).

## Architecture

- **Encoder**: 12-layer pre-LN ViT (patch 4×8, input 32×128 RGB, 128 tokens, GELU FFN)
- **Decoder**: 1-layer two-stream Transformer (XLNet-style position queries + context self-attention, then cross-attention to encoder memory)
- **Head**: Linear → 95 classes (94 printable ASCII chars + EOS)
- **Inference**: Autoregressive greedy decode (max 25 characters)

## Variants

| File | Variant | Params | Size | Notes |
|------|---------|--------|------|-------|
| `parseq-f32.gguf` | Base | 24M | 91 MB | Full precision |
| `parseq-q8_0.gguf` | Base | 24M | 24 MB | Best quantized |
| `parseq-q4_k.gguf` | Base | 24M | 13 MB | Smallest base |
| `parseq-tiny-f16.gguf` | Tiny | 6M | 12 MB | Half precision |
| `parseq-tiny-q8_0.gguf` | Tiny | 6M | 6 MB | Smallest overall |

All quantization levels produce identical output on test images.
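
The token count and the greedy decoding loop listed under Architecture can be illustrated with a minimal Python sketch. It is not CrispEmbed's implementation: `step_logits` and `toy_step` are hypothetical stand-ins for the real decoder, and the class layout (index 0 for EOS, followed by digits, letters, and punctuation) is an assumption made for illustration.

```python
import string

# Assumption: class 0 is EOS and classes 1..94 follow this character order.
CHARSET = string.digits + string.ascii_letters + string.punctuation
EOS = 0
MAX_LEN = 25  # maximum number of decoded characters


def num_patch_tokens(height=32, width=128, patch_h=4, patch_w=8):
    """Number of ViT tokens for a given input size and patch size."""
    return (height // patch_h) * (width // patch_w)


def greedy_decode(step_logits, max_len=MAX_LEN):
    """Greedy autoregressive decoding.

    step_logits(prefix) is a hypothetical callable that returns 95 scores
    for the next position, given the class ids decoded so far.
    """
    out = []
    for _ in range(max_len):
        logits = step_logits(out)
        # Pick the highest-scoring class at this step.
        best = max(range(len(logits)), key=logits.__getitem__)
        if best == EOS:
            break
        out.append(best)
    # Class ids are offset by one because index 0 is reserved for EOS.
    return "".join(CHARSET[i - 1] for i in out)


def toy_step(prefix, target="Exit"):
    """Toy decoder that spells out `target`, then emits EOS."""
    logits = [0.0] * (len(CHARSET) + 1)
    if len(prefix) < len(target):
        logits[CHARSET.index(target[len(prefix)]) + 1] = 1.0
    else:
        logits[EOS] = 1.0
    return logits


print(num_patch_tokens())       # 8 rows x 16 columns of patches
print(greedy_decode(toy_step))
```

With the defaults, a 32×128 input split into 4×8 patches gives an 8×16 grid, which matches the 128 tokens stated above. The loop stops at the first EOS or after 25 characters, whichever comes first.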

## Usage

```bash
# CLI
crispembed -m parseq-q8_0.gguf --ocr image.png

# Auto-download
crispembed -m parseq --auto-download --ocr image.png
```

```python
from crispembed import CrispMathOcr

ocr = CrispMathOcr("parseq-q8_0.gguf")
text = ocr.recognize("sign.png")
```

## Benchmark (94-char, PARSeq-base)

| Dataset | Accuracy |
|---------|----------|
| IIIT5k | 99.1% |
| SVT | 97.9% |
| IC13-1015 | 98.1% |
| IC15-2077 | 89.2% |
| SVTP | 96.9% |
| CUTE80 | 98.6% |

## Source

- Paper: [Scene Text Recognition with Permuted Autoregressive Sequence Models](https://arxiv.org/abs/2207.06966) (ECCV 2022)
- Code: [baudm/parseq](https://github.com/baudm/parseq) (Apache-2.0)
- Converted with `models/convert-parseq-to-gguf.py` from CrispEmbed