Update README.md
README.md
@@ -2,4 +2,36 @@
license: apache-2.0
base_model:
- FireRedTeam/FireRed-OCR
language:
- en
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- text-generation-inference
---

# **FireRed-OCR-GGUF**

> FireRed-OCR from FireRedTeam is a framework that specializes general Large Vision-Language Models for pixel-precise structural document parsing. It targets "Structural Hallucination" issues such as disordered rows and invented formulas by shifting to a "structural engineering" paradigm, and it reports a state-of-the-art 92.94% on OmniDocBench v1.5, ahead of DeepSeek-OCR 2, OCRVerse, and much larger general models such as Gemini-3.0 Pro and Qwen3-VL-235B. Its key innovations are Format-Constrained GRPO (Group Relative Policy Optimization), which enforces syntactic validity (no unclosed tables or invalid LaTeX); a "Geometry + Semantics" data factory that uses geometric clustering and multi-dimensional tagging to balance long-tail layouts; and a progressive training pipeline of multi-task pre-alignment for spatial grounding, specialized SFT for standardized full-image Markdown output, and GRPO self-correction via RL. On the complex layouts of FireRedBench it shows stronger in-the-wild robustness than traditional systems such as PaddleOCR, with high-fidelity parsing of tables, equations, forms, and multi-column documents for real-world automation.

## Model Files

| File Name | Quant Type | File Size | File Link |
| --- | --- | --- | --- |
| FireRed-OCR.BF16.gguf | BF16 | 3.45 GB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.BF16.gguf) |
| FireRed-OCR.F16.gguf | F16 | 3.45 GB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.F16.gguf) |
| FireRed-OCR.F32.gguf | F32 | 6.89 GB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.F32.gguf) |
| FireRed-OCR.Q8_0.gguf | Q8_0 | 1.83 GB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.Q8_0.gguf) |
| FireRed-OCR.mmproj-bf16.gguf | mmproj-bf16 | 823 MB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.mmproj-bf16.gguf) |
| FireRed-OCR.mmproj-f16.gguf | mmproj-f16 | 823 MB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.mmproj-f16.gguf) |
| FireRed-OCR.mmproj-f32.gguf | mmproj-f32 | 1.63 GB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.mmproj-f32.gguf) |
| FireRed-OCR.mmproj-q8_0.gguf | mmproj-q8_0 | 445 MB | [Download](https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/blob/main/FireRed-OCR.mmproj-q8_0.gguf) |
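
The links in the table point at file pages (`/blob/main/...`). On the Hugging Face Hub, the same path with `/resolve/` in place of `/blob/` serves the file itself, which is usually what a script or a command-line downloader needs. The following is a minimal sketch of that rewrite; the helper name `blob_to_resolve` is hypothetical and not part of any library.

```python
from urllib.parse import urlsplit, urlunsplit


def blob_to_resolve(url: str) -> str:
    """Turn a Hub file-page URL (/blob/<rev>/...) into a direct-download URL (/resolve/<rev>/...).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Expected path shape: /<owner>/<repo>/blob/<revision>/<file path...>
    if len(segments) < 6 or segments[3] != "blob":
        raise ValueError(f"not a Hub blob URL: {url}")
    segments[3] = "resolve"
    return urlunsplit(parts._replace(path="/".join(segments)))


page_url = (
    "https://huggingface.co/prithivMLmods/FireRed-OCR-GGUF/"
    "blob/main/FireRed-OCR.Q8_0.gguf"
)
print(blob_to_resolve(page_url))
```

The function only rewrites the URL and performs no network access, so the resulting link can be handed to whatever download tool is already in use.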

## Quants Usage

(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similarly sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
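
To relate the file sizes listed under Model Files to precision, one rough approach is to scale each size against the F32 file, which stores 32 bits per weight. The sketch below assumes that metadata overhead is negligible and that nearly all tensors are stored in the named type; both are simplifications, and the function name is hypothetical.

```python
# File sizes in GB, copied from the Model Files table (main model files only).
SIZES_GB = {"F32": 6.89, "BF16": 3.45, "F16": 3.45, "Q8_0": 1.83}


def approx_bits_per_weight(quant: str, sizes: dict = SIZES_GB) -> float:
    """Estimate bits per weight by scaling against the F32 file (32 bits per weight).

    Assumption: file size is dominated by weights stored in the named type.
    """
    return 32.0 * sizes[quant] / sizes["F32"]


for name in SIZES_GB:
    print(f"{name:5s} ~{approx_bits_per_weight(name):.1f} bits/weight")
```

Under these assumptions BF16 and F16 come out at about 16 bits per weight and Q8_0 at about 8.5, which matches the nominal Q8_0 layout of 32 8-bit values plus one 16-bit scale per block.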