# GLM-OCR ONNX (Decoder)
ONNX export of the **decoder** of [zai-org/GLM-OCR](https://huggingface.co/zai-org/GLM-OCR). Exported with `scripts/export_glm_ocr_onnx.py` (Transformers 5.1.0, custom `torch.onnx` path).
## Contents
- `glm_ocr_decoder.onnx` / `glm_ocr_decoder.onnx.data` – Decoder ONNX (inputs: `decoder_input_ids`, `encoder_hidden_states`; output: `logits`).
- `tokenizer.json`, `tokenizer_config.json` – Tokenizer from zai-org/GLM-OCR.
## Note
The **vision encoder** was not exported: the model's forward pass requires either `input_ids` or `inputs_embeds` and cannot be called with image inputs alone. To run full OCR you need encoder hidden states from another source, or the original PyTorch model for the vision part.
## Usage
Load the decoder with ONNX Runtime, feed it `encoder_hidden_states` (from your own vision encoder, or from zai-org/GLM-OCR in PyTorch) together with `decoder_input_ids`, and decode the returned `logits` with the included tokenizer.
## Source
- Base model: [zai-org/GLM-OCR](https://huggingface.co/zai-org/GLM-OCR)
- Export spec: See [Docs/GLM_OCR_ONNX_Export.md](https://github.com/...) in the TranslateBlue repo.