---
license: mit
pipeline_tag: image-to-text
tags:
  - onnx
  - ocr
  - glm-ocr
base_model: zai-org/GLM-OCR
---

# GLM-OCR ONNX (Decoder)

ONNX export of the **decoder** of [zai-org/GLM-OCR](https://huggingface.co/zai-org/GLM-OCR). Exported with `scripts/export_glm_ocr_onnx.py` (Transformers 5.1.0, custom `torch.onnx` export path).

## Contents

- `glm_ocr_decoder.onnx` / `glm_ocr_decoder.onnx.data` – Decoder ONNX (inputs: `decoder_input_ids`, `encoder_hidden_states`; output: `logits`).
- `tokenizer.json`, `tokenizer_config.json` – Tokenizer from zai-org/GLM-OCR.

## Note

The **vision encoder** was not exported: the model's `forward` requires either `input_ids` or `inputs_embeds` even when called with image inputs only. To run full OCR you therefore need `encoder_hidden_states` from another source, e.g. the vision part of the original PyTorch model.

## Usage

Load the decoder with ONNX Runtime; feed `encoder_hidden_states` (from your vision encoder, or from zai-org/GLM-OCR in PyTorch) together with `decoder_input_ids`, take the `logits` output, and decode the generated ids with the included tokenizer.
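A minimal greedy-decoding sketch of this loop, assuming the input/output names listed above (`decoder_input_ids`, `encoder_hidden_states`, `logits`); the `bos_id`/`eos_id` values and the encoder hidden size are placeholders you should take from the actual tokenizer and model config:

```python
import numpy as np


def greedy_decode(session, encoder_hidden_states, bos_id, eos_id, max_new_tokens=64):
    """Greedily generate token ids from the exported GLM-OCR decoder.

    session: an onnxruntime.InferenceSession (or any object with a
    compatible .run(output_names, input_feed) method).
    encoder_hidden_states: float32 array of shape (1, seq_len, hidden).
    """
    ids = np.array([[bos_id]], dtype=np.int64)
    for _ in range(max_new_tokens):
        (logits,) = session.run(
            ["logits"],
            {
                "decoder_input_ids": ids,
                "encoder_hidden_states": encoder_hidden_states,
            },
        )
        # Pick the most likely next token from the last position.
        next_id = int(np.argmax(logits[0, -1]))
        ids = np.concatenate([ids, np.array([[next_id]], dtype=np.int64)], axis=1)
        if next_id == eos_id:
            break
    return ids[0].tolist()


# Usage (assumed ids and shapes -- check tokenizer_config.json / model config):
# import onnxruntime as ort
# from tokenizers import Tokenizer
# session = ort.InferenceSession("glm_ocr_decoder.onnx")
# tokenizer = Tokenizer.from_file("tokenizer.json")
# enc = ...  # (1, seq_len, hidden) float32 from your vision encoder
# token_ids = greedy_decode(session, enc, bos_id=0, eos_id=2)
# text = tokenizer.decode(token_ids)
```

Since the export does not include past-key-value caching inputs, the full `decoder_input_ids` sequence is re-fed on every step, which is O(n²) in generated length but keeps the interface simple.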

## Source

- Base model: [zai-org/GLM-OCR](https://huggingface.co/zai-org/GLM-OCR)
- Export spec: See [Docs/GLM_OCR_ONNX_Export.md](https://github.com/...) in the TranslateBlue repo.