Table recognition doesn't work well for multiple tables
#28
by OumarDicko - opened
The title says it all.
Ah?
not a bad idea now
this is what i like
@OumarDicko Thanks for your attention.
For documents containing multiple tables, the current plain model output may not perform optimally.
We recommend using the SDK, which includes additional parsing and post-processing to better handle complex layouts (including multiple tables).
You can find the SDK and usage instructions here:
https://github.com/zai-org/GLM-OCR
@iyuge2 Is GLM-OCR capable of table recognition across multiple pages?
I'm looking for cross-page table recognition and merging. If you could provide some examples, it would be really helpful.
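
To make the request concrete, here is a rough sketch of the kind of cross-page merging meant here. It is not a GLM-OCR or SDK feature, and it assumes things the thread does not confirm: that each page's output is available as an HTML string containing `<table>` elements, that tables are not nested, and that a continued table repeats its header row on the next page. The names `TableRows`, `parse_tables`, and `merge_across_pages` are hypothetical and use only the Python standard library.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect every <table> in an HTML string as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def parse_tables(html):
    """Return all non-empty tables found in one page of HTML output."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [table for table in parser.tables if table]


def merge_across_pages(pages_html):
    """Append the first table of a page to the last table seen so far
    when both start with the same header row (assumed heuristic)."""
    merged = []
    for html in pages_html:
        for index, table in enumerate(parse_tables(html)):
            continues_previous = (
                index == 0 and merged and merged[-1][0] == table[0]
            )
            if continues_previous:
                merged[-1].extend(table[1:])  # skip the repeated header
            else:
                merged.append([row[:] for row in table])
    return merged


# Hypothetical per-page outputs, written by hand for illustration.
page1 = "<table><tr><th>Item</th><th>Qty</th></tr><tr><td>A</td><td>1</td></tr></table>"
page2 = (
    "<table><tr><th>Item</th><th>Qty</th></tr><tr><td>B</td><td>2</td></tr></table>"
    "<table><tr><th>Name</th></tr><tr><td>X</td></tr></table>"
)
result = merge_across_pages([page1, page2])
```

A continuation page that does not repeat the header would need a different rule (for example, matching column counts), which this sketch does not attempt.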