Tags: Document Question Answering, Transformers, Safetensors, chatglm, feature-extraction, text-generation-inference, custom_code, 4-bit precision, bitsandbytes
How to use nikravan/glm-4vq with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("document-question-answering", model="nikravan/glm-4vq", trust_remote_code=True)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("nikravan/glm-4vq", trust_remote_code=True, dtype="auto")
```
Pipeline not working (#3)
Opened by lucabot
Hi, when I try to use the model through the pipeline I get this error:
ValueError: Unrecognized configuration class <class 'transformers_modules.nikravan.glm-4vq.e441477369dc88ad0ab225d9cd69db0291e2dc7b.configuration_chatglm.ChatGLMConfig'> for this kind of AutoModel: AutoModelForDocumentQuestionAnswering.
Model type should be one of LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config.
Any idea what it could be?
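The error message itself lists the only config classes that AutoModelForDocumentQuestionAnswering accepts, and ChatGLMConfig is not among them, so the "document-question-answering" pipeline task cannot build this model. A minimal sketch of that check follows. The set of names is copied from the message above and may differ in other Transformers versions. `can_use_docqa_pipeline` and `load_directly` are hypothetical helper names, not part of Transformers. `load_directly` mirrors the AutoModel snippet from the model page and is not called here, since it would download the weights.

```python
# Config classes accepted by AutoModelForDocumentQuestionAnswering,
# copied from the ValueError text (assumption: your Transformers version
# reports the same list).
DOCQA_SUPPORTED_CONFIGS = {"LayoutLMConfig", "LayoutLMv2Config", "LayoutLMv3Config"}


def can_use_docqa_pipeline(config_class_name: str) -> bool:
    """Hypothetical helper: True if the config class is one the error lists."""
    return config_class_name in DOCQA_SUPPORTED_CONFIGS


def load_directly(repo_id: str = "nikravan/glm-4vq"):
    """Bypass the pipeline and load through AutoModel with the repo's custom code.

    transformers is imported lazily so the check above runs without it installed.
    """
    from transformers import AutoModel

    return AutoModel.from_pretrained(repo_id, trust_remote_code=True)


# ChatGLMConfig is the class named in the error, so the pipeline route is ruled out.
print(can_use_docqa_pipeline("ChatGLMConfig"))
print(can_use_docqa_pipeline("LayoutLMv3Config"))
```

If the check returns False, loading through AutoModel as on the model page is the route to try. How to pass an image and a question to the loaded model depends on the repo's custom code.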
Also, is there any way to run this locally, loading the model directly on a GPU with 8 GB of VRAM? I tried setting llm_int8_enable_fp32_cpu_offload=True, but it throws:
ValueError: Blockwise quantization only supports 16/32-bit floats, but got torch.uint8
Thanks in advance.
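Whether the weights fit in 8 GB depends mostly on the parameter count and the bits per weight. A rough back-of-the-envelope sketch follows. The parameter count is a placeholder, not the real size of this model, and the estimate ignores activations, the KV cache, and quantization metadata. The model page tags this checkpoint as 4-bit bitsandbytes, so the 4-bit figure is the relevant one once the real parameter count is plugged in.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1024**3


# PLACEHOLDER parameter count for illustration only; replace it with the
# actual number of parameters of the model you are loading.
example_params = 8e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gib(example_params, bits):.1f} GiB")
```

If the 4-bit estimate is already close to the card's capacity, little room is left for activations, and some form of offloading becomes necessary.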