How to use this model?
#1
by bingw5 - opened
I can see there is example code to run the model, but that's for the original model. Do I need to modify any parameters or lines to run the quantized model?
Ah, that example code is from the original model card.
The example code for running this model is very similar, but it points to this repo. See here: https://huggingface.co/failspy/InternVL-Chat-V1-5-8bit/blob/main/example_inference.py
Running that script runs the quantized model.
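For anyone who wants a rough idea of what "pointed to this repo" means in practice, here is a minimal sketch. It is not the contents of example_inference.py. It only assumes the model loads through the standard Transformers `AutoModel` / `AutoTokenizer` calls with `trust_remote_code=True`, since `internvl_chat` is a custom-code architecture. The helper names `load_kwargs` and `load_model_and_tokenizer` are made up for illustration.

```python
# Sketch only: helper names are hypothetical, not taken from example_inference.py.
REPO_ID = "failspy/InternVL-Chat-V1-5-8bit"


def load_kwargs():
    """Keyword arguments passed to from_pretrained.

    trust_remote_code=True lets Transformers run the custom modeling code
    shipped in the repo, which the internvl_chat architecture relies on.
    """
    return {"trust_remote_code": True}


def load_model_and_tokenizer(repo_id=REPO_ID):
    """Load the model and tokenizer from the quantized repo.

    transformers is imported lazily so this file can be imported without
    downloading anything. Assumption: the 8-bit settings are picked up from
    the repo's own config, so no extra quantization arguments are passed here.
    """
    from transformers import AutoModel, AutoTokenizer

    kwargs = load_kwargs()
    model = AutoModel.from_pretrained(repo_id, **kwargs).eval()
    tokenizer = AutoTokenizer.from_pretrained(repo_id, **kwargs)
    return model, tokenizer


# Usage (downloads the weights; most likely needs a GPU with bitsandbytes installed):
# model, tokenizer = load_model_and_tokenizer()
```

The only change from the original model card's loading code would be the repo ID. The image preprocessing and chat call are left out here, so follow the linked script for those.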
I ran into an error:
TypeError("internvl_chat isn't supported yet.")
Do you know what the root cause is?