Using `low_cpu_mem_usage=True` or a `device_map` requires Accelerate: `pip install accelerate`
I am using the code from the model card in Colab with an A100 GPU.
I ran `pip install accelerate` successfully, but I still get the error message in the subject of this discussion:
Using low_cpu_mem_usage=True or a device_map requires Accelerate: pip install accelerate
I keep getting this error for other models too. Can somebody please help?
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda"
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4v-9b", trust_remote_code=True)
query = 'describe the image'
image = Image.open("/content/drive/MyDrive/car.jpg").convert('RGB')
inputs = tokenizer.apply_chat_template([{"role": "user", "image": image, "content": query}],
                                       add_generation_prompt=True, tokenize=True, return_tensors="pt",
                                       return_dict=True)  # chat mode
inputs = inputs.to(device)
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4v-9b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True
).to(device).eval()
gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
with torch.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0]))
Just run `pip install accelerate` on the command line to install the package.
In Colab, and probably in Jupyter notebooks in general, restarting the kernel after `pip install accelerate` solves the problem, because the restarted session picks up the newly installed package.
https://stackoverflow.com/questions/76902752/importerror-using-low-cpu-mem-usage-true-or-a-device-map-requires-accelerat
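To confirm whether the running kernel can actually see the package before loading the model, a quick check like the one below may help. This is only a sketch using the Python standard library; the helper name `package_status` is made up for illustration and is not part of Transformers or Accelerate.

```python
import importlib.metadata
import importlib.util


def package_status(name: str) -> str:
    """Report whether `name` is importable in the running interpreter.

    Hypothetical helper for illustration; not part of any library.
    """
    # find_spec returns None when a top-level package cannot be located.
    if importlib.util.find_spec(name) is None:
        return f"{name}: not importable in this kernel"
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no installed distribution metadata was found.
        version = "unknown version"
    return f"{name}: importable ({version})"


print(package_status("accelerate"))
```

If this reports that accelerate is importable but `from_pretrained` still raises the error, the session most likely imported transformers before the install, which is consistent with the restart fix described above.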