Problem: query_key_layer_scaling_coeff = float(layer_id + 1)
#99
by Kissacat - opened
Hello, thank you for your work!!
I ran into a problem when running ChatGLM with LoRA:
File "/home//Hongwei/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/8b7d33596d18c5e83e2da052d05ca4db02e60620/modeling_chatglm.py", line 267, in attention_fn
query_key_layer_scaling_coeff = float(layer_id + 1)
RuntimeError: CUDA error: no kernel image is available for execution on the device
I found that this seems to happen because a tensor variable (layer_id) is being added to a non-tensor (1), so I changed layer_id to layer_id.cpu().numpy(). But I do not know why layer_id is a tensor in the first place. Is there something wrong in the code?
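For anyone hitting the same line, here is a minimal sketch of the workaround described above, written so that it accepts either a plain int or a 0-dim tensor. The helper name `scaling_coeff` is hypothetical and is not part of modeling_chatglm.py; it only assumes that a tensor-like layer_id exposes `.item()`, as `torch.Tensor` and NumPy scalars do. The idea is to move the value to a Python number before the addition, so no arithmetic kernel is launched on the GPU:

```python
def scaling_coeff(layer_id):
    """Return float(layer_id + 1) without doing the addition on the device.

    Hypothetical helper for illustration only. `layer_id` may be a plain
    int or a 0-dim tensor-like object that provides `.item()`.
    """
    if hasattr(layer_id, "item"):
        # .item() copies the single value to the host as a Python number,
        # so the "+ 1" below is ordinary Python arithmetic.
        layer_id = layer_id.item()
    return float(int(layer_id) + 1)
```

This only sidesteps the one addition. As far as I know, "no kernel image is available for execution on the device" usually points to a PyTorch build that was not compiled for the GPU's compute capability, so other CUDA operations may still fail until that mismatch is resolved.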
