Can openai_api_server.py be launched on multiple GPUs?
#91
by RoboTerh - opened
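What does your multi-GPU setup look like? If you don't have enough VRAM, I'd suggest launching with the transformers backend, which spreads the load evenly so that each card sits at around 13 GB. Once requests involve a lot of tokens, though, the KV cache will still take up a lot of VRAM.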
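For anyone trying the transformers route, here is a minimal sketch of what that could look like. It assumes `accelerate` is installed so that `device_map="auto"` can shard the weights across the visible GPUs. The helper names (`build_max_memory`, `kv_cache_bytes`, `load_sharded`) are hypothetical, the per-card cap is something you would tune for your own hardware, and the KV cache formula is the generic one for a decoder with cached keys and values, not something taken from this repo's server script.

```python
def build_max_memory(num_gpus: int, per_gpu_gib: int) -> dict:
    """Build a max_memory mapping (GPU index -> cap) for from_pretrained.

    per_gpu_gib is an assumed cap chosen by the caller; leave headroom
    below the physical VRAM because the KV cache grows during generation.
    """
    return {gpu: f"{per_gpu_gib}GiB" for gpu in range(num_gpus)}


def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int = 1,
                   bytes_per_elem: int = 2) -> int:
    """Rough KV cache size: keys and values for every layer and token.

    The architecture numbers must come from the model's own config;
    bytes_per_elem=2 corresponds to fp16/bf16.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


def load_sharded(model_id: str, num_gpus: int, per_gpu_gib: int):
    """Load a causal LM sharded across GPUs (needs transformers + accelerate)."""
    # Imported lazily so the helpers above work without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        trust_remote_code=True,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # let accelerate place layers across GPUs
        max_memory=build_max_memory(num_gpus, per_gpu_gib),
    )
    return tokenizer, model


if __name__ == "__main__":
    # Example: two cards, each capped at an assumed 13 GiB for weights.
    tokenizer, model = load_sharded("zai-org/glm-4-9b-chat", 2, 13)
    print(model.hf_device_map)  # shows which layers landed on which GPU
```

Capping `max_memory` below the physical VRAM is a deliberate choice here: the weights stay fixed, but the cache grows with every token, so the leftover space is what limits how long your requests can get.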
For those asking about API access — I've been using Crazyrouter as a unified gateway. One API key, OpenAI SDK compatible. Works well for testing different models without managing multiple accounts.
