Instructions to use zai-org/GLM-4.5-Air-Base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use zai-org/GLM-4.5-Air-Base with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="zai-org/GLM-4.5-Air-Base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("zai-org/GLM-4.5-Air-Base")
model = AutoModelForCausalLM.from_pretrained("zai-org/GLM-4.5-Air-Base")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use zai-org/GLM-4.5-Air-Base with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "zai-org/GLM-4.5-Air-Base"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "zai-org/GLM-4.5-Air-Base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/zai-org/GLM-4.5-Air-Base
- SGLang
How to use zai-org/GLM-4.5-Air-Base with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "zai-org/GLM-4.5-Air-Base" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "zai-org/GLM-4.5-Air-Base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "zai-org/GLM-4.5-Air-Base" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "zai-org/GLM-4.5-Air-Base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use zai-org/GLM-4.5-Air-Base with Docker Model Runner:
docker model run hf.co/zai-org/GLM-4.5-Air-Base
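The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the code assumes the server returns the usual OpenAI-style completions body, with the generated text under `choices[0].text`.

```python
import json
import urllib.request


def build_completion_request(base_url, prompt,
                             model="zai-org/GLM-4.5-Air-Base",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route.

    The payload mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=60.0, **kwargs):
    """Send the request and return the text of the first completion.

    Assumes an OpenAI-style response: {"choices": [{"text": ...}, ...]}.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example (requires a running server):
# print(complete("http://localhost:8000", "Once upon a time,"))
```

Port 8000 matches the vLLM command and port 30000 matches the SGLang commands, so only the `base_url` argument needs to change between the two servers.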
Improve model card with paper link, system requirements, and sample usage
#1
Opened by nielsr (HF Staff)
This pull request significantly improves the model card for GLM-4.5-Air-Base.
Key improvements include:
- Adding a prominent link to the official Hugging Face paper page (GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models) for better discoverability. The existing ArXiv link remains.
- Integrating detailed "System Requirements" from the official GitHub repository, providing crucial information on inference and fine-tuning configurations.
- Enhancing the "Quick Start" section by:
  - Including general installation instructions.
  - Providing a concrete Python code snippet for text generation using the transformers library.
  - Incorporating detailed setup commands for vLLM and SGLang from the GitHub README, making it easier for users to deploy and interact with the model directly from the Hub page.
These updates aim to provide a more comprehensive and user-friendly experience for researchers and developers on Hugging Face.
The base model is not supported in this way.
Hi, thanks for the reply. Feel free to adapt for the base model :)