Instructions for using BlcaCola/AutoGLM-Phone-9B-GGUF with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- llama-cpp-python
How to use BlcaCola/AutoGLM-Phone-9B-GGUF with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="BlcaCola/AutoGLM-Phone-9B-GGUF",
    filename="AutoGLM-Phone-9B-F16.gguf",
)
```
```python
llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"},
                },
            ],
        }
    ]
)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use BlcaCola/AutoGLM-Phone-9B-GGUF with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
Install from WinGet (Windows)
```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
Use pre-built binary
```sh
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
Use Docker
```sh
docker model run hf.co/BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
- LM Studio
- Jan
- vLLM
How to use BlcaCola/AutoGLM-Phone-9B-GGUF with vLLM:
Install from pip and serve model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "BlcaCola/AutoGLM-Phone-9B-GGUF"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "BlcaCola/AutoGLM-Phone-9B-GGUF",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image in one sentence."},
          {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
        ]
      }
    ]
  }'
```
Use Docker
```sh
docker model run hf.co/BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
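Both llama-server and vLLM expose the same OpenAI-compatible chat-completions endpoint, so they can be queried identically from Python. Below is a minimal sketch; the base URL, port, and prompt are assumptions to adapt to your own setup, and the actual HTTP call is left commented out so the snippet reads without a live server.

```python
# Build an OpenAI-compatible chat-completions request body for one text
# prompt plus one image. The URL and model name below are placeholders.
import json

BASE_URL = "http://localhost:8000/v1"  # vLLM default port; llama-server defaults to 8080
MODEL = "BlcaCola/AutoGLM-Phone-9B-GGUF"

def build_chat_payload(model: str, prompt: str, image_url: str) -> dict:
    """Return the JSON body for a multimodal chat-completions request."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_chat_payload(MODEL, "Describe this screenshot.", "https://example.com/screen.png")
print(json.dumps(payload, indent=2))

# To actually send it (requires a running server and the `requests` package):
# import requests
# r = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=120)
# print(r.json()["choices"][0]["message"]["content"])
```

The same payload works against any of the servers started in this document; only the port changes.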
- Ollama
How to use BlcaCola/AutoGLM-Phone-9B-GGUF with Ollama:
```sh
ollama run hf.co/BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
- Unsloth Studio
How to use BlcaCola/AutoGLM-Phone-9B-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for BlcaCola/AutoGLM-Phone-9B-GGUF to start chatting
```
Install Unsloth Studio (Windows)
```sh
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for BlcaCola/AutoGLM-Phone-9B-GGUF to start chatting
```
Using HuggingFace Spaces for Unsloth
```sh
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for BlcaCola/AutoGLM-Phone-9B-GGUF to start chatting
```
- Docker Model Runner
How to use BlcaCola/AutoGLM-Phone-9B-GGUF with Docker Model Runner:
```sh
docker model run hf.co/BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
- Lemonade
How to use BlcaCola/AutoGLM-Phone-9B-GGUF with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull BlcaCola/AutoGLM-Phone-9B-GGUF:Q4_K_M
```
Run and chat with the model
```sh
lemonade run user.AutoGLM-Phone-9B-GGUF-Q4_K_M
```
List all available models
```sh
lemonade list
```
AutoGLM-Phone-9B GGUF Quantized Model Collection
Congratulations! This is the most complete and fully usable collection of AutoGLM-Phone-9B GGUF quantized versions you can find.
Model Introduction
Phone Agent is a mobile intelligent-assistant framework built on AutoGLM,
capable of understanding smartphone screens through multimodal perception and executing automated operations to complete tasks.
AutoGLM-Phone-9B is a multimodal vision-language model based on GLM-4V-9B, optimized specifically for phone automation scenarios. The model can understand phone screenshots and generate the corresponding operation instructions.
⚠️ Please note! This is a multimodal vision-language model, so in addition to the model itself you also need the mmproj file. Be sure to download that file as well!
Available Quantization Versions
| Quantization Type | Size | Memory Requirement | Notes | Download Link |
|---|---|---|---|---|
| Q2_K | 3.73 GB | ~4 GB | Not recommended | Download |
| Q3_K_S | 4.28 GB | ~5 GB | Not recommended | Download |
| Q3_K_M | 4.63 GB | ~5 GB | Lower quality | Download |
| Q3_K_L | 4.84 GB | ~6 GB | Lower quality | Download |
| Q4_0 | 5.08 GB | ~6 GB | Minimum usable | Download |
| Q4_1 | 5.60 GB | ~6 GB | Fast, recommended | Download |
| Q4_K_S | 5.36 GB | ~6 GB | Fast, recommended | Download |
| Q4_K_M | 5.74 GB | ~7 GB | Most recommended, balanced | Download |
| Q5_0 | 6.11 GB | ~7 GB | Not recommended | Download |
| Q5_1 | 6.62 GB | ~8 GB | Not recommended | Download |
| Q5_K_S | 6.24 GB | ~7 GB | Good quality | Download |
| Q5_K_M | 6.57 GB | ~8 GB | Good quality | Download |
| Q6_K | 7.70 GB | ~9 GB | Very good quality | Download |
| Q8_0 | 9.31 GB | ~11 GB | Fast, best quality | Download |
| F16 | 17.52 GB | ~20 GB | 16 bpw, overkill | Download |
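The file sizes in the table follow from the simple rule that a GGUF file is roughly parameters × bits-per-weight / 8 bytes. As a hedged back-of-the-envelope check (the ~9.4B parameter count below is inferred from the F16 file size, not an official figure):

```python
# Back-of-the-envelope GGUF size estimate: bytes ~= params * bpw / 8.
# PARAMS is inferred from the table's F16 entry (17.52 GB at 16 bits/weight);
# K-quants mix several bit widths, so their real files run somewhat larger
# than this naive single-bpw estimate.
PARAMS = 9.4e9

def estimate_size_gib(params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GiB for a given bits-per-weight."""
    return params * bits_per_weight / 8 / 2**30

# Q8_0 stores 32 weights plus one fp16 scale per block, i.e. 8.5 bpw;
# Q4_0 works out to 4.5 bpw by the same accounting.
for name, bpw in [("F16", 16.0), ("Q8_0", 8.5), ("Q4_0", 4.5)]:
    print(f"{name}: ~{estimate_size_gib(PARAMS, bpw):.2f} GiB")
```

The F16 and Q8_0 estimates land within about 0.1 GiB of the table; the estimate is a sanity check for memory planning, not an exact predictor.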
Quick Start
Using llama.cpp
```sh
# Download the model and the visual projector (mmproj):
wget https://huggingface.co/BlcaCola/AutoGLM-Phone-9B-GGUF/resolve/main/AutoGLM-Phone-9B-Q8_0.gguf
wget https://huggingface.co/BlcaCola/AutoGLM-Phone-9B-GGUF/resolve/main/AutoGLM-Phone-9B-mmproj.gguf

# Start the server:
./llama-server -m AutoGLM-Phone-9B-Q8_0.gguf --mmproj AutoGLM-Phone-9B-mmproj.gguf --host 0.0.0.0 --port 8080
```
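Once the server is running, a phone screenshot can be sent to it inline: OpenAI-compatible multimodal endpoints accept data-URL images, so a local file does not need to be hosted anywhere. A minimal sketch (file name, prompt, and model name are examples, not fixed values):

```python
# Package a local screenshot as a base64 data URL for the chat endpoint.
import base64
import json

def image_to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URL usable in an image_url field."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# In real use: image_bytes = open("screenshot.png", "rb").read()
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16  # stand-in bytes for the demo
payload = {
    "model": "AutoGLM-Phone-9B",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What should I tap to open Settings?"},
            {"type": "image_url", "image_url": {"url": image_to_data_url(fake_png)}},
        ],
    }],
}
print(json.dumps(payload)[:120], "...")
# POST this payload to http://localhost:8080/v1/chat/completions
# (the server started above).
```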
Performance Comparison
Here is a chart by ikawrakow comparing the performance of several quantization levels (below Q5):
Related Resources
- Original project: Open-AutoGLM
- llama.cpp: GitHub
License
This model is licensed under the MIT License. Please refer to the license terms of the original model.
Model tree for BlcaCola/AutoGLM-Phone-9B-GGUF
- Base model: zai-org/GLM-4-9B-0414