Image-Text-to-Text
GGUF
Chinese
English
glm4v
multimodal
vision-language
phone-automation
quantized
conversational
AutoGLM-Phone-9B GGUF Quantized Model Collection
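This is the complete set of GGUF quantizations of the AutoGLM-Phone-9B model, optimized specifically for phone automation tasks.
🎯 Model Overview
AutoGLM-Phone-9B is a multimodal vision-language model based on GLM-4V-9B and optimized for phone automation scenarios. It interprets phone screenshots and generates the corresponding operation instructions.
📦 Available Quantizations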
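| Quantization | File size | Memory required | Recommended use | Download link |
|---|---|---|---|---|
| Q2_K | 3.73 GB | ~4 GB | Extremely memory-constrained environments | Download |
| Q3_K_S | 4.28 GB | ~5 GB | Low-memory devices | Download |
| Q3_K_M | 4.63 GB | ~5 GB | Balance of performance and memory | Download |
| Q3_K_L | 4.84 GB | ~6 GB | Slightly better quality | Download |
| Q4_0 | 5.08 GB | ~6 GB | Legacy 4-bit quantization | Download |
| Q4_1 | 5.60 GB | ~6 GB | Improved 4-bit quantization | Download |
| Q4_K_S | 5.36 GB | ~6 GB | Recommended: small GPUs | Download |
| Q4_K_M | 5.74 GB | ~7 GB | Recommended: balanced ⭐ | Download |
| Q5_0 | 6.11 GB | ~7 GB | Legacy 5-bit quantization | Download |
| Q5_1 | 6.62 GB | ~8 GB | Improved 5-bit quantization | Download |
| Q5_K_S | 6.24 GB | ~7 GB | High quality, small | Download |
| Q5_K_M | 6.57 GB | ~8 GB | High quality, medium | Download |
| Q6_K | 7.70 GB | ~9 GB | Near-original quality | Download |
| Q8_0 | 9.31 GB | ~11 GB | Highest quality | Download |
| F16 | 17.52 GB | ~20 GB | Original precision | Download |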
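To narrow down the table for a given machine, the memory column can be filtered programmatically. The sketch below is only an illustration: the numbers are copied from the table, while the function name `quants_that_fit` and the idea of treating the memory column as a hard limit are assumptions, not something the model card specifies.

```python
from typing import Dict, List, Tuple

# (file size in GB, approximate memory requirement in GB), copied from the table.
QUANTS: Dict[str, Tuple[float, int]] = {
    "Q2_K": (3.73, 4),
    "Q3_K_S": (4.28, 5),
    "Q3_K_M": (4.63, 5),
    "Q3_K_L": (4.84, 6),
    "Q4_0": (5.08, 6),
    "Q4_1": (5.60, 6),
    "Q4_K_S": (5.36, 6),
    "Q4_K_M": (5.74, 7),
    "Q5_0": (6.11, 7),
    "Q5_1": (6.62, 8),
    "Q5_K_S": (6.24, 7),
    "Q5_K_M": (6.57, 8),
    "Q6_K": (7.70, 9),
    "Q8_0": (9.31, 11),
    "F16": (17.52, 20),
}


def quants_that_fit(memory_gb: float) -> List[str]:
    """Return the quantizations whose listed memory requirement fits the budget.

    Hypothetical helper: it treats the table's approximate figure as a hard
    limit and keeps the table's row order.
    """
    return [name for name, (_, needed) in QUANTS.items() if needed <= memory_gb]


# Example: candidates for a machine with 7 GB of free memory.
print(quants_that_fit(7))
```

The helper returns every candidate rather than a single winner, because a larger file does not always mean better output (for example, the legacy and K-quant variants at the same bit width).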
🚀 Quick Start
Using llama.cpp
# Download the model and the vision projector
wget https://huggingface.co/Luckybalabala/AutoGLM-Phone-9B-Q4_K_M.gguf/resolve/main/AutoGLM-Phone-9B-Q4_K_M.gguf
wget https://huggingface.co/Luckybalabala/AutoGLM-Phone-9B-Q4_K_M.gguf/resolve/main/AutoGLM-Phone-9B-mmproj.gguf
# Start the server
./llama-server -m AutoGLM-Phone-9B-Q4_K_M.gguf --mmproj AutoGLM-Phone-9B-mmproj.gguf --host 0.0.0.0 --port 8080
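Once the server is running, it can be queried through its OpenAI-compatible chat endpoint. The following is a minimal sketch using only the Python standard library. The port comes from the command above; the PNG media type, the `max_tokens` value, the file name, and the instruction text are assumptions chosen for illustration, and the prompt format the model actually expects for phone automation is not documented here.

```python
import base64
import json
import urllib.request

# Port 8080 matches the llama-server command above.
SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_payload(image_bytes: bytes, instruction: str) -> dict:
    """Build an OpenAI-style chat request with one text part and one image part.

    The image is embedded as a base64 data URL; PNG is assumed.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {
                        "type": "image_url",
                        "image_url": {"url": "data:image/png;base64," + encoded},
                    },
                ],
            }
        ],
        "max_tokens": 256,  # illustrative limit, not a recommended setting
    }


def ask(image_path: str, instruction: str) -> str:
    """Send a screenshot and an instruction to the local server, return the reply text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), instruction)
    request = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires the server to be running and a local screenshot file):
#   print(ask("screenshot.png", "Open the Settings app."))
```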
Integrating with Open-AutoGLM
# Clone the Open-AutoGLM project
git clone https://github.com/OpenBMB/AutoGLM.git
cd AutoGLM
# Configure the model API
python main.py --base-url http://localhost:8080/v1
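Before pointing the project at the base URL, it can help to confirm that the server answers there. This sketch assumes the server exposes the usual OpenAI-style `GET /models` route under the base URL and returns a `data` list of entries with an `id` field; the helper names `models_url` and `list_models` are hypothetical.

```python
import json
import urllib.request
from typing import List


def models_url(base_url: str) -> str:
    """Join the base URL and the models route, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/models"


def list_models(base_url: str = "http://localhost:8080/v1") -> List[str]:
    """Return the model ids reported by the server at base_url.

    Raises urllib.error.URLError if nothing is listening at that address.
    """
    with urllib.request.urlopen(models_url(base_url), timeout=10) as response:
        body = json.load(response)
    return [entry["id"] for entry in body.get("data", [])]


# Usage (requires the server to be running):
#   print(list_models("http://localhost:8080/v1"))
```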
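💻 System Requirements
Recommended configurations
- 8 GB GPU: Q4_K_M or Q5_K_S
- 12 GB GPU: Q5_K_M or Q6_K
- 16 GB+ GPU: Q8_0, or F16 where about 20 GB of memory is available
- CPU inference: Q4_K_M or lower
Minimum requirements
- Operating system: Windows 10/11, Linux, macOS
- Memory: 8 GB+ RAM
- Storage: depends on the quantization you choose
🔧 Technical Details
- Base model: THUDM/glm-4v-9b
- Quantization tool: llama.cpp quantize
- Supported format: GGUF
- Multimodal: image + text input
- API: OpenAI-compatible interface
📊 Performance Comparison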
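| Quantization | Inference speed | Memory usage | Quality retained | Recommended scenario |
|---|---|---|---|---|
| Q2_K | Fastest | Lowest | 70% | Resource-constrained |
| Q4_K_M | Fast | Medium | 85% | Balanced (recommended) |
| Q6_K | Medium | Higher | 95% | High quality requirements |
| Q8_0 | Slower | High | 98% | Best quality |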
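🔗 Related Resources
- Original project: Open-AutoGLM
- llama.cpp: GitHub
- Demo video: Bilibili demo
- Technical blog: AutoGLM integration guide
📝 License
This model is released under the Apache 2.0 license. Please also review the license terms of the original model.
⚠️ Notes
- Intended use: built specifically for phone automation tasks; results on other tasks may be poor
- Safety: test in a controlled environment and avoid running it directly on important devices
- Performance differences: speed and quality vary across quantization levels; choose according to your needs
- Update cadence: the model is updated as the Open-AutoGLM project is updated
🤝 Contributing
Issues and suggestions for improving this model collection are welcome.
Tags: GLM-4V, multimodal, phone automation, GGUF, quantized models, llama.cpp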
Model tree for Luckybalabala/AutoGLM-Phone-9B-GGUF
Base model
zai-org/glm-4v-9b