Use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

# Note: this is a vision-language model; image input also requires the separate
# mmproj file (see the warning below).
llm = Llama.from_pretrained(
	repo_id="BlcaCola/AutoGLM-Phone-9B-GGUF",
	filename="AutoGLM-Phone-9B-Q8_0.gguf",  # any quant from the table below; Q8_0 matches the llama.cpp example
)
response = llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": [
				{
					"type": "text",
					"text": "Describe this image in one sentence."
				},
				{
					"type": "image_url",
					"image_url": {
						"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
					}
				}
			]
		}
	]
)
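When streaming is disabled (the default), create_chat_completion returns an OpenAI-style response dict, so the reply text can be read from the first choice:

# Read the generated text out of the OpenAI-style response dict
print(response["choices"][0]["message"]["content"])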

AutoGLM-Phone-9B GGUF Quantized Model Collection

Congratulations! This is the most complete and fully usable collection of GGUF quantized versions of AutoGLM-Phone-9B that you can find. 🎉🎉🎉

Model Introduction

AutoGLM-Phone-9B is a multimodal vision-language model based on GLM-4V-9B, optimized specifically for phone-automation scenarios: it understands phone screenshots and generates the corresponding operation instructions. Phone Agent is a mobile intelligent-assistant framework built on AutoGLM, capable of understanding smartphone screens through multimodal perception and executing automated operations to complete tasks.

โš ๏ธPlease note! This is a multimodal vision language model, so in addition to the model itself, you also need the mmproj file. Please be sure to download this file for use!
โš ๏ธ่ฏทๆณจๆ„๏ผ่ฟ™ๆ˜ฏๅคšๆจกๆ€่ง†่ง‰่ฏญ่จ€ๆจกๅž‹๏ผŒๆ‰€ไปฅ้™คไบ†ๆจกๅž‹ๆœฌ่บซ๏ผŒไฝ ่ฟ˜้œ€่ฆmmprojๆ–‡ไปถ๏ผŒ่ฏทๅŠกๅฟ…ไธ‹่ฝฝ่ฟ™ไธชๆ–‡ไปถไธ€่ตทไฝฟ็”จ๏ผ

Available Quantization Versions

| Quantization Type | Size | Memory Requirement | Notes |
| --- | --- | --- | --- |
| Q2_K | 3.73 GB | ~4 GB | Not recommended |
| Q3_K_S | 4.28 GB | ~5 GB | Not recommended |
| Q3_K_M | 4.63 GB | ~5 GB | Lower quality |
| Q3_K_L | 4.84 GB | ~6 GB | Lower quality |
| Q4_0 | 5.08 GB | ~6 GB | Lowest usable |
| Q4_1 | 5.60 GB | ~6 GB | Fast, recommended |
| Q4_K_S | 5.36 GB | ~6 GB | Fast, recommended |
| Q4_K_M | 5.74 GB | ~7 GB | ⭐ Most recommended, balanced |
| Q5_0 | 6.11 GB | ~7 GB | Not recommended |
| Q5_1 | 6.62 GB | ~8 GB | Not recommended |
| Q5_K_S | 6.24 GB | ~7 GB | Good quality |
| Q5_K_M | 6.57 GB | ~8 GB | Good quality |
| Q6_K | 7.70 GB | ~9 GB | Very good quality |
| Q8_0 | 9.31 GB | ~11 GB | ⭐ Fast, best quality |
| F16 | 17.52 GB | ~20 GB | 16 bpw, overkill |
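To fetch a quant programmatically, one option is the huggingface_hub client. The sketch below assumes the Q4_K_M file follows the same AutoGLM-Phone-9B-<quant>.gguf naming pattern as the Q8_0 file used in the llama.cpp example; adjust the filename to the quant you picked from the table.

# Minimal download sketch using huggingface_hub (pip install huggingface_hub)
from huggingface_hub import hf_hub_download

repo_id = "BlcaCola/AutoGLM-Phone-9B-GGUF"

# Main model weights (Q4_K_M filename assumed from the AutoGLM-Phone-9B-<quant>.gguf pattern)
model_path = hf_hub_download(repo_id=repo_id, filename="AutoGLM-Phone-9B-Q4_K_M.gguf")

# Visual projector, required for image input
mmproj_path = hf_hub_download(repo_id=repo_id, filename="AutoGLM-Phone-9B-mmproj.gguf")

print(model_path, mmproj_path)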

Quick Start

Using llama.cpp

# Download the model and the visual projector (mmproj)
wget https://huggingface.co/BlcaCola/AutoGLM-Phone-9B-GGUF/resolve/main/AutoGLM-Phone-9B-Q8_0.gguf
wget https://huggingface.co/BlcaCola/AutoGLM-Phone-9B-GGUF/resolve/main/AutoGLM-Phone-9B-mmproj.gguf

# Start the server
./llama-server -m AutoGLM-Phone-9B-Q8_0.gguf --mmproj AutoGLM-Phone-9B-mmproj.gguf --host 0.0.0.0 --port 8080
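Once the server is running it exposes an OpenAI-compatible /v1/chat/completions endpoint on the port chosen above. A minimal query sketch in Python (text-only here; image input is passed as an image_url content part, typically a base64 data URI, and support depends on your llama.cpp build):

# Minimal sketch: query the running llama-server via its OpenAI-compatible API
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Describe what you can do as a phone agent in one sentence."}
        ],
        "max_tokens": 128,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])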

Performance Comparison

Here is a chart by ikawrakow comparing the performance of several quantization types (below Q5):

(Chart: ikawrakow's quantization performance comparison)

Related Resources

License Agreement

This model is licensed under the MIT License. Please refer to the license terms of the original model.

