Qwen2.5-1.5B-GSM8K-zh-GGUF

This repository contains GGUF-format model files for Qwen2.5-1.5B fine-tuned on the Chinese GSM8K dataset for mathematical reasoning.

The model has been fine-tuned to solve math word problems with step-by-step reasoning in Chinese.

Available Files

| Name | Quant Method | Size | Description |
|------|--------------|------|-------------|
| model-f16.gguf | None (F16) | 2.88 GB | 🟢 Full precision, maximum accuracy. |
| model-q8_0.gguf | Q8_0 | ~1.60 GB | 🔵 Near-lossless; recommended for high-end use. |
| model-q4_k_m.gguf | Q4_K_M | ~1.00 GB | 🟣 Recommended. Best balance of size, speed, and reasoning quality. |

Usage (with llama.cpp)

You can use these files with any GGUF-compatible runner (Ollama, LM Studio, llama.cpp).
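For instance, to run the Q4_K_M file with Ollama, a minimal Modelfile sketch might look like the following (assuming the GGUF file sits in the current directory; the stop parameter matches Qwen2.5's ChatML end-of-turn token):

```
# Hypothetical Modelfile — adjust the path to wherever you downloaded the file.
FROM ./model-q4_k_m.gguf
PARAMETER stop "<|im_end|>"
```

Then create and run the model with `ollama create qwen2.5-gsm8k-zh -f Modelfile` followed by `ollama run qwen2.5-gsm8k-zh`.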

./llama-cli -m model-q4_k_m.gguf -p "<|im_start|>user\n小明有5个苹果,吃了2个,还剩几个?<|im_end|>\n<|im_start|>assistant\n"

The example prompt asks: "Xiao Ming has 5 apples and eats 2. How many are left?"
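The command above hand-writes Qwen2.5's ChatML chat template into the prompt string. If you are calling the model from your own code, a small helper keeps that formatting in one place. A minimal Python sketch (the `chatml_prompt` helper is ours, not part of any library):

```python
def chatml_prompt(user_message, system_message=None):
    """Build a Qwen2.5 ChatML prompt string, ending at the open
    assistant turn so the model generates the answer next."""
    parts = []
    if system_message:
        parts.append(f"<|im_start|>system\n{system_message}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>\n")
    # Leave the assistant turn open; generation stops at <|im_end|>.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = chatml_prompt("小明有5个苹果,吃了2个,还剩几个?")
print(prompt)
```

Pass the returned string as the prompt to any GGUF runner, and set `<|im_end|>` as a stop token so generation halts at the end of the assistant's turn.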