Qwen2.5-1.5B-GSM8K-zh-GGUF

This repository contains GGUF-format model files for Qwen2.5-1.5B fine-tuned on the Chinese GSM8K dataset for mathematical reasoning.

The model has been fine-tuned to solve math word problems with step-by-step reasoning in Chinese.

Available Files

| Name | Quant Method | Size | Description |
|------|--------------|------|-------------|
| model-f16.gguf | None (F16) | 2.88 GB | 🟢 Full precision, maximum accuracy. |
| model-q8_0.gguf | Q8_0 | ~1.60 GB | 🔵 Near-lossless; recommended for high-end use. |
| model-q4_k_m.gguf | Q4_K_M | ~1.00 GB | 🟣 Recommended. Best balance of size, speed, and reasoning quality. |

Usage (with llama.cpp)

You can use these files with any GGUF-compatible runner (Ollama, LM Studio, llama.cpp).
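For instance, to run the Q4_K_M file with Ollama, a minimal Modelfile sketch might look like the following (assuming the GGUF file sits in the current directory; the stop parameter matches Qwen2.5's ChatML end-of-turn token):

```
# Hypothetical Modelfile — adjust the path to wherever you downloaded the file.
FROM ./model-q4_k_m.gguf
PARAMETER stop "<|im_end|>"
```

Then create and run the model with `ollama create qwen2.5-gsm8k-zh -f Modelfile` followed by `ollama run qwen2.5-gsm8k-zh`.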

./llama-cli -m model-q4_k_m.gguf -p "<|im_start|>user\n小明有5个苹果,吃了2个,还剩几个?<|im_end|>\n<|im_start|>assistant\n"

The example prompt asks: "Xiao Ming has 5 apples and eats 2. How many are left?"
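The command above hand-writes Qwen2.5's ChatML chat template into the prompt string. If you are calling the model from your own code, a small helper keeps that formatting in one place. A minimal Python sketch (the `chatml_prompt` helper is ours, not part of any library):

```python
def chatml_prompt(user_message, system_message=None):
    """Build a Qwen2.5 ChatML prompt string, ending at the open
    assistant turn so the model generates the answer next."""
    parts = []
    if system_message:
        parts.append(f"<|im_start|>system\n{system_message}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>\n")
    # Leave the assistant turn open; generation stops at <|im_end|>.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = chatml_prompt("小明有5个苹果,吃了2个,还剩几个?")
print(prompt)
```

Pass the returned string as the prompt to any GGUF runner, and set `<|im_end|>` as a stop token so generation halts at the end of the assistant's turn.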