Use With llama-cpp-python
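# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (cached after the first run) and load it.
llm = Llama.from_pretrained(
	repo_id="edgemindroboticslabs/MiniCPM5-1B-GGUF",
	filename="minicpm5-1b-q4_k_m.gguf",
)

# Run a single-turn chat completion and keep the result instead of discarding it.
response = llm.create_chat_completion(
	messages=[
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

# The reply text is in the first choice's message.
print(response["choices"][0]["message"]["content"])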

MiniCPM5 1B GGUF

This repository contains a Q4_K_M GGUF quantization of openbmb/MiniCPM5-1B.

MiniCPM5-1B is a small text-generation model aimed at on-device use and tagged for long context and tool calling. This quantized release is intended for local inference with llama.cpp-compatible runtimes such as LM Studio.

Files

File | Quantization | Approx. size | Use case
minicpm5-1b-q4_k_m.gguf | Q4_K_M | ~656 MiB | Good default for Apple Silicon and local CPU inference
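
As a rough sanity check on the file size, the average number of bits stored per weight can be estimated from the ~656 MiB figure above and the ~1.08B parameter count listed under Model Details. This is only a sketch: both inputs are approximate, and the file also holds metadata and tokenizer data, so the result slightly overstates the true per-weight cost. The helper name `bits_per_weight` is illustrative.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def bits_per_weight(file_size_mib: float, n_params: float) -> float:
    """Estimate the average bits stored per parameter from the file size."""
    return file_size_mib * MIB * 8 / n_params


# Approximate values taken from this model card.
estimate = bits_per_weight(656, 1.08e9)
print(f"{estimate:.2f} bits per weight")
```

With these inputs the estimate comes out at roughly 5.1 bits per weight. A value somewhat above 4 is expected, since Q4_K_M keeps some tensors at higher precision than 4 bits.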

Model Details

  • Base model: openbmb/MiniCPM5-1B
  • Architecture: Llama-compatible
  • Parameters: ~1.08B
  • Context length in GGUF metadata: 131072
  • Languages: English, Chinese
  • License: Apache-2.0
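
The 131072-token context length in the metadata is an upper limit, not a requirement; requesting a smaller window reduces KV-cache memory. The following is a minimal sketch, assuming llama-cpp-python, where `Llama.from_pretrained` forwards extra keyword arguments such as `n_ctx` to the `Llama` constructor. The helper names `choose_n_ctx` and `load_model` and the 8192 default are illustrative choices, not part of the library or this repository.

```python
MAX_CTX = 131072  # context length reported in the GGUF metadata


def choose_n_ctx(requested: int, max_ctx: int = MAX_CTX) -> int:
    """Clamp a requested context window to what the model metadata allows."""
    if requested <= 0:
        raise ValueError("requested context must be positive")
    return min(requested, max_ctx)


def load_model(requested_ctx: int = 8192):
    """Download (if needed) and load the GGUF with a bounded context window."""
    # Imported lazily so the helper above works without llama-cpp-python installed.
    from llama_cpp import Llama

    return Llama.from_pretrained(
        repo_id="edgemindroboticslabs/MiniCPM5-1B-GGUF",
        filename="minicpm5-1b-q4_k_m.gguf",
        n_ctx=choose_n_ctx(requested_ctx),
    )
```

Calling `load_model(32768)` would then load the model with a 32768-token window; how large a window fits comfortably depends on the available memory.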

Use With llama.cpp

llama-cli \
  -m minicpm5-1b-q4_k_m.gguf \
  -p "<|im_start|>user\nWrite a small Python function that validates an email address.<|im_end|>\n<|im_start|>assistant\n" \
  -n 200 \
  --temp 0.7
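
The `-p` string above follows a ChatML-style layout (`<|im_start|>role`, the content, then `<|im_end|>`). A minimal sketch of assembling such a prompt from a message list is shown here, assuming this layout matches the chat template stored in the GGUF; the function name `build_prompt` is illustrative, and the template embedded in the model file remains the authoritative source.

```python
def build_prompt(messages: list[dict[str, str]]) -> str:
    """Render messages in the ChatML-style layout used in the llama-cli example."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_prompt(
    [
        {
            "role": "user",
            "content": "Write a small Python function that validates an email address.",
        }
    ]
)
print(prompt)
```

The resulting string contains real newlines, so it can be passed to a runtime directly without relying on escape-sequence processing.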

Use With LM Studio

Download minicpm5-1b-q4_k_m.gguf and import it as a local GGUF model.

Recommended hardware label:

  • Apple Silicon: Apple M1 Pro or newer
  • Unified memory: 16 GB works for this Q4_K_M file

Conversion

Converted locally with llama.cpp:

hf download openbmb/MiniCPM5-1B --local-dir work/MiniCPM5-1B

uv run --python /opt/homebrew/bin/python3.11 \
  --with-requirements work/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt \
  work/llama.cpp/convert_hf_to_gguf.py \
  work/MiniCPM5-1B \
  --outfile outputs/minicpm5-1b-f16.gguf \
  --outtype f16

work/llama.cpp/build-gguf2/bin/llama-quantize \
  outputs/minicpm5-1b-f16.gguf \
  outputs/minicpm5-1b-q4_k_m.gguf \
  Q4_K_M
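
After converting and quantizing, a quick header check can confirm that the output is a well-formed GGUF file before uploading it. The sketch below reads the fixed-size header using only the standard library. It assumes the GGUF version 2 or later layout (4-byte magic `GGUF`, little-endian uint32 version, then uint64 tensor and metadata counts); the function name `read_gguf_header` is illustrative and not part of llama.cpp.

```python
import struct

HEADER_FORMAT = "<4sIQQ"  # magic, version, tensor count, metadata key/value count
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def read_gguf_header(path: str) -> dict:
    """Read and validate the fixed-size header of a GGUF file."""
    with open(path, "rb") as f:
        raw = f.read(HEADER_SIZE)
    if len(raw) < HEADER_SIZE:
        raise ValueError("file is too short to be a GGUF file")
    magic, version, tensor_count, kv_count = struct.unpack(HEADER_FORMAT, raw)
    if magic != b"GGUF":
        raise ValueError(f"unexpected magic bytes: {magic!r}")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

For example, `read_gguf_header("outputs/minicpm5-1b-q4_k_m.gguf")` should return the same tensor count for the F16 and Q4_K_M files, since quantization changes tensor encodings rather than the number of tensors.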

Notes

This is a quantized distribution of the upstream model, not a new fine-tune. Quality and behavior are inherited from openbmb/MiniCPM5-1B.
