MiniCPM5-1B-mobile

✅ WORKS - Verified June 2026.

Verification Results

| Prompt | Response | Correct? |
| --- | --- | --- |
| The capital of France is | "the city of Paris, which is located in the Île-de-Cracome, a" | ✅ |

Note

Prompt this model with raw text completion; it has no chat template. It works best when the prompt is text for the model to continue.
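Since there is no chat template, a question tends to work better when it is written as text whose natural continuation is the answer. The snippet below is a minimal sketch of that idea. `build_completion_prompt` is a hypothetical helper (not part of this model or of llama-cpp-python), and the `Q:`/`A:` layout is an assumed convention, not one this model was verified with.

```python
from typing import List, Tuple


def build_completion_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format solved examples plus an open query as one plain-text prompt.

    Hypothetical helper: the Q:/A: layout is an assumption for illustration.
    """
    # Each solved example becomes a completed question/answer pair.
    blocks = [f"Q: {question}\nA: {answer}" for question, answer in examples]
    # The final block ends at "A:" so the model continues with the answer.
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


prompt = build_completion_prompt(
    [("What is the capital of Italy?", "Rome")],
    "What is the capital of France?",
)
print(prompt)
```

The resulting string can be passed to the model in place of a chat message list. Setting a stop sequence such as a blank line is a common way to keep the continuation to a single answer.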

Model Details

| Attribute | Value |
| --- | --- |
| Base Model | openbmb/MiniCPM3-4B |
| File Size | 656 MB |
| Format | GGUF |
| Chat Format | Raw completion (no chat template) |
| CPU Speed | 18.1 tokens/sec |
| License | apache-2.0 |

Usage

from llama_cpp import Llama

llm = Llama(model_path="model.gguf", n_ctx=512, n_threads=4, verbose=False)
response = llm("The capital of France is", max_tokens=30, echo=False)
print(response["choices"][0]["text"])
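To compare your own hardware against the CPU speed listed under Model Details, generation throughput can be estimated from the `usage` field of the completion response. This is a rough sketch, not the method used to produce the published figure. `tokens_per_second` is a hypothetical helper, and it assumes the callable returns an OpenAI-style dict with `usage.completion_tokens`, as llama-cpp-python completions do.

```python
import time
from typing import Any, Callable, Dict


def tokens_per_second(
    generate: Callable[..., Dict[str, Any]],
    prompt: str,
    max_tokens: int = 30,
) -> float:
    """Time one completion call and return generated tokens per second.

    Hypothetical helper: `generate` is any callable with the same call shape
    as a `Llama` instance (prompt, max_tokens=..., echo=...).
    """
    start = time.perf_counter()
    response = generate(prompt, max_tokens=max_tokens, echo=False)
    elapsed = time.perf_counter() - start
    # Count only generated tokens; the prompt tokens are excluded.
    generated = response["usage"]["completion_tokens"]
    return generated / elapsed if elapsed > 0 else 0.0


# Example with the `llm` object from the usage snippet (requires the model file):
# rate = tokens_per_second(llm, "The capital of France is")
# print(f"{rate:.1f} tokens/sec")
```

The measured time includes prompt processing, so short prompts and a larger `max_tokens` give a figure closer to steady-state generation speed.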

🚀 dispatchAI
