MiniCPM5-1B (LiteRT)

⏳ Upcoming — The LiteRT (.tflite) build of MiniCPM5-1B is on the way. Model weights are not available in this repository yet. Please follow this repo to be notified when they land.

This repository will host the LiteRT (formerly TensorFlow Lite) version of MiniCPM5-1B, optimized for fully on-device inference on mobile and edge hardware.

What is MiniCPM5-1B?

MiniCPM5-1B is the first model in the MiniCPM5 series from OpenBMB. It is a dense 1B-parameter Transformer built specifically for on-device, local, and resource-constrained deployment, and it reaches open-source SOTA in the 1B size class.

Highlights

🏆 1B-class open-source SOTA — strongest in tool use, code generation, and difficult reasoning among comparable open models.
🧠 Hybrid Reasoning — a single checkpoint serves as both a fast assistant and a deliberate reasoner via a built-in <think> template (enable_thinking).
📏 Long context — native 131,072-token context length.
📱 Built for the edge — compact footprint designed for local assistants, coding agents, and tool-use workflows.
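
Code that consumes hybrid-reasoning output typically needs to separate the reasoning trace from the final answer. The following is a minimal sketch, not part of this repository: it assumes the model emits its reasoning once, wrapped in `<think>...</think>`, before the answer. `split_thinking` is a hypothetical helper name, and the exact output format should be checked against the upstream chat template.

```python
import re

# Assumption: reasoning, when present, appears as one <think>...</think> block.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" if no <think> block is found."""
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_thinking("<think>2+2 is 4</think>\nThe answer is 4.")
print(reasoning)
print(answer)
```

With thinking disabled, the same helper returns an empty reasoning string and the untouched answer, so callers can use one code path for both modes.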

Model Information

Item	Value
Type	Causal Language Model
Architecture	Standard `LlamaForCausalLM`
Parameters	1,080,632,832 (~1B)
Non-Embedding Parameters	679,552,512
Layers	24
Attention Heads (GQA)	16 (Q) / 2 (KV)
Context Length	131,072
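
The figures in the table allow some quick arithmetic that is useful when budgeting memory on a device. The sketch below derives the embedding parameter count and the GQA sharing ratio from the table, and estimates KV-cache size. The head dimension and cache precision are not stated in this card, so `head_dim` and `bytes_per_value` are left as caller-supplied assumptions rather than model facts.

```python
# Values copied from the Model Information table.
TOTAL_PARAMS = 1_080_632_832
NON_EMBEDDING_PARAMS = 679_552_512
NUM_LAYERS = 24
NUM_Q_HEADS = 16
NUM_KV_HEADS = 2
CONTEXT_LENGTH = 131_072

# Parameters that sit in the embedding tables.
embedding_params = TOTAL_PARAMS - NON_EMBEDDING_PARAMS

# With GQA, each KV head is shared by this many query heads.
q_heads_per_kv_head = NUM_Q_HEADS // NUM_KV_HEADS


def kv_cache_bytes(seq_len: int, head_dim: int, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size: keys and values for every layer and KV head.

    head_dim and bytes_per_value are assumptions supplied by the caller;
    this card does not state them.
    """
    return 2 * NUM_LAYERS * NUM_KV_HEADS * head_dim * seq_len * bytes_per_value


print(f"embedding parameters: {embedding_params:,}")
print(f"query heads per KV head: {q_heads_per_kv_head}")
# head_dim=128 is a hypothetical value used only for illustration.
print(f"KV cache at full context: {kv_cache_bytes(CONTEXT_LENGTH, head_dim=128):,} bytes")
```

Because only 2 of the 16 heads carry keys and values, the cache is one eighth the size it would be with a separate KV head per query head, which matters most at the full 131,072-token context.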

License

Released under the Apache-2.0 License, consistent with the upstream openbmb/MiniCPM5-1B.

Citation

@article{minicpm4,
  title={MiniCPM4: Ultra-efficient LLMs on end devices},
  author={MiniCPM, Team},
  journal={arXiv preprint arXiv:2506.07900},
  year={2025}
}
