MiniCPM5-1B (LiteRT)

⏳ Upcoming: The LiteRT (.tflite) build of MiniCPM5-1B is on the way. Model weights are not yet available in this repository. Follow this repo to be notified when they land.

This repository will host the LiteRT (formerly TensorFlow Lite) version of MiniCPM5-1B, optimized for fully on-device inference on mobile and edge hardware.


What is MiniCPM?

MiniCPM5-1B is the first model in the MiniCPM5 series from OpenBMB. It is a dense 1B-parameter Transformer built for on-device, local, and resource-constrained deployment, and it reaches open-source SOTA in the 1B size class.

Highlights

  • 🏆 1B-class open-source SOTA: strongest in tool use, code generation, and difficult reasoning among comparable open models.
  • 🧠 Hybrid Reasoning: a single checkpoint serves as both a fast assistant and a deliberate reasoner via a built-in <think> template (enable_thinking).
  • 📏 Long context: native 131,072-token context length.
  • 📱 Built for the edge: compact footprint designed for local assistants, coding agents, and tool-use workflows.
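When thinking mode is on, downstream code usually needs to separate the model's reasoning from its final answer. The sketch below shows one way to do that on a raw completion string. It assumes the reasoning arrives wrapped in a single `<think>...</think>` block ahead of the answer; this card names the `<think>` template but does not show the exact output format, so treat the delimiters and the helper name `split_thinking` as illustrative rather than as part of the model's API.

```python
import re

# Assumption: reasoning is wrapped in one <think>...</think> block.
# The exact format is defined by the model's chat template, not by this sketch.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw completion.

    If no <think> block is present (for example, thinking was disabled),
    the reasoning is an empty string and the whole text is the answer.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
    reasoning, answer = split_thinking(raw)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Returning an empty reasoning string when no block is found lets the same call site handle both the fast-assistant and the deliberate-reasoner modes.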

Model Information

| Item | Value |
|---|---|
| Type | Causal Language Model |
| Architecture | Standard LlamaForCausalLM |
| Parameters | 1,080,632,832 (~1B) |
| Non-Embedding Parameters | 679,552,512 |
| Layers | 24 |
| Attention Heads (GQA) | 16 (Q) / 2 (KV) |
| Context Length | 131,072 |
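The figures above are enough for two back-of-the-envelope estimates relevant to on-device use: how many parameters sit in the embeddings, and roughly how large the KV cache gets at full context. The sketch below works through both. The head dimension (128) and the 2-byte cache precision are assumptions that this card does not state, so the resulting sizes are illustrative only; the ratio between the GQA and full multi-head cases depends only on the head counts in the table.

```python
# Figures taken from the Model Information table.
TOTAL_PARAMS = 1_080_632_832
NON_EMBEDDING_PARAMS = 679_552_512
NUM_LAYERS = 24
NUM_Q_HEADS = 16
NUM_KV_HEADS = 2
CONTEXT_LENGTH = 131_072

# Assumptions (NOT stated in this card): per-head dimension and cache precision.
ASSUMED_HEAD_DIM = 128
ASSUMED_BYTES_PER_VALUE = 2  # e.g. fp16 or bf16


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value):
    """Size of the KV cache for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Parameters held in embedding matrices.
embedding_params = TOTAL_PARAMS - NON_EMBEDDING_PARAMS

# Cache size with the model's 2 KV heads (GQA).
gqa_bytes = kv_cache_bytes(
    NUM_LAYERS, NUM_KV_HEADS, ASSUMED_HEAD_DIM, CONTEXT_LENGTH, ASSUMED_BYTES_PER_VALUE
)

# Hypothetical cache size if every query head had its own KV head.
mha_bytes = kv_cache_bytes(
    NUM_LAYERS, NUM_Q_HEADS, ASSUMED_HEAD_DIM, CONTEXT_LENGTH, ASSUMED_BYTES_PER_VALUE
)

GIB = 1024 ** 3
print(f"embedding parameters: {embedding_params:,}")
print(f"KV cache at full context (GQA, assumed dims): {gqa_bytes / GIB:.1f} GiB")
print(f"KV cache at full context (16 KV heads, hypothetical): {mha_bytes / GIB:.1f} GiB")
print(f"GQA reduction factor: {mha_bytes // gqa_bytes}x")
```

Under these assumptions, sharing 2 KV heads across 16 query heads cuts the cache by a factor of 8, which matters most at long context lengths on memory-limited devices.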

Links


License

Released under the Apache-2.0 License, consistent with the upstream openbmb/MiniCPM5-1B.

Citation

@article{minicpm4,
  title={MiniCPM4: Ultra-efficient LLMs on end devices},
  author={MiniCPM, Team},
  journal={arXiv preprint arXiv:2506.07900},
  year={2025}
}