Abiray/BitCPM4-CANN-0.5B-GGUF

This repository contains GGUF quantizations of the openbmb/BitCPM4-CANN-0.5B model for local inference with llama.cpp, text-generation-webui, LM Studio, Ollama, and other GGUF-compatible backends.

Model Information

Available Files & Hardware Compatibility

The following quantization formats are available. Because this is a compact model of roughly 0.5B parameters, even the largest file is under 1 GB, so it runs quickly on most modern devices, including older smartphones and edge computing hardware.

| Filename | Quant Type | File Size | Description |
|---|---|---|---|
| BitCPM4-CANN-0.5B-F16.gguf | 16-bit | 870 MB | Unquantized weights in 16-bit floating point (half precision). Highest fidelity of the files offered here. |
| BitCPM4-CANN-0.5B-Q8_0.gguf | 8-bit | 463 MB | Near-lossless. Roughly half the size of F16, with output that is typically very close to it. |
| BitCPM4-CANN-0.5B-Q6_K.gguf | 6-bit | 358 MB | Good option for low-resource edge devices that still need strong accuracy retention. |
| BitCPM4-CANN-0.5B-Q5_K_M.gguf | 5-bit | 317 MB | Middle ground between speed, size, and retained reasoning capability. |
| BitCPM4-CANN-0.5B-Q5_K_S.gguf | 5-bit | 310 MB | Slightly more aggressive 5-bit variant that trades some quality for a smaller footprint. |
| BitCPM4-CANN-0.5B-Q4_K_M.gguf | 4-bit | 279 MB | Recommended. A good balance for local 4-bit inference that stays coherent at under 300 MB. |
| BitCPM4-CANN-0.5B-Q4_K_S.gguf | 4-bit | 267 MB | Smaller and faster 4-bit variant. Suited to embedded systems or background text processing. |
| BitCPM4-CANN-0.5B-Q3_K_M.gguf | 3-bit | 235 MB | Smallest file offered. Use only when memory is severely constrained, since quality loss is the most noticeable here. |
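
As a rough way to use the sizes listed above, the following sketch picks the largest file that fits a given size budget. The function name `pick_quant` is hypothetical and not part of this repository or of llama.cpp. It compares file sizes only; actual memory use at runtime will be higher because the context (KV cache) and runtime overhead are not counted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the largest GGUF file from the table that
# fits within a size budget given in MB. Sizes are the listed file sizes,
# not measured runtime memory.
pick_quant() {
  local budget_mb="$1"
  local entry quant size
  # "quant size_mb" pairs, ordered from largest to smallest file
  for entry in "F16 870" "Q8_0 463" "Q6_K 358" "Q5_K_M 317" \
               "Q5_K_S 310" "Q4_K_M 279" "Q4_K_S 267" "Q3_K_M 235"; do
    quant="${entry% *}"   # text before the space
    size="${entry#* }"    # text after the space
    if [ "$size" -le "$budget_mb" ]; then
      echo "BitCPM4-CANN-0.5B-${quant}.gguf"
      return 0
    fi
  done
  echo "no file fits within ${budget_mb} MB" >&2
  return 1
}
```

For example, `pick_quant 300` prints `BitCPM4-CANN-0.5B-Q4_K_M.gguf`, and a budget below 235 MB returns a non-zero exit status.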

How to Run

Using llama.cpp (Command Line)

If you have compiled llama.cpp, you can run the model directly from your terminal. Replace the filename with the specific version you downloaded:

./llama-cli \
  -m BitCPM4-CANN-0.5B-Q4_K_M.gguf \
  -p "Explain the concept of artificial intelligence to a five-year-old." \
  -n 256 \
  -c 2048 \
  --temp 0.7
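
If you have not downloaded a file yet, a direct URL can be built from the repository name and the filename. The sketch below assumes the files sit at the repository root on the `main` branch, which is the usual Hugging Face layout but is not stated in this card; `gguf_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the direct-download URL for one quantization.
# Assumption: files are stored at the repository root on the "main" branch.
REPO="Abiray/BitCPM4-CANN-0.5B-GGUF"

gguf_url() {
  local quant="$1"   # e.g. Q4_K_M, Q8_0, F16
  echo "https://huggingface.co/${REPO}/resolve/main/BitCPM4-CANN-0.5B-${quant}.gguf"
}
```

The result can be passed to a downloader, for example `curl -L -O "$(gguf_url Q4_K_M)"`, after which the file is available to the `-m` option shown above.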