---
library_name: cellm
tags:
- mobile
- rust
- memory-efficient
- quantized
---

# Cellm Models Hub

This folder contains `.cellm` model artifacts tested with the Cellm Rust CLI.

## Models

### Qwen2.5 0.5B Instruct (INT8)
- **Path**: `models/qwen2.5-0.5b-int8-v1.cellm`
- **Size**: ~472 MB
- **Tokenizer**: `models/qwen2.5-0.5b-bnb4/tokenizer.json`
- **Type**: INT8 symmetric weight-only

### Gemma-3 1B IT (INT4, smallest)
- **Path**: `models/gemma-3-1b-it-int4-v1.cellm`
- **Size**: ~478 MB
- **Tokenizer**: `models/hf/gemma-3-1b-it-full/tokenizer.json`
- **Type**: INT4 symmetric weight-only

### Gemma-3 1B IT (Mixed INT4, recommended)
- **Path**: `models/gemma-3-1b-it-mixed-int4-v1.cellm`
- **Size**: ~1.0 GB
- **Tokenizer**: `models/hf/gemma-3-1b-it-full/tokenizer.json`
- **Type**: Mixed precision (attention and embedding weights kept at higher precision; MLP weights mostly INT4)

### Gemma-3 1B IT (INT8, most stable)
- **Path**: `models/gemma-3-1b-it-int8-v1.cellm`
- **Size**: ~1.2 GB
- **Tokenizer**: `models/hf/gemma-3-1b-it-full/tokenizer.json`
- **Type**: INT8 symmetric weight-only

## Usage

After building the CLI in release mode (e.g. `cargo build --release`), run from the repository root:

```bash
./target/release/infer \
  --model models/qwen2.5-0.5b-int8-v1.cellm \
  --tokenizer models/qwen2.5-0.5b-bnb4/tokenizer.json \
  --prompt "What is sycophancy?" \
  --chat \
  --gen 64 \
  --temperature 0 \
  --backend metal \
  --kv-encoding f16
```

And for the recommended mixed-INT4 Gemma model:

```bash
./target/release/infer \
  --model models/gemma-3-1b-it-mixed-int4-v1.cellm \
  --tokenizer models/hf/gemma-3-1b-it-full/tokenizer.json \
  --prompt "What is consciousness?" \
  --chat \
  --chat-format plain \
  --gen 48 \
  --temperature 0 \
  --backend metal \
  --kv-encoding f16
```
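
The per-model commands above differ only in the model/tokenizer paths and a couple of flags, so a small wrapper can smoke-test every artifact in one pass. The following is a hypothetical sketch, not a script shipped with this repo: `BIN`, `run`, and the prompt are illustrative choices, while the model/tokenizer pairs and the inference flags are taken verbatim from this README. The `-x "$BIN"` guard means the script only announces each model unless the release binary has actually been built.

```shell
#!/usr/bin/env sh
# Hypothetical smoke-test sketch (illustrative, not part of the repo).
# Runs the same prompt through every .cellm artifact listed above,
# reusing the inference flags shown in the Usage examples.
BIN=./target/release/infer
PROMPT="What is sycophancy?"

run() {
  model=$1
  tokenizer=$2
  echo "== $model"
  # Skip the actual inference call if the CLI has not been built yet
  # (cargo build --release).
  if [ -x "$BIN" ]; then
    "$BIN" --model "$model" --tokenizer "$tokenizer" \
      --prompt "$PROMPT" --chat --gen 64 --temperature 0 \
      --backend metal --kv-encoding f16
  fi
}

run models/qwen2.5-0.5b-int8-v1.cellm        models/qwen2.5-0.5b-bnb4/tokenizer.json
run models/gemma-3-1b-it-int4-v1.cellm       models/hf/gemma-3-1b-it-full/tokenizer.json
run models/gemma-3-1b-it-mixed-int4-v1.cellm models/hf/gemma-3-1b-it-full/tokenizer.json
run models/gemma-3-1b-it-int8-v1.cellm       models/hf/gemma-3-1b-it-full/tokenizer.json
```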

## About Cellm

Cellm is a Rust-native inference runtime for serving local LLMs on mobile and desktop, with Metal acceleration and memory-mapped model loading.

## License

Please follow each upstream model license (Qwen and Gemma terms) when redistributing weights and tokenizers.