This folder contains `.cellm` model artifacts tested with the Cellm Rust CLI.

## Models

### Qwen2.5 0.5B Instruct (INT8)

- **Path**: `models/qwen2.5-0.5b-int8-v1.cellm`
- **Size**: ~472 MB
- **Tokenizer**: `models/qwen2.5-0.5b-bnb4/tokenizer.json`
- **Type**: INT8 symmetric weight-only

### Gemma-3 1B IT (INT4, smallest)

- **Path**: `models/gemma-3-1b-it-int4-v1.cellm`
- **Size**: ~478 MB
- **Tokenizer**: `models/hf/gemma-3-1b-it-full/tokenizer.json`
- **Type**: INT4 symmetric weight-only

### Gemma-3 1B IT (Mixed INT4, recommended)

- **Path**: `models/gemma-3-1b-it-mixed-int4-v1.cellm`
- **Size**: ~1.0 GB
- **Tokenizer**: `models/hf/gemma-3-1b-it-full/tokenizer.json`
- **Type**: Mixed precision (attention/embeddings higher precision, MLP mostly INT4)

### Gemma-3 1B IT (INT8, most stable)

- **Path**: `models/gemma-3-1b-it-int8-v1.cellm`
- **Size**: ~1.2 GB
- **Tokenizer**: `models/hf/gemma-3-1b-it-full/tokenizer.json`
- **Type**: INT8 symmetric weight-only
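The INT8 and INT4 artifacts above use symmetric weight-only quantization. As a rough illustration of what "symmetric" means here (a single positive scale per tensor, zero-point fixed at 0), not a description of the `.cellm` format itself:

```python
def quantize_int8_symmetric(weights):
    """Symmetric per-tensor INT8: one positive scale, zero-point fixed at 0,
    codes clamped to [-127, 127] so that -x and x map to -q and q."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights; per-element error is at most scale / 2."""
    return [c * scale for c in codes]

weights = [-0.8, -0.1, 0.0, 0.4, 1.6]
codes, scale = quantize_int8_symmetric(weights)
restored = dequantize(codes, scale)  # close to the original weights
```

An INT4 variant is the same idea with codes clamped to [-7, 7], which is why the INT4 artifacts are roughly half the size of the INT8 ones at the same parameter count.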
## Usage

Cellm is a Rust-native inference runtime focused on mobile/desktop local LLM serving.

## License

Please follow each upstream model license (Qwen and Gemma terms) when redistributing weights and tokenizers.
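When in doubt about which artifact to ship, pick the largest one whose on-disk size fits your memory budget. A toy helper sketching that rule (paths and sizes are taken from the Models table above; the helper itself is illustrative and not part of the Cellm CLI):

```python
# Approximate on-disk sizes in MB, from the Models table in this README.
# Actual runtime memory use is higher (KV cache, activations, runtime overhead).
MODELS = [
    ("models/qwen2.5-0.5b-int8-v1.cellm", 472),       # ~472 MB
    ("models/gemma-3-1b-it-int4-v1.cellm", 478),      # ~478 MB
    ("models/gemma-3-1b-it-mixed-int4-v1.cellm", 1000),  # ~1.0 GB
    ("models/gemma-3-1b-it-int8-v1.cellm", 1200),     # ~1.2 GB
]

def largest_fitting(budget_mb):
    """Return the largest listed artifact whose file size fits budget_mb, or None."""
    fitting = [(path, mb) for path, mb in MODELS if mb <= budget_mb]
    if not fitting:
        return None
    return max(fitting, key=lambda pm: pm[1])[0]
```

Size is only one axis: the table also marks the mixed-INT4 Gemma build as recommended and the INT8 build as most stable, so quality or stability needs may override a pure size-based choice.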