jeffasante committed
Commit 8d4fdaa · verified · 1 Parent(s): 923d4bc

Sync gemma-3-1b-it-int8-v1

gemma-3-1b-it-int8-v1/README.md ADDED
@@ -0,0 +1,48 @@
+ ---
+ library_name: cellm
+ tags:
+ - mobile
+ - rust
+ - memory-efficient
+ - quantized
+ - gemma
+ ---
+
+ # Gemma 3 1B IT (Cellm Int8)
+
+ This folder contains a Cellm-converted Gemma 3 1B Instruct model and tokenizer assets, ready for publishing to Hugging Face.
+
+ ## Files
+ - `gemma-3-1b-it-int8-v1.cellm`
+ - `tokenizer.json`
+ - `tokenizer_config.json`
+ - `chat_template.jinja`
+
+ ## Model Details
+ - **Base model**: `google/gemma-3-1b-it`
+ - **Format**: `.cellm`
+ - **Quantization**: INT8 symmetric weight-only
+ - **Size**: ~1.2 GB
+
+ ## Inference Check (cellm)
+
+ ```bash
+ cd /Users/jeff/Desktop/cellm
+ ./target/release/infer \
+ --model models/to-huggingface/gemma-3-1b-it-int8-v1/gemma-3-1b-it-int8-v1.cellm \
+ --tokenizer models/to-huggingface/gemma-3-1b-it-int8-v1/tokenizer.json \
+ --prompt "what's twitch.com?" \
+ --chat \
+ --chat-format plain \
+ --gen 48 \
+ --temperature 0 \
+ --backend cpu \
+ --kv-encoding f16
+ ```
+
+ ## Notes
+ - This INT8 variant produced coherent output in local validation.
+ - The INT4 variant was smaller (~481 MB), but its output quality was significantly worse.
+
+ ## License
+ Subject to Gemma terms and upstream license constraints.
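The README names the quantization scheme as INT8 symmetric weight-only. A minimal sketch of what that scheme means in general (an illustration only, not cellm's actual implementation): each weight tensor gets one scale, and weights are stored as signed 8-bit integers, roughly one byte per parameter, which is consistent with a ~1B-parameter model landing around ~1.2 GB.

```python
# Generic sketch of symmetric weight-only INT8 quantization.
# NOT cellm's actual code; shown to illustrate the scheme the README names.

def quantize_int8_symmetric(weights):
    """Quantize a list of float weights to int8 with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0  # map the largest |w| to +/-127
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.01]
q, scale = quantize_int8_symmetric(weights)
recovered = dequantize_int8(q, scale)
# Round-trip error per weight is bounded by half a quantization step (scale / 2).
max_err = max(abs(w - r) for w, r in zip(weights, recovered))
```

Symmetric quantization stores no zero-point, only the scale, which keeps the weight-only format simple; the trade-off is that error grows with the tensor's dynamic range, which is one plausible reason the INT4 variant mentioned above degraded quality.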
gemma-3-1b-it-int8-v1/chat_template.jinja ADDED
@@ -0,0 +1,47 @@
+ {{ bos_token }}
+ {%- if messages[0]['role'] == 'system' -%}
+ {%- if messages[0]['content'] is string -%}
+ {%- set first_user_prefix = messages[0]['content'] + '
+
+ ' -%}
+ {%- else -%}
+ {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
+
+ ' -%}
+ {%- endif -%}
+ {%- set loop_messages = messages[1:] -%}
+ {%- else -%}
+ {%- set first_user_prefix = "" -%}
+ {%- set loop_messages = messages -%}
+ {%- endif -%}
+ {%- for message in loop_messages -%}
+ {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
+ {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
+ {%- endif -%}
+ {%- if (message['role'] == 'assistant') -%}
+ {%- set role = "model" -%}
+ {%- else -%}
+ {%- set role = message['role'] -%}
+ {%- endif -%}
+ {{ '<start_of_turn>' + role + '
+ ' + (first_user_prefix if loop.first else "") }}
+ {%- if message['content'] is string -%}
+ {{ message['content'] | trim }}
+ {%- elif message['content'] is iterable -%}
+ {%- for item in message['content'] -%}
+ {%- if item['type'] == 'image' -%}
+ {{ '<start_of_image>' }}
+ {%- elif item['type'] == 'text' -%}
+ {{ item['text'] | trim }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{ raise_exception("Invalid content type") }}
+ {%- endif -%}
+ {{ '<end_of_turn>
+ ' }}
+ {%- endfor -%}
+ {%- if add_generation_prompt -%}
+ {{'<start_of_turn>model
+ '}}
+ {%- endif -%}
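The chat template above folds a leading system message into the first user turn, renames the `assistant` role to `model`, enforces strict user/model alternation, and wraps each turn in `<start_of_turn>`/`<end_of_turn>` markers. A plain-Python sketch of the same logic for string-only content (list content with images, which the template also handles, is omitted here for brevity; this is an illustration, not code shipped with the model):

```python
# Plain-Python sketch of the Gemma chat template's rendering logic
# (string content only; the Jinja template additionally handles lists of
# text/image items). Illustration only, not part of the model assets.

def render_gemma_chat(messages, bos_token="<bos>", add_generation_prompt=False):
    out = [bos_token]
    if messages and messages[0]["role"] == "system":
        # System prompt becomes a prefix of the first user turn.
        first_user_prefix = messages[0]["content"] + "\n\n"
        loop_messages = messages[1:]
    else:
        first_user_prefix = ""
        loop_messages = messages
    for i, message in enumerate(loop_messages):
        if (message["role"] == "user") != (i % 2 == 0):
            raise ValueError("Conversation roles must alternate user/assistant/...")
        role = "model" if message["role"] == "assistant" else message["role"]
        out.append("<start_of_turn>" + role + "\n"
                   + (first_user_prefix if i == 0 else ""))
        out.append(message["content"].strip())
        out.append("<end_of_turn>\n")
    if add_generation_prompt:
        out.append("<start_of_turn>model\n")
    return "".join(out)

prompt = render_gemma_chat(
    [{"role": "system", "content": "Be brief."},
     {"role": "user", "content": "Hi"}],
    add_generation_prompt=True,
)
# prompt == "<bos><start_of_turn>user\nBe brief.\n\nHi<end_of_turn>\n<start_of_turn>model\n"
```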
gemma-3-1b-it-int8-v1/gemma-3-1b-it-int8-v1.cellm ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:73cc1fe14164fc1ac208c7c86ead0f1a56d07375f0da52461c282f954ec623d8
+ size 1302993856
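The `.cellm` and `tokenizer.json` entries are Git LFS pointer files: the repo stores only the object's sha256 and byte size, and the real file is fetched from LFS storage on checkout or download. A sketch of parsing such a pointer and verifying a downloaded file against it (generic helper names, not part of any particular tool):

```python
# Parse a Git LFS pointer and verify a downloaded file against it.
# parse_lfs_pointer / verify_file are illustrative names, not a real API.

import hashlib

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

def verify_file(path, oid, size, chunk_size=1 << 20):
    """Stream-hash the file and check both the digest and the byte count."""
    h = hashlib.sha256()
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
            total += len(chunk)
    return total == size and h.hexdigest() == oid

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:73cc1fe14164fc1ac208c7c86ead0f1a56d07375f0da52461c282f954ec623d8
size 1302993856
"""
info = parse_lfs_pointer(pointer)
# info["size"] == 1302993856  (matches the pointer above)
```

Checking the size first is a cheap sanity test for a truncated download; the streaming hash avoids loading the ~1.2 GB weights file into memory.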
gemma-3-1b-it-int8-v1/tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
+ size 33384568
gemma-3-1b-it-int8-v1/tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff