LemOneLabs Xenova HF Staff committed on
Commit
944fef9
·
0 Parent(s):

Duplicate from onnx-community/functiongemma-270m-it-ONNX


Co-authored-by: Joshua <Xenova@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,40 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ onnx/model.onnx_data filter=lfs diff=lfs merge=lfs -text
37
+ onnx/model_fp16.onnx_data filter=lfs diff=lfs merge=lfs -text
38
+ onnx/model_q4.onnx_data filter=lfs diff=lfs merge=lfs -text
39
+ onnx/model_q4f16.onnx_data filter=lfs diff=lfs merge=lfs -text
40
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,550 @@
1
+ ---
2
+ license: gemma
3
+ base_model:
4
+ - google/functiongemma-270m-it
5
+ library_name: transformers.js
6
+ ---
7
+
8
+
9
+ # FunctionGemma model card
10
+
11
+ **Model Page**: [FunctionGemma](https://ai.google.dev/gemma/docs/functiongemma)
12
+
13
+ **Resources and Technical Documentation**:
14
+
15
+ - [Responsible Generative AI Toolkit](https://ai.google.dev/responsible)
16
+ - [FunctionGemma on Kaggle](https://www.kaggle.com/models/google/functiongemma/)
17
+ - [FunctionGemma on Vertex Model Garden](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/functiongemma)
18
+
19
+ **Terms of Use**: [Terms](https://ai.google.dev/gemma/terms)\
20
+ **Authors**: Google DeepMind
21
+
22
+ ## Model Information
23
+
24
+ Summary description and brief definition of inputs and outputs.
25
+
26
+ ### Description
27
+
28
+ > [!Note]
29
+ > FunctionGemma is intended to be fine-tuned for your specific function-calling task, including multi-turn use cases.
30
+
31
+
32
+ FunctionGemma is a lightweight, open model from Google, built as a foundation
33
+ for creating your own specialized function calling models. FunctionGemma is not
34
+ intended for use as a direct dialogue model, and is designed to be highly
35
+ performant after further fine-tuning, as is typical of models this size. Built
36
+ on the Gemma 3 270M model and with the same research and technology used to
37
+ create the Gemini models, FunctionGemma has been trained specifically for
38
+ function calling. The model has the same architecture as Gemma 3, but uses a
39
+ different chat format. The model is well suited for text-only function calling.
40
+ The uniquely small size makes it possible to deploy in environments with limited
41
+ resources such as laptops, desktops or your own cloud infrastructure,
42
+ democratizing access to state-of-the-art AI models and helping foster innovation
43
+ for everyone. Furthermore, akin to the base Gemma 270M, the model has been
44
+ optimized to be extremely versatile, performant on a variety of hardware in
45
+ single-turn scenarios, but should be fine-tuned on single-turn or multi-turn
46
+ task-specific data to achieve the best accuracy in specific domains.
47
+ To demonstrate how specializing the 270M parameter model can achieve high
48
+ performance on specific agentic workflows, we have highlighted two use cases in
49
+ the
50
+ [Google AI Edge Gallery app](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery&pcampaignid=web_share).
51
+
52
+ - **Tiny Garden:** A model fine-tuned to power a voice-controlled
53
+ interactive game. It handles game logic to manage a virtual plot of land,
54
+ decomposing commands like "Plant sunflowers in the top row" and "Water the
55
+ flowers in plots 1 and 2" into app-specific functions (e.g., plant_seed,
56
+ water_plots) and coordinate targets. This demonstrates the model's capacity
57
+ to drive custom app mechanics without server connectivity.
58
+
59
+ - **Mobile Actions:** To empower developers to build their own expert
60
+ agents, we have published [a
61
+ dataset](https://huggingface.co/datasets/google/mobile-actions) and
62
+ [fine-tuning recipe](https://github.com/google-gemini/gemma-cookbook/blob/main/FunctionGemma/%5BFunctionGemma%5DFinetune_FunctionGemma_270M_for_Mobile_Actions_with_Hugging_Face.ipynb)
63
+ to demonstrate fine-tuning FunctionGemma. It translates user inputs (e.g.,
64
+ "Create a calendar event for lunch," "Turn on the flashlight") into
65
+ function calls that trigger Android OS system tools. This interactive
66
+ notebook demonstrates how to take the base FunctionGemma model and build a
67
+ "Mobile Actions" fine-tune from scratch for use in the
68
+ [Google AI Edge gallery app](https://play.google.com/store/apps/details?id=com.google.ai.edge.gallery&pcampaignid=web_share).
69
+ This use case demonstrates the model's ability to act as an offline,
70
+ private agent for personal device tasks.
71
+
72
+ ### Inputs and outputs
73
+
74
+ - **Input:**
75
+ - Text string, such as a question, a prompt, or a document to be
76
+ summarized
77
+ - Total input context of 32K tokens
78
+ - **Output:**
79
+ - Generated text in response to the input, such as an answer to a
80
+ question, or a summary of a document
81
+ - Total output context up to 32K tokens per request, subtracting
82
+ the request input tokens
83
+
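
As a rough illustration of the output budget described above, the sketch below subtracts the prompt length from the context window. It assumes that "32K" means 32 × 1024 = 32,768 tokens, which this card does not state explicitly. `CONTEXT_WINDOW` and `remainingOutputTokens` are hypothetical names used only for this example.

```js
// Assumption: "32K" is taken to mean 32 * 1024 tokens; the card gives no exact figure.
const CONTEXT_WINDOW = 32 * 1024;

// Hypothetical helper: how many tokens are left for generation
// once the prompt tokens have been counted against the window.
function remainingOutputTokens(inputTokenCount, contextWindow = CONTEXT_WINDOW) {
  if (!Number.isInteger(inputTokenCount) || inputTokenCount < 0) {
    throw new RangeError("inputTokenCount must be a non-negative integer");
  }
  // Clamp at zero so an oversized prompt never yields a negative budget.
  return Math.max(0, contextWindow - inputTokenCount);
}

console.log(remainingOutputTokens(1200)); // 31568
```

A value like this could be used as an upper bound when choosing `max_new_tokens` for a request.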
84
+ ### Basic Usage
85
+
86
+ The following is a code example of how to use FunctionGemma to generate a function call from a JSON definition using the Hugging Face Transformers.js library.
87
+
88
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
89
```bash
npm i @huggingface/transformers
```
92
+
93
+ You can then use the model as follows:
94
+
95
```js
import { AutoModelForCausalLM, AutoTokenizer } from "@huggingface/transformers";

// Load the model and tokenizer
const model_id = "onnx-community/functiongemma-270m-it-ONNX";
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModelForCausalLM.from_pretrained(model_id);

// JSON schema describing the single tool the model may call
const weather_function_schema = {
  type: "function",
  function: {
    name: "get_current_temperature",
    description: "Gets the current temperature for a given location.",
    parameters: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "The city name, e.g. San Francisco",
        },
      },
      required: ["location"],
    },
  },
};

const messages = [
  {
    role: "developer",
    content: "You are a model that can do function calling with the following functions",
  },
  {
    role: "user",
    content: "What's the temperature in London?",
  },
];

// Render the chat template, including the tool declarations, into model inputs
const inputs = tokenizer.apply_chat_template(messages, {
  tools: [weather_function_schema],
  tokenize: true,
  add_generation_prompt: true,
  return_dict: true,
});

const output = await model.generate({ ...inputs, max_new_tokens: 512 });
// Decode only the newly generated tokens by slicing off the prompt
const decoded = tokenizer.decode(output.slice(0, [inputs.input_ids.dims[1], null]), { skip_special_tokens: false });
console.log(decoded);
// <start_function_call>call:get_current_temperature{location:<escape>London<escape>}<end_function_call><start_function_response>
```
144
+
145
+ For more detailed examples see the [Gemma documentation](https://ai.google.dev/gemma/docs/functiongemma).
146
+
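
The generated string still has to be turned into a structured call before an application can act on it. The following is a minimal sketch of one way to do that, assuming flat arguments only: strings wrapped in `<escape>` markers, plus bare numbers and booleans. `parseFunctionCalls` is a hypothetical helper written for this example, not part of Transformers.js, and it does not handle nested objects or arrays.

```js
// Hypothetical helper (not a Transformers.js API): extracts function calls
// from text that uses the <start_function_call>...<end_function_call> format.
// Assumption: arguments are flat key:value pairs without nested objects or arrays.
function parseFunctionCalls(text) {
  const calls = [];
  const callPattern = /<start_function_call>call:([\w.-]+)\{(.*?)\}<end_function_call>/gs;
  for (const [, name, body] of text.matchAll(callPattern)) {
    const args = {};
    // Each argument is either key:<escape>string<escape> or key:bareValue.
    const argPattern = /(\w+):(?:<escape>(.*?)<escape>|([^,{}]+))/gs;
    for (const [, key, str, bare] of body.matchAll(argPattern)) {
      if (str !== undefined) {
        args[key] = str; // escaped values are always strings
      } else if (bare === "true" || bare === "false") {
        args[key] = bare === "true";
      } else {
        // Fall back to the raw text when the bare value is not numeric.
        args[key] = Number.isNaN(Number(bare)) ? bare : Number(bare);
      }
    }
    calls.push({ name, arguments: args });
  }
  return calls;
}

const sample =
  "<start_function_call>call:get_current_temperature{location:<escape>London<escape>}<end_function_call><start_function_response>";
console.log(JSON.stringify(parseFunctionCalls(sample)));
// [{"name":"get_current_temperature","arguments":{"location":"London"}}]
```

Regular expressions keep the sketch short. A production parser would more likely tokenize the string so that nested values and stray delimiters inside strings are handled correctly.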
147
+ ## Model Data
148
+
149
+ Data used for model training and how the data was processed.
150
+
151
+ ### Training Dataset
152
+
153
+ These models were trained on a dataset of text data that includes a wide
154
+ variety of sources. The model was trained with 6T tokens. The knowledge cutoff
155
+ date for the training data was August 2024. These are the key components:
156
+
157
+ - Public Tool Definitions - Common APIs found on the web
158
+ - Tool Use Interactions - These are a mix of prompts, function calls,
159
+ function responses, and natural language responses from the model to
160
+ summarize the function call response, or request clarifications when the
161
+ prompt is ambiguous or incomplete.
162
+
163
+ ### Data Preprocessing
164
+
165
+ Here are the key data cleaning and filtering methods applied to the training
166
+ data:
167
+
168
+ - CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering
169
+ was applied at multiple stages in the data preparation process to ensure
170
+ the exclusion of harmful and illegal content.
171
+ - Sensitive Data Filtering: As part of making Gemma pre-trained models
172
+ safe and reliable, automated techniques were used to filter out certain
173
+ personal information and other sensitive data from training sets.
174
+ - Additional methods: Filtering based on content quality and safety in
175
+ line with
176
+ [our policies](https://ai.google/static/documents/ai-responsibility-update-published-february-2025.pdf).
177
+
178
+ ## Implementation Information
179
+
180
+ Details about the model internals.
181
+
182
+ ### Hardware
183
+
184
+ Gemma was trained using [Tensor Processing Unit
185
+ (TPU)](https://cloud.google.com/tpu/docs/intro-to-tpu) hardware (TPUv4p, TPUv5p
186
+ and TPUv5e). Training vision-language models (VLMs) requires significant
187
+ computational power. TPUs, designed specifically for matrix operations common in
188
+ machine learning, offer several advantages in this domain:
189
+
190
+ - Performance: TPUs are specifically designed to handle the massive
191
+ computations involved in training VLMs. They can speed up training
192
+ considerably compared to CPUs.
193
+ - Memory: TPUs often come with large amounts of high-bandwidth memory,
194
+ allowing for the handling of large models and batch sizes during training.
195
+ This can lead to better model quality.
196
+ - Scalability: TPU Pods (large clusters of TPUs) provide a scalable
197
+ solution for handling the growing complexity of large foundation models.
198
+ You can distribute training across multiple TPU devices for faster and more
199
+ efficient processing.
200
+ - Cost-effectiveness: In many scenarios, TPUs can provide a more
201
+ cost-effective solution for training large models compared to CPU-based
202
+ infrastructure, especially when considering the time and resources saved
203
+ due to faster training.
204
+ - These advantages are aligned with
205
+ [Google's commitments to operate sustainably](https://sustainability.google/operating-sustainably/).
206
+
207
+ ### Software
208
+
209
+ Training was done using [JAX](https://github.com/jax-ml/jax) and
210
+ [ML Pathways](https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/).
211
+ JAX allows researchers to take advantage of the latest generation of hardware,
212
+ including TPUs, for faster and more efficient training of large models. ML
213
+ Pathways is Google's latest effort to build artificially intelligent systems
214
+ capable of generalizing across multiple tasks. This is especially suitable for
214
+ foundation models, including large language models like these.\
216
+ Together, JAX and ML Pathways are used as described in the [paper about the
217
+ Gemini family of models](https://goo.gle/gemma2report); *"the 'single
218
+ controller' programming model of Jax and Pathways allows a single Python process
219
+ to orchestrate the entire training run, dramatically simplifying the development
220
+ workflow."*
221
+
222
+ ## Evaluation
223
+
224
+ Model evaluation metrics and results.
225
+
226
+ ### Benchmark Results
227
+
228
+ <table>
229
+ <thead>
230
+ <tr>
231
+ <th><strong>Benchmark</strong></th>
232
+ <th><strong>n-shot</strong></th>
233
+ <th><strong>FunctionGemma 270M</strong></th>
234
+ </tr>
235
+ </thead>
236
+ <tbody>
237
+ <tr>
238
+ <td>BFCL Simple</td>
239
+ <td>0-shot</td>
240
+ <td>61.6</td>
241
+ </tr>
242
+ <tr>
243
+ <td>BFCL Multiple</td>
244
+ <td>0-shot</td>
245
+ <td>63.5</td>
246
+ </tr>
247
+ <tr>
248
+ <td>BFCL Parallel</td>
249
+ <td>0-shot</td>
250
+ <td>39</td>
251
+ </tr>
252
+ <tr>
253
+ <td>BFCL Parallel Multiple</td>
254
+ <td>0-shot</td>
255
+ <td>29.5</td>
256
+ </tr>
257
+ <tr>
258
+ <td>BFCL Live Simple </td>
259
+ <td>0-shot</td>
260
+ <td>36.2</td>
261
+ </tr>
262
+ <tr>
263
+ <td>BFCL Live Multiple</td>
264
+ <td>0-shot</td>
265
+ <td>25.7</td>
266
+ </tr>
267
+ <tr>
268
+ <td>BFCL Live Parallel</td>
269
+ <td>0-shot</td>
270
+ <td>22.9</td>
271
+ </tr>
272
+ <tr>
273
+ <td>BFCL Live Parallel Multiple</td>
274
+ <td>0-shot</td>
275
+ <td>20.8</td>
276
+ </tr>
277
+ <tr>
278
+ <td>BFCL Relevance</td>
279
+ <td>0-shot</td>
280
+ <td>61.1</td>
281
+ </tr>
282
+ <tr>
283
+ <td>BFCL Irrelevance</td>
284
+ <td>0-shot</td>
285
+ <td>73.7</td>
286
+ </tr>
287
+ </tbody>
288
+ </table>
289
+
290
+ **Impact on Performance after Fine-tuning on Mobile Actions Dataset**\
291
+ To demonstrate the value of specialization for small language models, we
292
+ compared the base FunctionGemma model against the fine-tuned model using the
293
+ "Mobile Actions"
294
+ [recipe](https://github.com/google-gemini/gemma-cookbook/blob/main/FunctionGemma/%5BFunctionGemma%5DFinetune_FunctionGemma_270M_for_Mobile_Actions_with_Hugging_Face.ipynb).
295
+ Fine-tuning significantly improved the base FunctionGemma model's ability to
296
+ correctly identify and format mobile system calls.
297
+
298
+ <table>
299
+ <thead>
300
+ <tr>
301
+ <th><br>
302
+ Model</th>
303
+ <th><br>
304
+ Eval results for Mobile Actions</th>
305
+ </tr>
306
+ </thead>
307
+ <tbody>
308
+ <tr>
309
+ <td><br>
310
+ Base FunctionGemma model</td>
311
+ <td><br>
312
+ 58%</td>
313
+ </tr>
314
+ <tr>
315
+ <td><br>
316
+ Mobile Actions Fine-Tune</td>
317
+ <td><br>
318
+ 85%</td>
319
+ </tr>
320
+ </tbody>
321
+ </table>
322
+
323
+ **On-Device Performance of the Gemma 270m Fine-tuned Use Cases**\
324
+ We evaluated the fine-tuned use cases on a Samsung S25 Ultra to assess on-device
325
+ latency and memory footprint.
326
+
327
+ - **Context:** 512 prefill tokens and 32 decode tokens.
328
+ - **Hardware:** S25 Ultra CPU using LiteRT XNNPACK delegate with 4 threads.
329
+
330
+ Mobile Actions On Device Performance
331
+
332
+ <table>
333
+ <thead>
334
+ <tr>
335
+ <th><br>
336
+ Backend</th>
337
+ <th><br>
338
+ Quantization scheme</th>
339
+ <th><br>
340
+ Context length</th>
341
+ <th><br>
342
+ Prefill (tokens per second)</th>
343
+ <th><br>
344
+ Decode (tokens per second)</th>
345
+ <th><br>
346
+ Time-to-first-token (seconds)</th>
347
+ <th><br>
348
+ Model Size (MB)</th>
349
+ <th><br>
350
+ Peak RSS Memory (MB)</th>
351
+ </tr>
352
+ </thead>
353
+ <tbody>
354
+ <tr>
355
+ <td><br>
356
+ CPU</td>
357
+ <td><br>
358
+ dynamic_int8</td>
359
+ <td><br>
360
+ 1024</td>
361
+ <td><br>
362
+ 1718</td>
363
+ <td><br>
364
+ 125.9</td>
365
+ <td><br>
366
+ 0.3</td>
367
+ <td><br>
368
+ 288</td>
369
+ <td><br>
370
+ 551</td>
371
+ </tr>
372
+ </tbody>
373
+ </table>
374
+
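
As a sanity check on how the throughput figures relate to latency, the sketch below divides the benchmark token counts by the reported rates for the Mobile Actions row. It assumes a simple model in which latency is token count divided by tokens per second, ignoring any fixed overhead. `estimateLatencySeconds` is a hypothetical helper for this example.

```js
// Hypothetical helper: latency estimate as tokens / (tokens per second).
// Assumption: no fixed startup or scheduling overhead is modeled.
function estimateLatencySeconds({ prefillTokens, decodeTokens, prefillRate, decodeRate }) {
  const prefill = prefillTokens / prefillRate;
  const decode = decodeTokens / decodeRate;
  return { prefill, decode, total: prefill + decode };
}

// Token counts from the benchmark context, rates from the Mobile Actions row.
const mobileActions = estimateLatencySeconds({
  prefillTokens: 512,
  decodeTokens: 32,
  prefillRate: 1718,
  decodeRate: 125.9,
});

console.log(mobileActions.prefill.toFixed(2)); // 0.30
console.log(mobileActions.decode.toFixed(2)); // 0.25
console.log(mobileActions.total.toFixed(2)); // 0.55
```

Under this simple model, the prefill estimate of about 0.30 seconds agrees with the reported time-to-first-token of 0.3 seconds.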
375
+ Tiny Garden On Device Performance
376
+
377
+ <table>
378
+ <thead>
379
+ <tr>
380
+ <th><br>
381
+ Backend</th>
382
+ <th><br>
383
+ Quantization scheme</th>
384
+ <th><br>
385
+ Context length</th>
386
+ <th><br>
387
+ Prefill (tokens per second)</th>
388
+ <th><br>
389
+ Decode (tokens per second)</th>
390
+ <th><br>
391
+ Time-to-first-token (seconds)</th>
392
+ <th><br>
393
+ Model Size (MB)</th>
394
+ <th><br>
395
+ Peak RSS Memory (MB)</th>
396
+ </tr>
397
+ </thead>
398
+ <tbody>
399
+ <tr>
400
+ <td><br>
401
+ CPU</td>
402
+ <td><br>
403
+ dynamic_int8</td>
404
+ <td><br>
405
+ 1024</td>
406
+ <td><br>
407
+ 1743</td>
408
+ <td><br>
409
+ 125.7</td>
410
+ <td><br>
411
+ 0.3</td>
412
+ <td><br>
413
+ 288</td>
414
+ <td><br>
415
+ 549</td>
416
+ </tr>
417
+ </tbody>
418
+ </table>
419
+
420
+ ## Ethics and Safety
421
+
422
+ Ethics and safety evaluation approach and results.
423
+
424
+ ### Evaluation Approach
425
+
426
+ Our evaluation methods include structured evaluations and internal red-teaming
427
+ testing of relevant content policies. Red-teaming was conducted by a number of
428
+ different teams, each with different goals and human evaluation metrics. These
429
+ models were evaluated against a number of different categories relevant to
430
+ ethics and safety, including:
431
+
432
+ - **Child Safety**: Evaluation of text-to-text and image-to-text prompts
433
+ covering child safety policies, including child sexual abuse and exploitation.
434
+ - **Content Safety:** Evaluation of text-to-text and image-to-text prompts
435
+ covering safety policies, including harassment, violence and gore, and hate
436
+ speech.
437
+ - **Representational Harms**: Evaluation of text-to-text and image-to-text
438
+ prompts covering safety policies including bias, stereotyping, and harmful
439
+ associations or inaccuracies.
440
+
441
+ ### Evaluation Results
442
+
443
+ For all areas of safety testing, we saw major improvements in the categories of
444
+ child safety, content safety, and representational harms relative to previous
445
+ Gemma models. All testing was conducted without safety filters to evaluate the
446
+ model capabilities and behaviors. The model produced minimal policy violations,
447
+ and showed significant improvements over previous Gemma models' performance
448
+ with respect to ungrounded inferences. A limitation of our evaluations was that they
449
+ included only English language prompts.
450
+
451
+ ## Usage and Limitations
452
+
453
+ These models have certain limitations that users should be aware of.
454
+
455
+ ### Intended Usage
456
+
457
+ This model is not intended for use as a direct dialogue model.\
458
+ Open Large Language Models (LLMs) have a wide range of applications across
459
+ various industries and domains. The following list of potential uses is not
460
+ comprehensive. The purpose of this list is to provide contextual information
461
+ about the possible use-cases that the model creators considered as part of model
462
+ training and development.
463
+
464
+ - Content Creation and Communication
465
+ - Text Generation: These models can be used to generate creative
466
+ text formats such as poems, scripts, code, marketing copy, and email drafts.
467
+ - Chatbots and Conversational AI: Power conversational interfaces
468
+ for customer service, virtual assistants, or interactive applications.
469
+ - Text Summarization: Generate concise summaries of a text corpus,
470
+ research papers, or reports.
471
+ - Research and Education
472
+ - Natural Language Processing (NLP) Research: These models can
473
+ serve as a foundation for researchers to experiment with NLP
474
+ techniques, develop algorithms, and contribute to the advancement of the field.
475
+ - Language Learning Tools: Support interactive language learning
476
+ experiences, aiding in grammar correction or providing writing practice.
477
+ - Knowledge Exploration: Assist researchers in exploring large
478
+ bodies of text by generating summaries or answering questions about
479
+ specific topics.
480
+
481
+ ### Limitations
482
+
483
+ - Training Data
484
+ - The quality and diversity of the training data significantly
485
+ influence the model's capabilities. Biases or gaps in the training data
486
+ can lead to limitations in the model's responses.
487
+ - The scope of the training dataset determines the subject areas
488
+ the model can handle effectively.
489
+ - Context and Task Complexity
490
+ - Models are better at tasks that can be framed with clear
491
+ prompts and instructions. Open-ended or highly complex tasks might be
492
+ challenging.
493
+ - A model's performance can be influenced by the amount of context
494
+ provided (longer context generally leads to better outputs, up to a
495
+ certain point).
496
+ - Language Ambiguity and Nuance
497
+ - Natural language is inherently complex. Models might struggle
498
+ to grasp subtle nuances, sarcasm, or figurative language.
499
+ - Factual Accuracy
500
+ - Models generate responses based on information they learned
501
+ from their training datasets, but they are not knowledge bases. They
502
+ may generate incorrect or outdated factual statements.
503
+ - Common Sense
504
+ - Models rely on statistical patterns in language. They might
505
+ lack the ability to apply common sense reasoning in certain situations.
506
+
507
+ ### Ethical Considerations and Risks
508
+
509
+ The development of large language models (LLMs) raises several ethical
510
+ concerns. In creating an open model, we have carefully considered the
511
+ following:
512
+
513
+ - Bias and Fairness
514
+ - LLMs trained on large-scale, real-world text data can reflect
515
+ socio-cultural biases embedded in the training material. These models
516
+ underwent careful scrutiny, with the input data pre-processing and
516
+ posterior evaluations described in this card.
518
+ - Misinformation and Misuse
519
+ - LLMs can be misused to generate text that is false, misleading,
520
+ or harmful.
521
+ - Guidelines are provided for responsible use with the model, see
522
+ the [Responsible Generative AI Toolkit](https://ai.google.dev/responsible).
523
+ - Transparency and Accountability
524
+ - This model card summarizes details on the models' architecture,
525
+ capabilities, limitations, and evaluation processes.
526
+ - A responsibly developed open model offers the opportunity to
527
+ share innovation by making LLM technology accessible to developers and
528
+ researchers across the AI ecosystem.
529
+
530
+ Risks identified and mitigations:
531
+
532
+ - Perpetuation of biases: It's encouraged to perform continuous
533
+ monitoring (using evaluation metrics, human review) and the exploration of
534
+ de-biasing techniques during model training, fine-tuning, and other use cases.
535
+ - Generation of harmful content: Mechanisms and guidelines for content
536
+ safety are essential. Developers are encouraged to exercise caution and
537
+ implement appropriate content safety safeguards based on their specific
538
+ product policies and application use cases.
539
+ - Misuse for malicious purposes: Technical limitations and developer and
540
+ end-user education can help mitigate against malicious applications of
541
+ LLMs. Educational resources and reporting mechanisms for users to flag
542
+ misuse are provided. Prohibited uses of Gemma models are outlined in the
543
+ [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy).
544
+ - Privacy violations: Models were trained on data filtered for removal of
545
+ PII (Personally Identifiable Information). Developers are encouraged to
546
+ adhere to privacy regulations with privacy-preserving techniques.
547
+
548
+ ### Benefits
549
+
550
+ At the time of release, this family of models provides high-performance open large language model implementations designed from the ground up for Responsible AI development compared to similarly sized models.
chat_template.jinja ADDED
@@ -0,0 +1,279 @@
1
+ {%- macro format_parameters(properties, required) -%}
2
+ {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
3
+ {%- set ns = namespace(found_first=false) -%}
4
+ {%- for key, value in properties | dictsort -%}
5
+ {%- if key not in standard_keys -%}
6
+ {%- if ns.found_first %},{% endif -%}
7
+ {%- set ns.found_first = true -%}
8
+ {{- key }}:{description:<escape>{{ value['description'] }}<escape>
9
+ {%- if value['type'] | upper == 'STRING' -%}
10
+ {%- if value['enum'] -%}
11
+ ,enum:{{ format_argument(value['enum']) }}
12
+ {%- endif -%}
13
+ {%- elif value['type'] | upper == 'OBJECT' -%}
14
+ ,properties:{
15
+ {%- if value['properties'] is defined and value['properties'] is mapping -%}
16
+ {{- format_parameters(value['properties'], value['required'] | default([])) -}}
17
+ {%- elif value is mapping -%}
18
+ {{- format_parameters(value, value['required'] | default([])) -}}
19
+ {%- endif -%}
20
+ }
21
+ {%- if value['required'] -%}
22
+ ,required:[
23
+ {%- for item in value['required'] | default([]) -%}
24
+ <escape>{{- item -}}<escape>
25
+ {%- if not loop.last %},{% endif -%}
26
+ {%- endfor -%}
27
+ ]
28
+ {%- endif -%}
29
+ {%- elif value['type'] | upper == 'ARRAY' -%}
30
+ {%- if value['items'] is mapping and value['items'] -%}
31
+ ,items:{
32
+ {%- set ns_items = namespace(found_first=false) -%}
33
+ {%- for item_key, item_value in value['items'] | dictsort -%}
34
+ {%- if item_value is not none -%}
35
+ {%- if ns_items.found_first %},{% endif -%}
36
+ {%- set ns_items.found_first = true -%}
37
+ {%- if item_key == 'properties' -%}
38
+ properties:{
39
+ {%- if item_value is mapping -%}
40
+ {{- format_parameters(item_value, value['items']['required'] | default([])) -}}
41
+ {%- endif -%}
42
+ }
43
+ {%- elif item_key == 'required' -%}
44
+ required:[
45
+ {%- for req_item in item_value -%}
46
+ <escape>{{- req_item -}}<escape>
47
+ {%- if not loop.last %},{% endif -%}
48
+ {%- endfor -%}
49
+ ]
50
+ {%- elif item_key == 'type' -%}
51
+ {%- if item_value is string -%}
52
+ type:{{ format_argument(item_value | upper) }}
53
+ {%- else -%}
54
+ type:{{ format_argument(item_value | map('upper') | list) }}
55
+ {%- endif -%}
56
+ {%- else -%}
57
+ {{ item_key }}:{{ format_argument(item_value) }}
58
+ {%- endif -%}
59
+ {%- endif -%}
60
+ {%- endfor -%}
61
+ }
62
+ {%- endif -%}
63
+ {%- endif -%}
64
+ ,type:<escape>{{ value['type'] | upper }}<escape>}
65
+ {%- endif -%}
66
+ {%- endfor -%}
67
+ {%- endmacro -%}
68
+ {% macro format_function_declaration(tool_data) -%}
69
+ declaration:{{- tool_data['function']['name'] -}}
70
+ {description:<escape>{{- tool_data['function']['description'] -}}<escape>
71
+ {%- set params = tool_data['function']['parameters'] -%}
72
+ {%- if params -%}
73
+ ,parameters:{
74
+ {%- if params['properties'] -%}
75
+ properties:{ {{- format_parameters(params['properties'], params['required']) -}} },
76
+ {%- endif -%}
77
+ {%- if params['required'] -%}
78
+ required:[
79
+ {%- for item in params['required'] -%}
80
+ <escape>{{- item -}}<escape>
81
+ {{- ',' if not loop.last -}}
82
+ {%- endfor -%}
83
+ ],
84
+ {%- endif -%}
85
+ {%- if params['type'] -%}
86
+ type:<escape>{{- params['type'] | upper -}}<escape>}
87
+ {%- endif -%}
88
+ {%- endif -%}
89
+ }
90
+ {%- endmacro -%}
91
+ {% macro format_argument(argument, escape_keys=True) -%}
92
+ {%- if argument is string -%}
93
+ {{- '<escape>' + argument + '<escape>' -}}
94
+ {%- elif argument is boolean -%}
95
+ {%- if argument -%}
96
+ {{- 'true' -}}
97
+ {%- else -%}
98
+ {{- 'false' -}}
99
+ {%- endif -%}
100
+ {%- elif argument is mapping -%}
101
+ {{- '{' -}}
102
+ {%- set ns = namespace(found_first=false) -%}
103
+ {%- for key, value in argument | dictsort -%}
104
+ {%- if ns.found_first %},{% endif -%}
105
+ {%- set ns.found_first = true -%}
106
+ {%- if escape_keys -%}
107
+ {{- '<escape>' + key + '<escape>' -}}
108
+ {%- else -%}
109
+ {{- key -}}
110
+ {%- endif -%}
111
+ :{{- format_argument(value, escape_keys=escape_keys) -}}
112
+ {%- endfor -%}
113
+ {{- '}' -}}
114
+ {%- elif argument is sequence -%}
115
+ {{- '[' -}}
116
+ {%- for item in argument -%}
117
+ {{- format_argument(item, escape_keys=escape_keys) -}}
118
+ {%- if not loop.last %},{% endif -%}
119
+ {%- endfor -%}
120
+ {{- ']' -}}
121
+ {%- else -%}
122
+ {{- argument -}}
123
+ {%- endif -%}
124
+ {%- endmacro -%}
125
+ {{ bos_token }}
126
+ {%- set ns = namespace(prev_message_type=None) -%}
127
+ {#- Tool Declarations -#}
128
+ {%- set loop_messages = messages -%}
129
+ {%- if tools or messages[0]['role'] == 'system' or messages[0]['role'] == 'developer' -%}
130
+ {{- '<start_of_turn>developer\n' -}}
131
+ {%- if messages[0]['role'] == 'system' or messages[0]['role'] == 'developer' -%}
132
+ {%- if messages[0]['content'] is string -%}
133
+ {{- messages[0]['content'] | trim -}}
134
+ {%- elif messages[0]['content'] is sequence -%}
135
+ {%- for item in messages[0]['content'] -%}
136
+ {%- if item['type'] == 'text' -%}
137
+ {{- item['text'] | trim -}}
138
+ {%- endif -%}
139
+ {%- endfor -%}
140
+ {%- endif -%}
141
+ {%- set loop_messages = messages[1:] -%}
142
+ {%- endif -%}
143
+ {%- if tools -%}
144
+ {%- for tool in tools %}
145
+ {{- '<start_function_declaration>' -}}
146
+ {{- format_function_declaration(tool) | trim }}
147
+ {{- '<end_function_declaration>' -}}
148
+ {%- endfor %}
149
+ {%- endif -%}
150
+ {{- '<end_of_turn>\n' }}
151
+ {%- endif %}
+ {#- Loop through messages. -#}
+ {%- for message in loop_messages -%}
+ {%- if (message['role'] == 'assistant') -%}
+ {#- Rename "assistant" to "model". -#}
+ {%- set role = "model" -%}
+ {%- else -%}
+ {%- set role = message['role'] -%}
+ {%- endif -%}
+ {%- if role != 'tool' -%}
+ {%- if ns.prev_message_type != 'tool_response' -%}
+ {{- '<start_of_turn>' + role + '\n' }}
+ {%- endif -%}
+ {%- set ns.prev_message_type = None -%}
+ {%- if 'content' in message and message['content'] is not none -%}
+ {%- if message['content'] is string -%}
+ {{ message['content'] | trim }}
+ {%- elif message['content'] is sequence -%}
+ {%- for item in message['content'] -%}
+ {%- if item['type'] == 'image' -%}
+ {{ '<start_of_image>' }}
+ {%- elif item['type'] == 'text' -%}
+ {{ item['text'] | trim }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{ raise_exception("Invalid content type in user/assistant message") }}
+ {%- endif -%}
+ {%- set ns.prev_message_type = 'content' -%}
+ {%- endif -%}
+ {%- if 'tool_calls' in message and message['tool_calls'] and message['tool_calls'] is iterable -%}
+ {#- Tool Calls -#}
+ {%- for tool_call in message['tool_calls'] -%}
+ {% set function = tool_call['function'] %}
+ {{- '<start_function_call>call:' + function['name'] + '{' -}}
+ {%- if 'arguments' in function -%}
+ {%- if function['arguments'] is mapping -%}
+ {%- set ns = namespace(found_first=false) -%}
+ {%- for key, value in function['arguments'] | dictsort -%}
+ {%- if ns.found_first %},{% endif -%}
+ {%- set ns.found_first = true -%}
+ {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+ {%- endfor -%}
+ {%- elif function['arguments'] is string -%}
+ {# This handles string-JSON, just in case #}
+ {{ function['arguments'] }}
+ {%- endif %}
+ {%- endif -%}
+ {{- '}<end_function_call>' -}}
+ {%- endfor -%}
+ {%- if loop.last -%}
+ {{ '<start_function_response>' }}
+ {%- endif -%}
+ {%- set ns.prev_message_type = 'tool_call' -%}
+ {%- endif -%}
+ {%- else -%}
+ {#- Tool Responses -#}
+ {%- if 'content' in message and message['content'] -%}
+ {%- if message['content'] is mapping -%}
+ {%- if 'name' in message['content'] and 'response' in message['content'] -%}
+ {{ '<start_function_response>response:' + message['content']['name'] | trim + '{' }}
+ {%- set response_ns = namespace(found_first=false) -%}
+ {%- for key, value in message['content']['response'] | dictsort -%}
+ {%- if response_ns.found_first %},{% endif -%}
+ {%- set response_ns.found_first = true -%}
+ {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+ {%- endfor -%}
+ {{- '}<end_function_response>' -}}
+ {%- elif 'name' in message -%}
+ {{ '<start_function_response>response:' + message['name'] | trim + '{' }}
+ {%- set response_ns = namespace(found_first=false) -%}
+ {%- for key, value in message['content'] | dictsort -%}
+ {%- if response_ns.found_first %},{% endif -%}
+ {%- set response_ns.found_first = true -%}
+ {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+ {%- endfor -%}
+ {{- '}<end_function_response>' -}}
+ {%- else -%}
+ {{ raise_exception("Invalid tool response mapping: must contain 'name' and 'response' keys, or 'name' must be in the message.") }}
+ {%- endif -%}
+ {%- elif message['content'] is string -%}
+ {%- if 'name' in message -%}
+ {{ '<start_function_response>response:' + message['name'] | trim + '{value:' + format_argument(message['content'], escape_keys=False) + '}<end_function_response>' }}
+ {%- else -%}
+ {{ raise_exception("Invalid tool response: 'name' must be provided.") }}
+ {%- endif -%}
+ {%- elif message['content'] is sequence -%}
+ {%- for item in message['content'] -%}
+ {%- if item is mapping -%}
+ {%- if 'name' in item and 'response' in item -%}
+ {{ '<start_function_response>response:' + item['name'] | trim + '{' }}
+ {%- set response_ns = namespace(found_first=false) -%}
+ {%- for key, value in item['response'] | dictsort -%}
+ {%- if response_ns.found_first %},{% endif -%}
+ {%- set response_ns.found_first = true -%}
+ {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+ {%- endfor -%}
+ {{- '}<end_function_response>' -}}
+ {%- elif 'name' in message -%}
+ {{ '<start_function_response>response:' + message['name'] | trim + '{' }}
+ {%- set response_ns = namespace(found_first=false) -%}
+ {%- for key, value in item | dictsort -%}
+ {%- if response_ns.found_first %},{% endif -%}
+ {%- set response_ns.found_first = true -%}
+ {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+ {%- endfor -%}
+ {{- '}<end_function_response>' -}}
+ {%- else -%}
+ {{ raise_exception("Invalid tool response mapping: must contain 'name' and 'response' keys, or 'name' must be in the message.") }}
+ {%- endif -%}
+ {%- else -%}
+ {{ raise_exception("Invalid tool response message: multiple responses must all be mappings") }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{ raise_exception("Invalid content type in tool message: must be mapping, sequence of mappings, or string.") }}
+ {%- endif -%}
+ {%- endif -%}
+ {%- set ns.prev_message_type = 'tool_response' -%}
+ {%- endif -%}
+ {%- if ns.prev_message_type not in ['tool_call', 'tool_response'] -%}
+ {{ '<end_of_turn>\n' }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- if add_generation_prompt -%}
+ {%- if ns.prev_message_type != 'tool_response' -%}
+ {{- '<start_of_turn>model\n' -}}
+ {%- endif -%}
+ {%- endif -%}
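The serialization rules in the template above can be hard to follow through the whitespace-control markers, so here is a minimal Python sketch of what the `format_argument` macro and the tool-call branch appear to produce. It is an illustration only, not the code that Transformers or Transformers.js runs: the function names `format_argument` and `format_tool_call` are reused for readability, the `get_weather` call is a made-up example, and only the mapping form of `arguments` is covered. The case-insensitive key sort mirrors the default behaviour of Jinja's `dictsort` filter.

```python
ESCAPE = "<escape>"  # delimiter the template wraps around string values


def format_argument(argument, escape_keys=True):
    """Serialize a value roughly the way the template's format_argument macro does."""
    if isinstance(argument, str):
        return f"{ESCAPE}{argument}{ESCAPE}"
    if isinstance(argument, bool):  # checked before the fallback so True/False become true/false
        return "true" if argument else "false"
    if isinstance(argument, dict):
        parts = []
        # Jinja's dictsort sorts by key, case-insensitively by default.
        for key, value in sorted(argument.items(), key=lambda kv: str(kv[0]).lower()):
            rendered_key = f"{ESCAPE}{key}{ESCAPE}" if escape_keys else str(key)
            parts.append(f"{rendered_key}:{format_argument(value, escape_keys)}")
        return "{" + ",".join(parts) + "}"
    if isinstance(argument, (list, tuple)):
        return "[" + ",".join(format_argument(item, escape_keys) for item in argument) + "]"
    return str(argument)  # numbers and anything else are emitted as-is


def format_tool_call(name, arguments):
    """Render one tool call; top-level keys are sorted and left unescaped."""
    body = ",".join(
        f"{key}:{format_argument(value, escape_keys=False)}"
        for key, value in sorted(arguments.items(), key=lambda kv: str(kv[0]).lower())
    )
    return f"<start_function_call>call:{name}{{{body}}}<end_function_call>"


# Hypothetical tool call, used only to show the output shape.
rendered = format_tool_call("get_weather", {"unit": "celsius", "city": "Paris", "days": 3})
print(rendered)
```

Note that the tool-call and tool-response branches pass `escape_keys=False`, so nested mapping keys are written bare there, while the macro's default (`escape_keys=True`) wraps them in `<escape>`.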
config.json ADDED
@@ -0,0 +1,73 @@
+ {
+   "_sliding_window_pattern": 6,
+   "architectures": [
+     "Gemma3ForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "attn_logit_softcapping": null,
+   "bos_token_id": 2,
+   "dtype": "bfloat16",
+   "eos_token_id": 1,
+   "final_logit_softcapping": null,
+   "head_dim": 256,
+   "hidden_activation": "gelu_pytorch_tanh",
+   "hidden_size": 640,
+   "initializer_range": 0.02,
+   "intermediate_size": 2048,
+   "layer_types": [
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "full_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "full_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "sliding_attention",
+     "full_attention"
+   ],
+   "max_position_embeddings": 32768,
+   "model_type": "gemma3_text",
+   "num_attention_heads": 4,
+   "num_hidden_layers": 18,
+   "num_key_value_heads": 1,
+   "pad_token_id": 0,
+   "query_pre_attn_scalar": 256,
+   "rms_norm_eps": 1e-06,
+   "rope_parameters": {
+     "full_attention": {
+       "rope_theta": 1000000.0,
+       "rope_type": "default"
+     },
+     "sliding_attention": {
+       "rope_theta": 10000.0,
+       "rope_type": "default"
+     }
+   },
+   "sliding_window": 512,
+   "transformers_version": "5.0.0.dev0",
+   "use_bidirectional_attention": false,
+   "use_cache": true,
+   "vocab_size": 262144,
+   "transformers.js_config": {
+     "use_external_data_format": {
+       "model.onnx": 1,
+       "model_fp16.onnx": 1,
+       "model_q4.onnx": 1,
+       "model_q4f16.onnx": 1
+     },
+     "kv_cache_dtype": {
+       "q4f16": "float16",
+       "fp16": "float16"
+     }
+   }
+ }
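The `layer_types` list in this config is consistent with `_sliding_window_pattern: 6`: every sixth layer uses full attention and the rest use sliding-window attention. The sketch below reproduces that list from the two numbers, assuming that relationship holds; the helper name `layer_types` is hypothetical and the rule is inferred from the values shown here, not taken from the Transformers source.

```python
def layer_types(num_hidden_layers: int, pattern: int) -> list[str]:
    """Assumed rule: layer i (0-based) is full attention when (i + 1) is a multiple of `pattern`."""
    return [
        "full_attention" if (i + 1) % pattern == 0 else "sliding_attention"
        for i in range(num_hidden_layers)
    ]


# Values copied from config.json above.
layers = layer_types(num_hidden_layers=18, pattern=6)
full_layers = [i for i, kind in enumerate(layers) if kind == "full_attention"]

# Attention widths implied by the config: 4 query heads and 1 key/value head, each of size 256.
query_width = 4 * 256
kv_width = 1 * 256

print(full_layers, query_width, kv_width)
```

With these values the full-attention layers are the 6th, 12th, and 18th, matching the explicit list in the file.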
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "cache_implementation": "hybrid",
+   "do_sample": true,
+   "eos_token_id": [
+     1,
+     50,
+     106
+   ],
+   "top_k": 64,
+   "top_p": 0.95,
+   "transformers_version": "5.0.0.dev0",
+   "trust_remote_code": false
+ }
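To make the sampling settings above concrete, here is a small pure-Python sketch of top-k followed by top-p (nucleus) filtering, plus a stop check against the three `eos_token_id` values. It is a simplified illustration under stated assumptions, not the implementation used by Transformers or Transformers.js; the names `filter_top_k_top_p` and `is_stop_token` are hypothetical, and the toy logits are invented.

```python
import math

EOS_TOKEN_IDS = {1, 50, 106}  # copied from generation_config.json


def filter_top_k_top_p(logits, top_k=64, top_p=0.95):
    """Return {token_id: probability} after top-k, then top-p filtering."""
    # Keep the ids of the top_k highest logits, best first.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Softmax over the survivors; subtracting the peak keeps exp() stable.
    peak = logits[ranked[0]]
    weights = [math.exp(logits[i] - peak) for i in ranked]
    total = sum(weights)
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token_id, weight in zip(ranked, weights):
        prob = weight / total
        kept.append((token_id, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    norm = sum(prob for _, prob in kept)
    return {token_id: prob / norm for token_id, prob in kept}


def is_stop_token(token_id: int) -> bool:
    """Generation ends when any of the configured EOS ids is produced."""
    return token_id in EOS_TOKEN_IDS


# Toy example with four tokens; a real vocabulary has 262144 entries.
candidates = filter_top_k_top_p([2.0, 1.0, 0.0, -1.0], top_k=3, top_p=0.8)
print(candidates)
```

Because `do_sample` is true, the next token would then be drawn at random from the surviving candidates rather than taken greedily.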
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07c362225485a837effc6e21834b76fe542c861f522944063dd757a33336dd3f
+ size 502654
onnx/model.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b513fd3a9d1633a21ebbc4330e0c193a7e118915d21f17079fe672a7dfb546b6
+ size 1139501568
onnx/model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:16bf5d6e9249e11e52306a3dc7fc8a9b2e85eac1a7e3f9e884614d4eb6465b4c
+ size 619409
onnx/model_fp16.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f4360c8c74dd9d2315c7a367baa65383338af0d51241632ab403fc00bb57c375
+ size 569862656
onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f3f12f387ee22d2b8c1b308bb6b2967ceb7c3466688950c5d7ccee29620ecede
+ size 430147
onnx/model_q4.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0d717b24f73233099ff714cd75abf8990fbb17ef1c75bd026a629800d3e3e3ec
+ size 801090048
onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8dc9fb5e2b0aa34f527309f0ecaeb9b824b5ad9a9613350168753054c180e145
+ size 518626
onnx/model_q4f16.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b30ca95e4b31014ec791d7589f8c6416b8056ffc4f39093aa7ceb3ad37f2a0c7
+ size 425724416
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:69fde4ada54844b6a7b94494e97f93c581c80cc6610c87e7b45d223077542169
+ size 20316979
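The ONNX and tokenizer entries above are Git LFS pointer files: three `key value` lines giving the spec version, the content hash, and the byte size of the real file. A minimal sketch of reading one follows; the function name `parse_lfs_pointer` and the returned field names are hypothetical, and the pointer text is the `onnx/model_q4f16.onnx` entry copied from this commit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    # Each non-empty line is "<key> <value>", separated by a single space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size in bytes of the file the pointer stands for
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:8dc9fb5e2b0aa34f527309f0ecaeb9b824b5ad9a9613350168753054c180e145
size 518626
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The small `.onnx` pointers describe the graph files, while the matching `.onnx_data` pointers describe the much larger external weight files, which is what the `use_external_data_format` entries in `config.json` refer to.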
tokenizer_config.json ADDED
@@ -0,0 +1,28 @@
+ {
+ "additional_special_tokens": null,
+ "backend": "tokenizers",
+ "boi_token": "<start_of_image>",
+ "bos_token": "<bos>",
+ "clean_up_tokenization_spaces": false,
+ "eoi_token": "<end_of_image>",
+ "eos_token": "<eos>",
+ "image_token": "<image_soft_token>",
+ "is_local": false,
+ "mask_token": "<mask>",
+ "model_max_length": 1000000000000000019884624838656,
+ "model_specific_special_tokens": {
+ "boi_token": "<start_of_image>",
+ "eoi_token": "<end_of_image>",
+ "image_token": "<image_soft_token>",
+ "sfr_token": "<start_function_response>"
+ },
+ "pad_token": "<pad>",
+ "padding_side": "left",
+ "sfr_token": "<start_function_response>",
+ "sp_model_kwargs": null,
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "GemmaTokenizer",
+ "unk_token": "<unk>",
+ "use_default_system_prompt": false,
+ "chat_template": "{%- macro format_parameters(properties, required) -%}\n {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}\n {%- set ns = namespace(found_first=false) -%}\n {%- for key, value in properties | dictsort -%}\n {%- if key not in standard_keys -%}\n {%- if ns.found_first %},{% endif -%}\n {%- set ns.found_first = true -%}\n {{- key }}:{description:<escape>{{ value['description'] }}<escape>\n {%- if value['type'] | upper == 'STRING' -%}\n {%- if value['enum'] -%}\n ,enum:{{ format_argument(value['enum']) }}\n {%- endif -%}\n {%- elif value['type'] | upper == 'OBJECT' -%}\n ,properties:{\n {%- if value['properties'] is defined and value['properties'] is mapping -%}\n {{- format_parameters(value['properties'], value['required'] | default([])) -}}\n {%- elif value is mapping -%}\n {{- format_parameters(value, value['required'] | default([])) -}}\n {%- endif -%}\n }\n {%- if value['required'] -%}\n ,required:[\n {%- for item in value['required'] | default([]) -%}\n <escape>{{- item -}}<escape>\n {%- if not loop.last %},{% endif -%}\n {%- endfor -%}\n ]\n {%- endif -%}\n {%- elif value['type'] | upper == 'ARRAY' -%}\n {%- if value['items'] is mapping and value['items'] -%}\n ,items:{\n {%- set ns_items = namespace(found_first=false) -%}\n {%- for item_key, item_value in value['items'] | dictsort -%}\n {%- if item_value is not none -%}\n {%- if ns_items.found_first %},{% endif -%}\n {%- set ns_items.found_first = true -%}\n {%- if item_key == 'properties' -%}\n properties:{\n {%- if item_value is mapping -%}\n {{- format_parameters(item_value, value['items']['required'] | default([])) -}}\n {%- endif -%}\n }\n {%- elif item_key == 'required' -%}\n required:[\n {%- for req_item in item_value -%}\n <escape>{{- req_item -}}<escape>\n {%- if not loop.last %},{% endif -%}\n {%- endfor -%}\n ]\n {%- elif item_key == 'type' -%}\n {%- if item_value is string -%}\n type:{{ format_argument(item_value | upper) }}\n {%- else -%}\n 
type:{{ format_argument(item_value | map('upper') | list) }}\n {%- endif -%}\n {%- else -%}\n {{ item_key }}:{{ format_argument(item_value) }}\n {%- endif -%}\n {%- endif -%}\n {%- endfor -%}\n }\n {%- endif -%}\n {%- endif -%}\n ,type:<escape>{{ value['type'] | upper }}<escape>}\n {%- endif -%}\n {%- endfor -%}\n{%- endmacro -%}\n{% macro format_function_declaration(tool_data) -%}\ndeclaration:{{- tool_data['function']['name'] -}}\n{description:<escape>{{- tool_data['function']['description'] -}}<escape>\n{%- set params = tool_data['function']['parameters'] -%}\n{%- if params -%}\n ,parameters:{\n {%- if params['properties'] -%}\n properties:{ {{- format_parameters(params['properties'], params['required']) -}} },\n {%- endif -%}\n {%- if params['required'] -%}\n required:[\n {%- for item in params['required'] -%}\n <escape>{{- item -}}<escape>\n {{- ',' if not loop.last -}}\n {%- endfor -%}\n ],\n {%- endif -%}\n {%- if params['type'] -%}\n type:<escape>{{- params['type'] | upper -}}<escape>}\n {%- endif -%}\n{%- endif -%}\n}\n{%- endmacro -%}\n{% macro format_argument(argument, escape_keys=True) -%}\n{%- if argument is string -%}\n {{- '<escape>' + argument + '<escape>' -}}\n{%- elif argument is boolean -%}\n {%- if argument -%}\n {{- 'true' -}}\n {%- else -%}\n {{- 'false' -}}\n {%- endif -%}\n{%- elif argument is mapping -%}\n {{- '{' -}}\n {%- set ns = namespace(found_first=false) -%}\n {%- for key, value in argument | dictsort -%}\n {%- if ns.found_first %},{% endif -%}\n {%- set ns.found_first = true -%}\n {%- if escape_keys -%}\n {{- '<escape>' + key + '<escape>' -}}\n {%- else -%}\n {{- key -}}\n {%- endif -%}\n :{{- format_argument(value, escape_keys=escape_keys) -}}\n {%- endfor -%}\n {{- '}' -}}\n{%- elif argument is iterable -%}\n {{- '[' -}}\n {%- for item in argument -%}\n {{- format_argument(item, escape_keys=escape_keys) -}}\n {%- if not loop.last %},{% endif -%}\n {%- endfor -%}\n {{- ']' -}}\n{%- else -%}\n {{- argument -}}\n{%- endif -%}\n{%- 
endmacro -%}\n{{ bos_token }}\n{%- set ns = namespace(prev_message_type=None) -%}\n{#- Tool Declarations -#}\n{%- set loop_messages = messages -%}\n{%- if tools or messages[0]['role'] == 'system' or messages[0]['role'] == 'developer' -%}\n {{- '<start_of_turn>developer\\n' -}}\n {%- if messages[0]['role'] == 'system' or messages[0]['role'] == 'developer' -%}\n {%- if messages[0]['content'] is string -%}\n {{- messages[0]['content'] | trim -}}\n {%- elif messages[0]['content'] is iterable -%}\n {%- for item in messages[0]['content'] -%}\n {%- if item['type'] == 'text' -%}\n {{- item['text'] | trim -}}\n {%- endif -%}\n {%- endfor -%}\n {%- endif -%}\n {%- set loop_messages = messages[1:] -%}\n {%- endif -%}\n {%- if tools -%}\n {%- for tool in tools %}\n {{- '<start_function_declaration>' -}}\n {{- format_function_declaration(tool) | trim }}\n {{- '<end_function_declaration>' -}}\n {%- endfor %}\n {%- endif -%}\n {{- '<end_of_turn>\\n' }}\n{%- endif %}\n{#- Loop through messages. -#}\n{%- for message in loop_messages -%}\n {%- if (message['role'] == 'assistant') -%}\n {#- Rename \"assistant\" to \"model\". 
-#}\n {%- set role = \"model\" -%}\n {%- else -%}\n {%- set role = message['role'] -%}\n {%- endif -%}\n {%- if role != 'tool' -%}\n {%- if ns.prev_message_type != 'tool_response' -%}\n {{- '<start_of_turn>' + role + '\\n' }}\n {%- endif -%}\n {%- set ns.prev_message_type = None -%}\n {%- if 'content' in message and message['content'] is not none -%}\n {%- if message['content'] is string -%}\n {{ message['content'] | trim }}\n {%- elif message['content'] is iterable -%}\n {%- for item in message['content'] -%}\n {%- if item['type'] == 'image' -%}\n {{ '<start_of_image>' }}\n {%- elif item['type'] == 'text' -%}\n {{ item['text'] | trim }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{ raise_exception(\"Invalid content type in user/assistant message\") }}\n {%- endif -%}\n {%- set ns.prev_message_type = 'content' -%}\n {%- endif -%}\n {%- if 'tool_calls' in message and message['tool_calls'] and message['tool_calls'] is iterable -%}\n {#- Tool Calls -#}\n {%- for tool_call in message['tool_calls'] -%}\n {% set function = tool_call['function'] %}\n {{- '<start_function_call>call:' + function['name'] + '{' -}}\n {%- if 'arguments' in function -%}\n {%- if function['arguments'] is mapping -%}\n {%- set ns = namespace(found_first=false) -%}\n {%- for key, value in function['arguments'] | dictsort -%}\n {%- if ns.found_first %},{% endif -%}\n {%- set ns.found_first = true -%}\n {{- key -}}:{{- format_argument(value, escape_keys=False) -}}\n {%- endfor -%}\n {%- elif function['arguments'] is string -%}\n {# This handles string-JSON, just in case #}\n {{ function['arguments'] }}\n {%- endif %}\n {%- endif -%}\n {{- '}<end_function_call>' -}}\n {%- endfor -%}\n {%- if loop.last -%}\n {{ '<start_function_response>' }}\n {%- endif -%}\n {%- set ns.prev_message_type = 'tool_call' -%}\n {%- endif -%}\n {%- else -%}\n {#- Tool Responses -#}\n {%- if 'content' in message and message['content'] -%}\n {%- if message['content'] is mapping -%}\n {%- if 'name' in 
message['content'] and 'response' in message['content'] -%}\n {{ '<start_function_response>response:' + message['content']['name'] | trim + '{' }}\n {%- set response_ns = namespace(found_first=false) -%}\n {%- for key, value in message['content']['response'] | dictsort -%}\n {%- if response_ns.found_first %},{% endif -%}\n {%- set response_ns.found_first = true -%}\n {{- key -}}:{{- format_argument(value, escape_keys=False) -}}\n {%- endfor -%}\n {{- '}<end_function_response>' -}}\n {%- elif 'name' in message -%}\n {{ '<start_function_response>response:' + message['name'] | trim + '{' }}\n {%- set response_ns = namespace(found_first=false) -%}\n {%- for key, value in message['content'] | dictsort -%}\n {%- if response_ns.found_first %},{% endif -%}\n {%- set response_ns.found_first = true -%}\n {{- key -}}:{{- format_argument(value, escape_keys=False) -}}\n {%- endfor -%}\n {{- '}<end_function_response>' -}}\n {%- else -%}\n {{ raise_exception(\"Invalid tool response mapping: must contain 'name' and 'response' keys, or 'name' must be in the message.\") }}\n {%- endif -%}\n {%- elif message['content'] is string -%}\n {%- if 'name' in message -%}\n {{ '<start_function_response>response:' + message['name'] | trim + '{value:' + format_argument(message['content'], escape_keys=False) + '}<end_function_response>' }}\n {%- else -%}\n {{ raise_exception(\"Invalid tool response: 'name' must be provided.\") }}\n {%- endif -%}\n {%- elif message['content'] is iterable -%}\n {%- for item in message['content'] -%}\n {%- if item is mapping -%}\n {%- if 'name' in item and 'response' in item -%}\n {{ '<start_function_response>response:' + item['name'] | trim + '{' }}\n {%- set response_ns = namespace(found_first=false) -%}\n {%- for key, value in item['response'] | dictsort -%}\n {%- if response_ns.found_first %},{% endif -%}\n {%- set response_ns.found_first = true -%}\n {{- key -}}:{{- format_argument(value, escape_keys=False) -}}\n {%- endfor -%}\n {{- '}<end_function_response>' 
-}}\n {%- elif 'name' in message -%}\n {{ '<start_function_response>response:' + message['name'] | trim + '{' }}\n {%- set response_ns = namespace(found_first=false) -%}\n {%- for key, value in item | dictsort -%}\n {%- if response_ns.found_first %},{% endif -%}\n {%- set response_ns.found_first = true -%}\n {{- key -}}:{{- format_argument(value, escape_keys=False) -}}\n {%- endfor -%}\n {{- '}<end_function_response>' -}}\n {%- else -%}\n {{ raise_exception(\"Invalid tool response mapping: must contain 'name' and 'response' keys, or 'name' must be in the message.\") }}\n {%- endif -%}\n {%- else -%}\n {{ raise_exception(\"Invalid tool response message: multiple responses must all be mappings\") }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{ raise_exception(\"Invalid content type in tool message: must be mapping, sequence of mappings, or string.\") }}\n {%- endif -%}\n {%- endif -%}\n {%- set ns.prev_message_type = 'tool_response' -%}\n {%- endif -%}\n {%- if ns.prev_message_type not in ['tool_call', 'tool_response'] -%}\n {{ '<end_of_turn>\\n' }}\n {%- endif -%}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n {%- if ns.prev_message_type != 'tool_response' -%}\n {{- '<start_of_turn>model\\n' -}}\n {%- endif -%}\n{%- endif -%}\n"
+ }