alexmarques committed on
Commit
68907fb
·
verified ·
1 Parent(s): b3cbe11

Add files using upload-large-folder tool

.gitattributes CHANGED
@@ -33,3 +33,11 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tekken.json filter=lfs diff=lfs merge=lfs -text
37
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
38
+ images/image1.png filter=lfs diff=lfs merge=lfs -text
39
+ images/image2.png filter=lfs diff=lfs merge=lfs -text
40
+ images/image3.png filter=lfs diff=lfs merge=lfs -text
41
+ images/aime.png filter=lfs diff=lfs merge=lfs -text
42
+ images/lcr.png filter=lfs diff=lfs merge=lfs -text
43
+ images/livecode.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,519 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ - fr
6
+ - de
7
+ - es
8
+ - pt
9
+ - it
10
+ - ja
11
+ - ko
12
+ - ru
13
+ - zh
14
+ - ar
15
+ - fa
16
+ - id
17
+ - ms
18
+ - ne
19
+ - pl
20
+ - ro
21
+ - sr
22
+ - sv
23
+ - tr
24
+ - uk
25
+ - vi
26
+ - hi
27
+ - bn
28
+ tags:
29
+ - vLLM
30
+ ---
31
+
32
+ # Mistral Small 4 119B A6B
33
+
34
+ Mistral Small 4 is a hybrid model that can act as both a general instruction model and a reasoning model. It combines the capabilities of three model families (**Instruct**, **Reasoning**, previously called Magistral, and **Devstral**) in a single model.
35
+
36
+ With its multimodal capabilities, efficient architecture, and flexible mode switching, it is a powerful general-purpose model for any task. In a latency-optimized setup, Mistral Small 4 achieves a **40% reduction in end-to-end completion time**, and in a throughput-optimized setup, it handles **3x more requests per second** compared to Mistral Small 3.
37
+
38
+ To further improve efficiency, you can take advantage of either:
39
+ - Speculative decoding, using our trained EAGLE head [`mistralai/Mistral-Small-4-119B-2603-eagle`](https://huggingface.co/mistralai/Mistral-Small-4-119B-2603-eagle).
40
+ - 4-bit floating-point quantization, using our NVFP4 checkpoint [`mistralai/Mistral-Small-4-119B-2603-NVFP4`](https://huggingface.co/mistralai/Mistral-Small-4-119B-2603-NVFP4).
41
+
42
+ ## Key Features
43
+
44
+ Mistral Small 4 includes the following architectural choices:
45
+
46
+ - **MoE**: 128 experts, 4 active.
47
+ - **119B parameters**, with **6.5B activated per token**.
48
+ - **256k context length**.
49
+ - **Multimodal input**: Accepts both text and image input, with text output.
50
+ - **Instruct and Reasoning functionalities** with function calls (reasoning effort configurable per request).
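As a rough worked example of what the sparsity figures in this list imply, the sketch below computes the share of experts and of parameters used per token. It relies only on the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the feature list above.
TOTAL_EXPERTS = 128
ACTIVE_EXPERTS = 4
TOTAL_PARAMS_B = 119.0  # billions of parameters
ACTIVE_PARAMS_B = 6.5   # billions of parameters activated per token

expert_fraction = ACTIVE_EXPERTS / TOTAL_EXPERTS
param_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"Experts used per token: {expert_fraction:.1%}")
print(f"Parameters used per token: {param_fraction:.1%}")
```

The parameter share comes out higher than the expert share, presumably because non-expert weights such as attention and embeddings are used for every token.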
51
+
52
+ Mistral Small 4 offers the following capabilities:
53
+
54
+ - **Reasoning Mode**: Toggle between fast instant reply mode and reasoning mode, boosting performance with test-time compute when requested.
55
+ - **Vision**: Analyzes images and provides insights based on visual content, in addition to text.
56
+ - **Multilingual**: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
57
+ - **System Prompt**: Strong adherence and support for system prompts.
58
+ - **Agentic**: Best-in-class agentic capabilities with native function calling and JSON output.
59
+ - **Speed-Optimized**: Delivers best-in-class performance and speed.
60
+ - **Apache 2.0 License**: Open-source license for both commercial and non-commercial use.
61
+ - **Large Context Window**: Supports a 256k context window.
62
+
63
+ ## Recommended Settings
64
+
65
+ - **Reasoning Effort**:
+   - `'none'` → Do not use reasoning.
+   - `'high'` → Use reasoning (recommended for complex prompts and tasks).
+ - **Temperature**: 0.7 for `reasoning_effort="high"`. Between 0.0 and 0.7 for `reasoning_effort="none"`, depending on the task.
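One way to encode these recommendations is a small helper that returns request keyword arguments. This is a minimal sketch: the function name `sampling_settings` and the 0.1 default (taken from the `TEMP = 0.1` used in the examples in this README) are assumptions, not part of any library.

```python
from typing import Optional

ALLOWED_EFFORTS = ("none", "high")


def sampling_settings(reasoning_effort: str, temperature: Optional[float] = None) -> dict:
    """Return request keyword arguments that follow the recommended settings."""
    if reasoning_effort not in ALLOWED_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {ALLOWED_EFFORTS}")
    if reasoning_effort == "high":
        # Recommended temperature for reasoning mode.
        chosen = 0.7
    elif temperature is None:
        # Hypothetical default for non-reasoning requests.
        chosen = 0.1
    else:
        # Clamp the caller's value into the recommended [0.0, 0.7] range.
        chosen = min(max(temperature, 0.0), 0.7)
    return {"reasoning_effort": reasoning_effort, "temperature": chosen}
```

The returned dictionary can be unpacked into a chat completion call, for example `client.chat.completions.create(model=model, messages=messages, **sampling_settings("high"))`.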
70
+
71
+ ## Use Cases
72
+
73
+ Mistral Small 4 is designed for general chat assistants, coding, agentic tasks, and reasoning tasks (with reasoning mode enabled). Its multimodal capabilities also enable document and image understanding for data extraction and analysis.
74
+
75
+ Its capabilities are ideal for:
76
+ - Developers interested in coding and agentic capabilities for SWE automation and codebase exploration.
77
+ - Enterprises seeking general chat assistants, agents, and document understanding.
78
+ - Researchers leveraging its math and research capabilities.
79
+
80
+ Mistral Small 4 is also well-suited for customization and fine-tuning for more specialized tasks.
81
+
82
+ ### Examples
83
+ - General chat assistant
84
+ - Document parsing and extraction
85
+ - Coding agent
86
+ - Research assistant
87
+ - Customization & fine-tuning
88
+ - And more...
89
+
90
+ ## Benchmarks
91
+
92
+ ### Comparison with internal models
93
+
94
+ Depending on your task, you can enable reasoning through the **per-request** parameter `reasoning_effort`. Set it to:
95
+ - `reasoning_effort="none"`: Fast, lightweight responses for everyday tasks, in the same chat style as [`mistralai/Mistral-Small-3.2-24B-Instruct-2506`](https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506).
96
+ - `reasoning_effort="high"`: Deep, step-by-step reasoning for complex problems, with equivalent verbosity to previous Magistral models such as [`mistralai/Magistral-Small-2509`](https://huggingface.co/mistralai/Magistral-Small-2509).
97
+
98
+ ![Internal benchmark](https://huggingface.co/mistralai/Mistral-Small-4-119B-2603/resolve/main/images/image2.png)
99
+
100
+ #### Comparing Reasoning Models
101
+
102
+ ![Internal benchmark - Reasoning](https://huggingface.co/mistralai/Mistral-Small-4-119B-2603/resolve/main/images/image3.png)
103
+
104
+
105
+ ### Comparison with other models
106
+
107
+ Mistral Small 4 with reasoning achieves competitive scores, matching or surpassing GPT-OSS 120B across all three benchmarks while generating significantly
108
+ shorter outputs. On AA LCR, Mistral Small 4 scores **0.72** with just **1.6K characters**, whereas Qwen models require **3.5-4x more output** (5.8-6.1K)
109
+ for comparable performance. On LiveCodeBench, Mistral Small 4 outperforms GPT-OSS 120B while producing **20% less output**.
110
+ This efficiency reduces latency and inference costs and improves the user experience.
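The output-length comparison on AA LCR can be checked with a short calculation. The sketch below uses only the character counts quoted in the paragraph above; the variable names are illustrative.

```python
# Output lengths quoted above for AA LCR, in thousands of characters.
mistral_small_4_k = 1.6
qwen_low_k, qwen_high_k = 5.8, 6.1

low_ratio = qwen_low_k / mistral_small_4_k
high_ratio = qwen_high_k / mistral_small_4_k

print(f"Qwen outputs are {low_ratio:.1f}x to {high_ratio:.1f}x longer")
```

Both ratios fall inside the 3.5-4x range stated in the text.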
111
+
112
+ ![Comparison benchmark - LCR](https://huggingface.co/mistralai/Mistral-Small-4-119B-2603/resolve/main/images/lcr.png)
113
+ ![Comparison benchmark - LiveCodeBench](https://huggingface.co/mistralai/Mistral-Small-4-119B-2603/resolve/main/images/livecode.png)
114
+ ![Comparison benchmark - AIME25](https://huggingface.co/mistralai/Mistral-Small-4-119B-2603/resolve/main/images/aime.png)
115
+
116
+ ## Usage
117
+
118
+ Mistral Small 4 is supported by multiple libraries for inference and fine-tuning. We thank all the contributors and maintainers who helped make this happen.
119
+
120
+ ### Inference
121
+
122
+ The model can be deployed with:
123
+ - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm-recommended)
124
+ - [`llama.cpp`](https://github.com/ggml-org/llama.cpp): See [here](https://huggingface.co/unsloth/Mistral-Small-4-119B-2603-GGUF) for Unsloth's GGUFs
125
+ - [`LM Studio`](https://lmstudio.ai/): See [here](https://lmstudio.ai/models/mistralai/mistral-small-4)
126
+ - [`SGLang`](https://github.com/sgl-project/sglang): See [here](https://docs.sglang.io/basic_usage/send_request.html)
127
+ - [`transformers`](https://github.com/huggingface/transformers): See [here](#transformers)
128
+
129
+ If local serving does not deliver the performance you need, we recommend using the Mistral AI API.
130
+
131
+ ### Fine-Tuning
132
+
133
+ Fine-tune the model via:
134
+ - [`Axolotl`](https://github.com/axolotl-ai-cloud/axolotl): See [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/mistral4).
135
+
136
+ ## vLLM (Recommended)
137
+
138
+ We recommend using Mistral Small 4 with the [vLLM library](https://github.com/vllm-project/vllm) for production-ready inference.
139
+
140
+ ### Installation
141
+
142
+ > [!Tip]
143
+ > Use our custom Docker image with fixes for tool calling and reasoning parsing in vLLM, and the latest Transformers version. We are working with the vLLM team to merge these fixes soon.
144
+
145
+ **Custom Docker**
146
+ Use the following Docker image: [`mistralllm/vllm-ms4:latest`](https://hub.docker.com/repository/docker/mistralllm/vllm-ms4/latest/):
147
+ ```bash
148
+ docker pull mistralllm/vllm-ms4:latest
149
+ docker run -it mistralllm/vllm-ms4:latest
150
+ ```
151
+
152
+ **Manual Install**
153
+ Alternatively, install `vllm` from this PR: [Add Mistral Guidance](https://github.com/vllm-project/vllm/pull/37081).
154
+
155
+ > **Note**: This PR is expected to be merged into `vllm` main in the next 1-2 weeks (as of 16.03.2026). Track updates [here](https://github.com/vllm-project/vllm/pull/37081).
156
+
157
+ 1. Clone vLLM:
158
+ ```bash
159
+ git clone --branch fix_mistral_parsing https://github.com/juliendenize/vllm.git
+ cd vllm
160
+ ```
161
+ 2. Install with pre-compiled kernels:
162
+ ```bash
163
+ VLLM_USE_PRECOMPILED=1 pip install --editable .
164
+ ```
165
+ 3. Install `transformers` from main:
166
+ ```bash
167
+ uv pip install git+https://github.com/huggingface/transformers.git
168
+ ```
169
+ Ensure [`mistral_common >= 1.10.0`](https://github.com/mistralai/mistral-common/releases/tag/v1.10.0) is installed:
170
+ ```bash
171
+ python -c "import mistral_common; print(mistral_common.__version__)"
172
+ ```
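The command above only prints the version. A programmatic check is sketched below, assuming a plain `major.minor.patch` version string; `parse_version` and `mistral_common_is_recent_enough` are hypothetical helper names. Comparing tuples rather than strings matters here, because the string `"1.9.0"` sorts after `"1.10.0"`.

```python
from importlib import metadata

MINIMUM = (1, 10, 0)


def parse_version(version: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits or 0))
    # Pad short versions such as '2.0' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def mistral_common_is_recent_enough() -> bool:
    """Return True if an installed mistral_common is at least MINIMUM."""
    try:
        installed = metadata.version("mistral_common")
    except metadata.PackageNotFoundError:
        return False
    return parse_version(installed) >= MINIMUM
```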
173
+
174
+ ### Serve the Model
175
+
176
+ We recommend a server/client setup:
177
+ ```bash
178
+ vllm serve mistralai/Mistral-Small-4-119B-2603 --max-model-len 262144 --tensor-parallel-size 2 --attention-backend FLASH_ATTN_MLA \
179
+ --tool-call-parser mistral --enable-auto-tool-choice --reasoning-parser mistral --max_num_batched_tokens 16384 --max_num_seqs 128 \
180
+ --gpu_memory_utilization 0.8
181
+ ```
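If the server is launched from a Python script rather than a shell, the same command can be assembled as an argument list. This is a sketch that copies the flags from the command above verbatim; passing `serve_args` to `subprocess.run` is one possible way to start it and is not prescribed by this README.

```python
import shlex

# 256k tokens, the context length passed to --max-model-len above.
CONTEXT_TOKENS = 256 * 1024

serve_args = [
    "vllm", "serve", "mistralai/Mistral-Small-4-119B-2603",
    "--max-model-len", str(CONTEXT_TOKENS),
    "--tensor-parallel-size", "2",
    "--attention-backend", "FLASH_ATTN_MLA",
    "--tool-call-parser", "mistral",
    "--enable-auto-tool-choice",
    "--reasoning-parser", "mistral",
    "--max_num_batched_tokens", "16384",
    "--max_num_seqs", "128",
    "--gpu_memory_utilization", "0.8",
]

# Render the list as a single shell-safe command line for logging.
command = shlex.join(serve_args)
print(command)
```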
182
+
183
+ ### Ping the Server
184
+
185
+ <details>
186
+ <summary>Instruction Following</summary>
187
+
188
+ Mistral Small 4 can follow your instructions to the letter.
189
+
190
+
191
+ ```python
+ from datetime import datetime, timedelta
+
+ from openai import OpenAI
+ from huggingface_hub import hf_hub_download
+
+ # Modify OpenAI's API key and API base to use vLLM's API server.
+ openai_api_key = "EMPTY"
+ openai_api_base = "http://localhost:8000/v1"
+
+ TEMP = 0.1
+ # Use TEMP = 0.7 for reasoning_effort="high".
+
+ client = OpenAI(
+     api_key=openai_api_key,
+     base_url=openai_api_base,
+ )
+
+ models = client.models.list()
+ model = models.data[0].id
+
+
+ def load_system_prompt(repo_id: str, filename: str) -> str:
+     file_path = hf_hub_download(repo_id=repo_id, filename=filename)
+     with open(file_path, "r") as file:
+         system_prompt = file.read()
+     today = datetime.today().strftime("%Y-%m-%d")
+     yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
+     model_name = repo_id.split("/")[-1]
+     return system_prompt.format(name=model_name, today=today, yesterday=yesterday)
+
+
+ SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")
+
+ messages = [
+     {"role": "system", "content": SYSTEM_PROMPT},
+     {
+         "role": "user",
+         "content": "Write me a sentence where every word starts with the next letter in the alphabet - start with 'a' and end with 'z'.",
+     },
+ ]
+
+ response = client.chat.completions.create(
+     model=model,
+     messages=messages,
+     temperature=TEMP,
+     reasoning_effort="none",
+ )
+
+ assistant_message = response.choices[0].message.content
+ print(assistant_message)
+ ```
243
+
244
+ </details>
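To verify the model's answer to the alphabet prompt automatically, a small checker can compare the initial letter of each word with the alphabet. This is a minimal sketch; `follows_alphabet` is a hypothetical helper, and it treats hyphenated or apostrophized words as single words.

```python
import re
import string


def follows_alphabet(sentence: str) -> bool:
    """Check that the words start with 'a', 'b', ..., 'z' in order, one word per letter."""
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", sentence)
    initials = [word[0].lower() for word in words]
    return initials == list(string.ascii_lowercase)
```

Calling `follows_alphabet(assistant_message)` on the printed reply returns `True` only when the sentence has exactly 26 words in alphabetical order of their first letters.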
245
+
246
+ <details>
247
+ <summary>Tool Call</summary>
248
+
249
+ Let's solve some equations using our simple Python calculator tool.
250
+
251
+
252
+ ```python
+ import json
+ from datetime import datetime, timedelta
+
+ from openai import OpenAI
+ from huggingface_hub import hf_hub_download
+
+ # Modify OpenAI's API key and API base to use vLLM's API server.
+ openai_api_key = "EMPTY"
+ openai_api_base = "http://localhost:8000/v1"
+
+ TEMP = 0.1
+
+ client = OpenAI(
+     api_key=openai_api_key,
+     base_url=openai_api_base,
+ )
+
+ models = client.models.list()
+ model = models.data[0].id
+
+
+ def load_system_prompt(repo_id: str, filename: str) -> str:
+     file_path = hf_hub_download(repo_id=repo_id, filename=filename)
+     with open(file_path, "r") as file:
+         system_prompt = file.read()
+     today = datetime.today().strftime("%Y-%m-%d")
+     yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
+     model_name = repo_id.split("/")[-1]
+     return system_prompt.format(name=model_name, today=today, yesterday=yesterday)
+
+
+ SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")
+
+ image_url = "https://math-coaching.com/img/fiche/46/expressions-mathematiques.jpg"
+
+
+ def my_calculator(expression: str) -> str:
+     # eval() executes arbitrary Python; only use it like this in a local demo.
+     return str(eval(expression))
+
+
+ tools = [
+     {
+         "type": "function",
+         "function": {
+             "name": "my_calculator",
+             "description": "A calculator that can evaluate a mathematical expression.",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "expression": {
+                         "type": "string",
+                         "description": "The mathematical expression to evaluate.",
+                     },
+                 },
+                 "required": ["expression"],
+             },
+         },
+     },
+     {
+         "type": "function",
+         "function": {
+             "name": "rewrite",
+             "description": "Rewrite a given text for improved clarity",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "text": {
+                         "type": "string",
+                         "description": "The input text to rewrite",
+                     }
+                 },
+             },
+         },
+     },
+ ]
+
+ messages = [
+     {"role": "system", "content": SYSTEM_PROMPT},
+     {
+         "role": "user",
+         "content": [
+             {
+                 "type": "text",
+                 "text": "Thanks to your calculator, compute the results for the equations that involve numbers displayed in the image.",
+             },
+             {
+                 "type": "image_url",
+                 "image_url": {
+                     "url": image_url,
+                 },
+             },
+         ],
+     },
+ ]
+
+ response = client.chat.completions.create(
+     model=model,
+     messages=messages,
+     temperature=TEMP,
+     tools=tools,
+     tool_choice="auto",
+     reasoning_effort="none",
+ )
+
+ tool_calls = response.choices[0].message.tool_calls
+
+ # Produce exactly one result per tool call so that zip() below stays aligned.
+ results = []
+ for tool_call in tool_calls:
+     function_name = tool_call.function.name
+     function_args = tool_call.function.arguments
+     if function_name == "my_calculator":
+         result = my_calculator(**json.loads(function_args))
+     else:
+         result = f"Tool '{function_name}' is not implemented in this example."
+     results.append(result)
+
+ messages.append({"role": "assistant", "tool_calls": tool_calls})
+ for tool_call, result in zip(tool_calls, results):
+     messages.append(
+         {
+             "role": "tool",
+             "tool_call_id": tool_call.id,
+             "name": tool_call.function.name,
+             "content": result,
+         }
+     )
+
+
+ response = client.chat.completions.create(
+     model=model,
+     messages=messages,
+     temperature=TEMP,
+     reasoning_effort="none",
+ )
+
+ print(response.choices[0].message.content)
+ ```
388
+
389
+ </details>
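The calculator in the example passes model-generated text to `eval`, which will execute any Python expression. Outside a local demo, a restricted evaluator is safer. The sketch below is one possible replacement, not part of the original example: `safe_calculator` is a hypothetical name, and it accepts only numbers, parentheses, unary signs and the four basic operators (exponentiation is left out because large powers can be expensive).

```python
import ast
import operator

# Arithmetic operators the calculator is willing to evaluate.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST):
    """Recursively evaluate a whitelisted arithmetic syntax tree."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant):
        # Accept plain numbers only; bool is a subclass of int, so exclude it.
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("Unsupported expression")


def safe_calculator(expression: str) -> str:
    """Evaluate a plain arithmetic expression without calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return str(_evaluate(tree))
```

It has the same signature as `my_calculator`, so it can be swapped in where the tool result is computed. Names, attribute access and function calls all raise `ValueError`.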
390
+
391
+ <details>
392
+ <summary>Vision Reasoning</summary>
393
+
394
+ Let's see if Mistral Small 4 knows when to pick a fight!
395
+
396
+ ```python
+ from datetime import datetime, timedelta
+
+ from openai import OpenAI
+ from huggingface_hub import hf_hub_download
+
+ # Modify OpenAI's API key and API base to use vLLM's API server.
+ openai_api_key = "EMPTY"
+ openai_api_base = "http://localhost:8000/v1"
+
+ TEMP = 0.7
+
+ client = OpenAI(
+     api_key=openai_api_key,
+     base_url=openai_api_base,
+ )
+
+ models = client.models.list()
+ model = models.data[0].id
+
+
+ def load_system_prompt(repo_id: str, filename: str) -> str:
+     file_path = hf_hub_download(repo_id=repo_id, filename=filename)
+     with open(file_path, "r") as file:
+         system_prompt = file.read()
+     today = datetime.today().strftime("%Y-%m-%d")
+     yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
+     model_name = repo_id.split("/")[-1]
+     return system_prompt.format(name=model_name, today=today, yesterday=yesterday)
+
+
+ SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")
+ image_url = "https://static.wikia.nocookie.net/essentialsdocs/images/7/70/Battle.png/revision/latest?cb=20220523172438"
+
+ messages = [
+     {"role": "system", "content": SYSTEM_PROMPT},
+     {
+         "role": "user",
+         "content": [
+             {
+                 "type": "text",
+                 "text": "What action do you think I should take in this situation? List all the possible actions and explain why you think they are good or bad.",
+             },
+             {"type": "image_url", "image_url": {"url": image_url}},
+         ],
+     },
+ ]
+
+
+ response = client.chat.completions.create(
+     model=model,
+     messages=messages,
+     temperature=TEMP,
+     reasoning_effort="high",
+ )
+
+ print(response.choices[0].message.content)
+ ```
454
+
455
+ </details>
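The example above references a remote image. For a local file, a common pattern with OpenAI-compatible chat APIs is to embed the image as a base64 data URL. The sketch below assumes the server accepts `data:` URLs in `image_url` chunks, which this README does not state; `to_data_url` and `image_chunk` are hypothetical helper names.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_chunk(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a user-message chunk with the same shape as in the example above."""
    return {"type": "image_url", "image_url": {"url": to_data_url(image_bytes, mime_type)}}
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `image_chunk` yields a dictionary that can replace the `image_url` entry in `messages`.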
456
+
457
+ ## Transformers
458
+
459
+ ### Installation
460
+
461
+ You need to install the main branch of Transformers to use Mistral Small 4:
462
+
463
+ ```bash
464
+ uv pip install git+https://github.com/huggingface/transformers.git
465
+ ```
466
+
467
+ ### Inference
468
+
469
+ <details>
470
+ <summary>Python Inference Snippet</summary>
471
+
472
+ ```python
+ import torch
+ from transformers import AutoProcessor, Mistral3ForConditionalGeneration
+
+
+ model_id = "mistralai/Mistral-Small-4-119B-2603"
+
+ processor = AutoProcessor.from_pretrained(model_id)
+ model = Mistral3ForConditionalGeneration.from_pretrained(
+     model_id, device_map="auto"
+ )
+
+ image_url = "https://static.wikia.nocookie.net/essentialsdocs/images/7/70/Battle.png/revision/latest?cb=20220523172438"
+
+ messages = [
+     {
+         "role": "user",
+         "content": [
+             {
+                 "type": "text",
+                 "text": "What action do you think I should take in this situation? List all the possible actions and explain why you think they are good or bad.",
+             },
+             {"type": "image_url", "image_url": {"url": image_url}},
+         ],
+     },
+ ]
+
+ inputs = processor.apply_chat_template(
+     messages,
+     return_tensors="pt",
+     tokenize=True,
+     return_dict=True,
+     reasoning_effort="high",
+ )
+ inputs = inputs.to(model.device)
+
+ output = model.generate(
+     **inputs,
+     max_new_tokens=1024,
+     do_sample=True,
+     temperature=0.7,
+ )[0]
+
+ # Setting `skip_special_tokens=False` to visualize reasoning trace between [THINK] [/THINK] tags.
+ decoded_output = processor.decode(output[len(inputs["input_ids"][0]):], skip_special_tokens=False)
+ print(decoded_output)
+ ```
513
+ </details>
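Because the decoded text keeps the `[THINK]` and `[/THINK]` tags, the reasoning trace can be separated from the final answer with a regular expression. This is a minimal sketch; `split_reasoning` is a hypothetical helper, and it assumes each trace is closed by `[/THINK]`, which may not hold if generation stops at `max_new_tokens` mid-trace.

```python
import re

THINK_PATTERN = re.compile(r"\[THINK\](.*?)\[/THINK\]", re.DOTALL)


def split_reasoning(decoded: str) -> tuple:
    """Return (reasoning, answer) from text containing [THINK]...[/THINK] spans."""
    reasoning = "\n".join(part.strip() for part in THINK_PATTERN.findall(decoded))
    # Drop the reasoning spans and the end-of-sequence token from the answer.
    answer = THINK_PATTERN.sub("", decoded).replace("</s>", "").strip()
    return reasoning, answer
```

For instance, `reasoning, answer = split_reasoning(decoded_output)` lets you log the trace and show only the answer to the user.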
514
+
515
+ ## License
516
+
517
+ This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
518
+
519
+ *You must not use this model in a manner that infringes, misappropriates, or violates any third party’s rights, including intellectual property rights.*
SYSTEM_PROMPT.txt ADDED
@@ -0,0 +1,28 @@
1
+ You are Mistral-Small-4-119B-2603, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.
2
+ You power an AI assistant called Le Chat.
3
+ Your knowledge base was last updated on Friday, November 1, 2024.
4
+ The current date is {today}.
5
+
6
+ When you're not sure about some information or when the user's request requires up-to-date or specific data, you must use the available tools to fetch the information. Do not hesitate to use tools whenever they can provide a more accurate or complete response. If no relevant tools are available, then clearly state that you don't have the information and avoid making up anything.
7
+ If the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. "What are some good restaurants around me?" => "Where are you?" or "When is the next flight to Tokyo" => "Where do you travel from?").
8
+ You are always very attentive to dates, in particular you try to resolve dates (e.g. "yesterday" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.
9
+ You follow these instructions in all languages, and always respond to the user in the language they use or request.
10
+ Next sections describe the capabilities that you have.
11
+
12
+ # WEB BROWSING INSTRUCTIONS
13
+
14
+ You cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.
15
+
16
+ # MULTI-MODAL INSTRUCTIONS
17
+
18
+ You have the ability to read images, but you cannot generate images. You also cannot read nor transcribe audio files or videos.
19
+
20
+ # TOOL CALLING INSTRUCTIONS
21
+
22
+ You may have access to tools that you can use to fetch information or perform actions. You must use these tools in the following situations:
23
+
24
+ 1. When the request requires up-to-date information.
25
+ 2. When the request requires specific data that you do not have in your knowledge base.
26
+ 3. When the request involves actions that you cannot perform without tools.
27
+
28
+ Always prioritize using tools to provide the most accurate and helpful response. If tools are not available, inform the user that you cannot perform the requested action at the moment.
chat_template.jinja ADDED
@@ -0,0 +1,132 @@
1
+ {#- Default system message if no system prompt is passed. #}
2
+ {%- set default_system_message = '' %}
3
+
4
+ {#- Begin of sequence token. #}
5
+ {{- '<s>' }}
6
+
7
+ {#- Handle system prompt if it exists. #}
8
+ {#- System prompt supports text content or text chunks. #}
9
+ {%- if messages[0]['role'] == 'system' %}
10
+ {{- '[SYSTEM_PROMPT]' -}}
11
+ {%- if messages[0]['content'] is string %}
12
+ {{- messages[0]['content'] -}}
13
+ {%- else %}
14
+ {%- for block in messages[0]['content'] %}
15
+ {%- if block['type'] == 'text' %}
16
+ {{- block['text'] }}
17
+ {%- else %}
18
+ {{- raise_exception('Only text chunks are supported in system message contents.') }}
19
+ {%- endif %}
20
+ {%- endfor %}
21
+ {%- endif %}
22
+ {{- '[/SYSTEM_PROMPT]' -}}
23
+ {%- set loop_messages = messages[1:] %}
24
+ {%- else %}
25
+ {%- set loop_messages = messages %}
26
+ {%- if default_system_message != '' %}
27
+ {{- '[SYSTEM_PROMPT]' + default_system_message + '[/SYSTEM_PROMPT]' }}
28
+ {%- endif %}
29
+ {%- endif %}
30
+
31
+
32
+ {#- Tools definition #}
33
+ {%- set tools_definition = '' %}
34
+ {%- set has_tools = false %}
35
+ {%- if tools is defined and tools is not none and tools|length > 0 %}
36
+ {%- set has_tools = true %}
37
+ {%- set tools_definition = '[AVAILABLE_TOOLS]' + (tools| tojson) + '[/AVAILABLE_TOOLS]' %}
38
+ {{- tools_definition }}
39
+ {%- endif %}
40
+
41
+ {#- Model settings definition #}
42
+ {%- set reasoning_effort = reasoning_effort if reasoning_effort is defined and reasoning_effort is not none else 'none' %}
43
+ {%- if reasoning_effort not in ['none', 'high'] %}
44
+ {{- raise_exception('reasoning_effort must be either "none" or "high"') }}
45
+ {%- endif %}
46
+ {%- set model_settings = '[MODEL_SETTINGS]{"reasoning_effort": "' + reasoning_effort + '"}[/MODEL_SETTINGS]' %}
47
+ {{- model_settings }}
48
+
49
+ {#- Checks for alternating user/assistant messages. #}
50
+ {%- set ns = namespace(index=0) %}
51
+ {%- for message in loop_messages %}
52
+ {%- if message.role == 'user' or (message.role == 'assistant' and (message.tool_calls is not defined or message.tool_calls is none or message.tool_calls | length == 0)) %}
53
+ {%- if (message['role'] == 'user') != (ns.index % 2 == 0) %}
54
+ {{- raise_exception('After the optional system message, conversation roles must alternate user and assistant roles except for tool calls and results.') }}
55
+ {%- endif %}
56
+ {%- set ns.index = ns.index + 1 %}
57
+ {%- endif %}
58
+ {%- endfor %}
59
+
60
+ {#- Handle conversation messages. #}
61
+ {%- for message in loop_messages %}
62
+
63
+ {#- User messages supports text content or text and image chunks. #}
64
+ {%- if message['role'] == 'user' %}
65
+ {%- if message['content'] is string %}
66
+ {{- '[INST]' + message['content'] + '[/INST]' }}
67
+ {%- elif message['content'] | length > 0 %}
68
+ {{- '[INST]' }}
69
+ {%- if message['content'] | length == 2 %}
70
+ {%- set blocks = message['content'] | sort(attribute='type') %}
71
+ {%- else %}
72
+ {%- set blocks = message['content'] %}
73
+ {%- endif %}
74
+ {%- for block in blocks %}
75
+ {%- if block['type'] == 'text' %}
76
+ {{- block['text'] }}
77
+ {%- elif block['type'] in ['image', 'image_url'] %}
78
+ {{- '[IMG]' }}
79
+ {%- else %}
80
+ {{- raise_exception('Only text, image and image_url chunks are supported in user message content.') }}
81
+ {%- endif %}
82
+ {%- endfor %}
83
+ {{- '[/INST]' }}
84
+ {%- else %}
85
+ {{- raise_exception('User message must have a string or a list of chunks in content') }}
86
+ {%- endif %}
87
+
88
+ {#- Assistant messages supports text content or text, image and thinking chunks. #}
89
+ {%- elif message['role'] == 'assistant' %}
90
+ {%- if (message['content'] is none or message['content'] == '' or message['content']|length == 0) and (message['tool_calls'] is not defined or message['tool_calls'] is none or message['tool_calls']|length == 0) %}
91
+ {{- raise_exception('Assistant message must have a string or a list of chunks in content or a list of tool calls.') }}
92
+ {%- endif %}
93
+
94
+ {%- if message['content'] is string and message['content'] != '' %}
95
+ {{- message['content'] }}
96
+ {%- elif message['content'] | length > 0 %}
97
+ {%- for block in message['content'] %}
98
+ {%- if block['type'] == 'text' %}
99
+ {{- block['text'] }}
100
+ {%- elif block['type'] == 'thinking' %}
101
+ {{- '[THINK]' + block['thinking'] + '[/THINK]' }}
102
+ {%- else %}
103
+ {{- raise_exception('Only text and thinking chunks are supported in assistant message contents.') }}
104
+ {%- endif %}
105
+ {%- endfor %}
106
+ {%- endif %}
107
+
108
+ {%- if message['tool_calls'] is defined and message['tool_calls'] is not none and message['tool_calls']|length > 0 %}
109
+ {%- for tool in message['tool_calls'] %}
110
+ {{- '[TOOL_CALLS]' }}
111
+ {%- set name = tool['function']['name'] %}
112
+ {%- set arguments = tool['function']['arguments'] %}
113
+ {%- if arguments is not string %}
114
+ {%- set arguments = arguments|tojson|safe %}
115
+ {%- elif arguments == '' %}
116
+ {%- set arguments = '{}' %}
117
+ {%- endif %}
118
+ {{- name + '[ARGS]' + arguments }}
119
+ {%- endfor %}
120
+ {%- endif %}
121
+
122
+ {{- '</s>' }}
123
+
124
+ {#- Tool messages only supports text content. #}
125
+ {%- elif message['role'] == 'tool' %}
126
+ {{- '[TOOL_RESULTS]' + message['content']|string + '[/TOOL_RESULTS]' }}
127
+
128
+ {#- Raise exception for unsupported roles. #}
129
+ {%- else %}
130
+ {{- raise_exception('Only user, assistant and tool roles are supported, got ' + message['role'] + '.') }}
131
+ {%- endif %}
132
+ {%- endfor %}
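To make the prompt layout produced by this template easier to see, the sketch below mimics it in plain Python for the simplest case: string-only system, user and assistant messages, with no tools, images, thinking chunks or tool calls. It is an illustration of the layout, not a replacement for the template; `render_prompt` is a hypothetical name.

```python
import json


def render_prompt(messages: list, reasoning_effort: str = "none") -> str:
    """Approximate the template output for string-only messages and no tools."""
    if reasoning_effort not in ("none", "high"):
        raise ValueError('reasoning_effort must be either "none" or "high"')

    parts = ["<s>"]
    # An optional leading system message is wrapped in its own tags.
    if messages and messages[0]["role"] == "system":
        parts.append("[SYSTEM_PROMPT]" + messages[0]["content"] + "[/SYSTEM_PROMPT]")
        messages = messages[1:]

    # Model settings always follow the system prompt (tools would sit between them).
    settings = json.dumps({"reasoning_effort": reasoning_effort})
    parts.append("[MODEL_SETTINGS]" + settings + "[/MODEL_SETTINGS]")

    for index, message in enumerate(messages):
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("Roles must alternate user and assistant.")
        if message["role"] == "user":
            parts.append("[INST]" + message["content"] + "[/INST]")
        else:
            # Assistant turns are closed by the end-of-sequence token.
            parts.append(message["content"] + "</s>")
    return "".join(parts)
```

Printing `render_prompt([{"role": "user", "content": "Hi"}], "high")` shows that the reasoning setting is sent as part of the prompt itself, between `[MODEL_SETTINGS]` tags, rather than as a separate sampling option.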
config.json ADDED
@@ -0,0 +1,95 @@
1
+ {
2
+ "architectures": [
3
+ "Mistral3ForConditionalGeneration"
4
+ ],
5
+ "dtype": "bfloat16",
6
+ "image_token_index": 10,
7
+ "model_type": "mistral3",
8
+ "multimodal_projector_bias": false,
9
+   "projector_hidden_act": "gelu",
+   "quantization_config": {
+     "activation_scheme": "static",
+     "dequantize": false,
+     "modules_to_not_convert": [
+       "model.vision_tower",
+       "model.multi_modal_projector",
+       "lm_head"
+     ],
+     "quant_method": "fp8",
+     "weight_block_size": null
+   },
+   "spatial_merge_size": 2,
+   "text_config": {
+     "attention_bias": false,
+     "attention_dropout": 0.0,
+     "bos_token_id": 1,
+     "eos_token_id": 2,
+     "first_k_dense_replace": 0,
+     "head_dim": 128,
+     "hidden_act": "silu",
+     "hidden_size": 4096,
+     "initializer_range": 0.02,
+     "intermediate_size": 12288,
+     "kv_lora_rank": 256,
+     "max_position_embeddings": 1048576,
+     "mlp_bias": false,
+     "model_type": "mistral4",
+     "moe_intermediate_size": 2048,
+     "n_group": 1,
+     "n_routed_experts": 128,
+     "n_shared_experts": 1,
+     "norm_topk_prob": true,
+     "num_attention_heads": 32,
+     "num_experts_per_tok": 4,
+     "num_hidden_layers": 36,
+     "num_key_value_heads": 32,
+     "pad_token_id": 11,
+     "pretraining_tp": 1,
+     "q_lora_rank": 1024,
+     "qk_head_dim": 128,
+     "qk_nope_head_dim": 64,
+     "qk_rope_head_dim": 64,
+     "rms_norm_eps": 1e-06,
+     "rope_interleave": true,
+     "rope_parameters": {
+       "beta_fast": 32.0,
+       "beta_slow": 1.0,
+       "factor": 128.0,
+       "llama_4_scaling_beta": 0.1,
+       "mscale": 1.0,
+       "mscale_all_dim": 1.0,
+       "original_max_position_embeddings": 8192,
+       "rope_theta": 10000.0,
+       "rope_type": "yarn",
+       "type": "yarn"
+     },
+     "routed_scaling_factor": 1.0,
+     "sliding_window": null,
+     "tie_word_embeddings": false,
+     "topk_group": 1,
+     "use_cache": true,
+     "v_head_dim": 128,
+     "vocab_size": 131072
+   },
+   "tie_word_embeddings": false,
+   "transformers_version": "5.3.0.dev0",
+   "vision_config": {
+     "attention_dropout": 0.0,
+     "head_dim": 64,
+     "hidden_act": "silu",
+     "hidden_size": 1024,
+     "image_size": 1540,
+     "initializer_range": 0.02,
+     "intermediate_size": 4096,
+     "model_type": "pixtral",
+     "num_attention_heads": 16,
+     "num_channels": 3,
+     "num_hidden_layers": 24,
+     "patch_size": 14,
+     "rope_parameters": {
+       "rope_theta": 10000.0,
+       "rope_type": "default"
+     }
+   },
+   "vision_feature_layer": -1
+ }
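Several values in this config.json are related to one another by simple arithmetic. The sketch below recomputes a few of them from the numbers shown in the diff. The function name `derived_shapes` and the dictionary layout are illustrative only, and the patch-grid calculation assumes that `spatial_merge_size` shrinks each side of the patch grid by that factor, which the diff itself does not state.

```python
# Values copied from the config.json diff above.
TEXT = {
    "qk_nope_head_dim": 64,
    "qk_rope_head_dim": 64,
    "qk_head_dim": 128,
    "max_position_embeddings": 1048576,
    "yarn_factor": 128.0,
    "original_max_position_embeddings": 8192,
}
VISION = {"image_size": 1540, "patch_size": 14}
SPATIAL_MERGE_SIZE = 2


def derived_shapes():
    """Recompute a few quantities that the config lists explicitly."""
    # Query/key head width is the sum of the non-rotary and rotary parts.
    qk_head_dim = TEXT["qk_nope_head_dim"] + TEXT["qk_rope_head_dim"]
    # YaRN factor times the original context length gives the extended length.
    extended_context = int(
        TEXT["yarn_factor"] * TEXT["original_max_position_embeddings"]
    )
    # Patches along one side of a maximum-size square image.
    patches_per_side = VISION["image_size"] // VISION["patch_size"]
    # Assumption: merging reduces each side by SPATIAL_MERGE_SIZE.
    merged_per_side = patches_per_side // SPATIAL_MERGE_SIZE
    return {
        "qk_head_dim": qk_head_dim,
        "extended_context": extended_context,
        "patches_per_side": patches_per_side,
        "merged_patches": merged_per_side * merged_per_side,
    }


if __name__ == "__main__":
    for name, value in derived_shapes().items():
        print(f"{name}: {value}")
```

Under these assumptions, the recomputed `qk_head_dim` and extended context length agree with the `qk_head_dim` and `max_position_embeddings` entries listed in the config.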
consolidated-00001-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e493a53149d9fb05e48d5e814e4e40f97a605d32c0d579d15ebc60684af51ae6
+ size 20000632534
consolidated-00002-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:82bf13d11290a0d3ee344e5a07224c623ae5f9c473fe774e24dfccb3506fe7b9
+ size 19997737172
consolidated-00003-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1410d0d2223d665800f7d5b05d7d8adf2df4fb9c7e2e71f3b272f5a5bf07f1e6
+ size 19997736948
consolidated-00004-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8725562ee8c0ee67056fe128220f201456abad6fa2039c61bc7421de8baa4dbd
+ size 19997737108
consolidated-00005-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee67e9f47175ac89750790fcc5751a497ecedfb424779a66d9d5e7fcab468364
+ size 19997738276
consolidated-00006-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:793e1be0042f616a50d73f083e47aadaf98ab65e9c909b8986a81e6611565428
+ size 19861887138
consolidated-00007-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:55ec7d746e71207f12b5303c65377b84612859bc78a0cceefe9ccfc60cc7e6de
+ size 1073741920
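Each `.safetensors` entry above is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. As a rough illustration, the sketch below parses one pointer and totals the sizes of the seven `consolidated-*` shards. The helper name `parse_lfs_pointer` and the returned dictionary keys are hypothetical, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dictionary."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer for consolidated-00001-of-00007.safetensors, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e493a53149d9fb05e48d5e814e4e40f97a605d32c0d579d15ebc60684af51ae6
size 20000632534
"""

# Sizes in bytes of the seven consolidated shards, copied from the diff.
CONSOLIDATED_SIZES = [
    20000632534,
    19997737172,
    19997736948,
    19997737108,
    19997738276,
    19861887138,
    1073741920,
]

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["hash_algo"], info["size"])
    total = sum(CONSOLIDATED_SIZES)
    print(f"consolidated shards: {total} bytes (~{total / 1e9:.1f} GB)")
```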
consolidated.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "max_length": 1048576,
+   "pad_token_id": 11,
+   "transformers_version": "5.3.0.dev0"
+ }
images/aime.png ADDED

Git LFS Details

  • SHA256: da1ee41ac7cebc824d1e732bef13853ac855dacf08b8c5811fd201f7c7cb7180
  • Pointer size: 131 Bytes
  • Size of remote file: 601 kB
images/image2.png ADDED

Git LFS Details

  • SHA256: abde364d45a8f25096dbe279b15db1acd1d4ea6f74fe858281266d5748ffd377
  • Pointer size: 131 Bytes
  • Size of remote file: 152 kB
images/image3.png ADDED

Git LFS Details

  • SHA256: b38c8a96b124ff831fa9e7270939123565b396485b48f3219f0387142661b735
  • Pointer size: 131 Bytes
  • Size of remote file: 220 kB
images/lcr.png ADDED

Git LFS Details

  • SHA256: bbbe6dd105653d399b6641ab2093502e4d0d9dddf485480e18b93c67c1bf1bba
  • Pointer size: 131 Bytes
  • Size of remote file: 586 kB
images/livecode.png ADDED

Git LFS Details

  • SHA256: f7e7c298ddfd1730e14aa79983bf94c6cfac332558561f083e85787629ecaea3
  • Pointer size: 131 Bytes
  • Size of remote file: 571 kB
model-00001-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:811a33f74a5160806bf59a5134f015a86386eea49d1b8d2f51c6509382cf5f03
+ size 49078433296
model-00002-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:50f010b6318591e49c41dda94926dfaa436f17219ec79f27289a72dd93e1effa
+ size 49132710896
model-00003-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:166755c02c0c0682302d460c0eadcf138758a022d9a578c7eb74f3251e770999
+ size 22711829744
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
params.json ADDED
@@ -0,0 +1,64 @@
+ {
+   "dim": 4096,
+   "n_layers": 36,
+   "head_dim": 128,
+   "hidden_dim": 12288,
+   "n_heads": 32,
+   "n_kv_heads": 32,
+   "rope_theta": 10000.0,
+   "norm_eps": 1e-06,
+   "vocab_size": 131072,
+   "tied_embeddings": false,
+   "max_position_embeddings": 1048576,
+   "llama_4_scaling": {
+     "original_max_position_embeddings": 8192,
+     "beta": 0.1
+   },
+   "q_lora_rank": 1024,
+   "qk_rope_head_dim": 64,
+   "qk_nope_head_dim": 64,
+   "kv_lora_rank": 256,
+   "v_head_dim": 128,
+   "quantization": {
+     "qformat_weight": "fp8_e4m3",
+     "qscheme_act": "TENSOR"
+   },
+   "yarn": {
+     "original_max_position_embeddings": 8192,
+     "factor": 128,
+     "apply_scale": false,
+     "beta": 32,
+     "alpha": 1
+   },
+   "moe": {
+     "expert_parallel": 1,
+     "expert_model_parallel": 1,
+     "route_every_n": 1,
+     "first_k_dense_replace": 0,
+     "num_experts": 128,
+     "num_experts_per_tok": 4,
+     "num_expert_groups": 1,
+     "num_expert_groups_per_tok": 1,
+     "routed_scale": 1.0,
+     "expert_hidden_dim": 2048,
+     "num_shared_experts": 1
+   },
+   "vision_encoder": {
+     "image_token_id": 10,
+     "image_break_token_id": 12,
+     "image_end_token_id": 13,
+     "intermediate_size": 4096,
+     "num_hidden_layers": 24,
+     "num_attention_heads": 16,
+     "mm_projector_id": "patch_merge",
+     "spatial_merge_size": 2,
+     "hidden_size": 1024,
+     "num_channels": 3,
+     "image_size": 1540,
+     "max_image_size": 1540,
+     "patch_size": 14,
+     "rope_theta": 10000.0,
+     "add_pre_mm_projector_layer_norm": true,
+     "adapter_bias": false
+   }
+ }
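params.json and the `text_config` block of config.json appear to describe the same architecture under different key names. The sketch below pairs up keys whose values coincide in this commit and reports any disagreement. The pairing in `KEY_PAIRS` is inferred purely from matching values in the two diffs and is an assumption, not a documented mapping; `find_mismatches` is a hypothetical helper.

```python
# Subset of params.json, copied from the diff above (nested "moe" flattened).
PARAMS = {
    "dim": 4096,
    "n_layers": 36,
    "head_dim": 128,
    "hidden_dim": 12288,
    "n_heads": 32,
    "n_kv_heads": 32,
    "vocab_size": 131072,
    "moe.num_experts": 128,
    "moe.num_experts_per_tok": 4,
    "moe.expert_hidden_dim": 2048,
    "moe.num_shared_experts": 1,
}

# Matching subset of config.json "text_config", copied from the same commit.
TEXT_CONFIG = {
    "hidden_size": 4096,
    "num_hidden_layers": 36,
    "head_dim": 128,
    "intermediate_size": 12288,
    "num_attention_heads": 32,
    "num_key_value_heads": 32,
    "vocab_size": 131072,
    "n_routed_experts": 128,
    "num_experts_per_tok": 4,
    "moe_intermediate_size": 2048,
    "n_shared_experts": 1,
}

# Assumed correspondence (params.json key -> text_config key).
KEY_PAIRS = {
    "dim": "hidden_size",
    "n_layers": "num_hidden_layers",
    "head_dim": "head_dim",
    "hidden_dim": "intermediate_size",
    "n_heads": "num_attention_heads",
    "n_kv_heads": "num_key_value_heads",
    "vocab_size": "vocab_size",
    "moe.num_experts": "n_routed_experts",
    "moe.num_experts_per_tok": "num_experts_per_tok",
    "moe.expert_hidden_dim": "moe_intermediate_size",
    "moe.num_shared_experts": "n_shared_experts",
}


def find_mismatches(params, text_config, pairs):
    """Return (params_key, config_key) pairs whose values differ."""
    return [
        (p_key, c_key)
        for p_key, c_key in pairs.items()
        if params[p_key] != text_config[c_key]
    ]


if __name__ == "__main__":
    print(find_mismatches(PARAMS, TEXT_CONFIG, KEY_PAIRS))
```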
processor_config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "image_break_token": "[IMG_BREAK]",
+   "image_end_token": "[IMG_END]",
+   "image_processor": {
+     "data_format": "channels_first",
+     "do_convert_rgb": true,
+     "do_normalize": true,
+     "do_rescale": true,
+     "do_resize": true,
+     "image_mean": [
+       0.48145466,
+       0.4578275,
+       0.40821073
+     ],
+     "image_processor_type": "PixtralImageProcessorFast",
+     "image_std": [
+       0.26862954,
+       0.26130258,
+       0.27577711
+     ],
+     "patch_size": 14,
+     "resample": 3,
+     "rescale_factor": 0.00392156862745098,
+     "size": {
+       "longest_edge": 1540
+     }
+   },
+   "image_token": "[IMG]",
+   "patch_size": 14,
+   "processor_class": "PixtralProcessor",
+   "spatial_merge_size": 2
+ }
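The `do_rescale` and `do_normalize` settings above correspond to a per-channel transform: multiply by `rescale_factor` (which equals 1/255), subtract `image_mean`, then divide by `image_std`. The following is a minimal pure-Python sketch of that arithmetic for a single RGB pixel. It is not the `PixtralImageProcessorFast` implementation, and `normalize_pixel` is a hypothetical helper name.

```python
# Constants copied from the processor_config.json diff above.
RESCALE_FACTOR = 0.00392156862745098  # equals 1 / 255
IMAGE_MEAN = (0.48145466, 0.4578275, 0.40821073)
IMAGE_STD = (0.26862954, 0.26130258, 0.27577711)


def normalize_pixel(rgb):
    """Rescale an RGB pixel from [0, 255] to [0, 1], then standardize it."""
    return tuple(
        (channel * RESCALE_FACTOR - mean) / std
        for channel, mean, std in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    )


if __name__ == "__main__":
    print(normalize_pixel((0, 0, 0)))        # darkest pixel
    print(normalize_pixel((255, 255, 255)))  # brightest pixel
```

A pixel whose rescaled value equals the channel mean maps to zero, which is a quick sanity check on the constants.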
tekken.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b1272b956bd6edd2d2c674c76896c7661308c9e723997b0afb55ecb429cb5dc7
+ size 16275354
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2ba5b3330fd84d5376fcca797cfb3b42eee6241ce23e3271e6fb2a115a8751bd
+ size 17077420
tokenizer_config.json ADDED
@@ -0,0 +1,1012 @@
+ {
+   "backend": "tokenizers",
+   "bos_token": "<s>",
+   "eos_token": "</s>",
+   "extra_special_tokens": [
+     "<unk>",
+     "<s>",
+     "</s>",
+     "[INST]",
+     "[/INST]",
+     "[AVAILABLE_TOOLS]",
+     "[/AVAILABLE_TOOLS]",
+     "[TOOL_RESULTS]",
+     "[/TOOL_RESULTS]",
+     "[TOOL_CALLS]",
+     "[IMG]",
+     "<pad>",
+     "[IMG_BREAK]",
+     "[IMG_END]",
+     "[PREFIX]",
+     "[MIDDLE]",
+     "[SUFFIX]",
+     "[SYSTEM_PROMPT]",
+     "[/SYSTEM_PROMPT]",
+     "[TOOL_CONTENT]",
+     "<SPECIAL_20>",
+     "<SPECIAL_21>",
+     "<SPECIAL_22>",
+     "<SPECIAL_23>",
+     "[AUDIO]",
+     "[BEGIN_AUDIO]",
+     "<SPECIAL_26>",
+     "<SPECIAL_27>",
+     "<SPECIAL_28>",
+     "<SPECIAL_29>",
+     "<SPECIAL_30>",
+     "<SPECIAL_31>",
+     "[ARGS]",
+     "[CALL_ID]",
+     "[THINK]",
+     "[/THINK]",
+     "[MODEL_SETTINGS]",
+     "[/MODEL_SETTINGS]",
+     "<SPECIAL_38>",
+     "<SPECIAL_39>",
+     "<SPECIAL_40>",
+     "<SPECIAL_41>",
+     "<SPECIAL_42>",
+     "<SPECIAL_43>",
+     "<SPECIAL_44>",
+     "<SPECIAL_45>",
+     "<SPECIAL_46>",
+     "<SPECIAL_47>",
+     "<SPECIAL_48>",
+     "<SPECIAL_49>",
+     "<SPECIAL_50>",
+     "<SPECIAL_51>",
+     "<SPECIAL_52>",
+     "<SPECIAL_53>",
+     "<SPECIAL_54>",
+     "<SPECIAL_55>",
+     "<SPECIAL_56>",
+     "<SPECIAL_57>",
+     "<SPECIAL_58>",
+     "<SPECIAL_59>",
+     "<SPECIAL_60>",
+     "<SPECIAL_61>",
+     "<SPECIAL_62>",
+     "<SPECIAL_63>",
+     "<SPECIAL_64>",
+     "<SPECIAL_65>",
+     "<SPECIAL_66>",
+     "<SPECIAL_67>",
+     "<SPECIAL_68>",
+     "<SPECIAL_69>",
+     "<SPECIAL_70>",
+     "<SPECIAL_71>",
+     "<SPECIAL_72>",
+     "<SPECIAL_73>",
+     "<SPECIAL_74>",
+     "<SPECIAL_75>",
+     "<SPECIAL_76>",
+     "<SPECIAL_77>",
+     "<SPECIAL_78>",
+     "<SPECIAL_79>",
+     "<SPECIAL_80>",
+     "<SPECIAL_81>",
+     "<SPECIAL_82>",
+     "<SPECIAL_83>",
+     "<SPECIAL_84>",
+     "<SPECIAL_85>",
+     "<SPECIAL_86>",
+     "<SPECIAL_87>",
+     "<SPECIAL_88>",
+     "<SPECIAL_89>",
+     "<SPECIAL_90>",
+     "<SPECIAL_91>",
+     "<SPECIAL_92>",
+     "<SPECIAL_93>",
+     "<SPECIAL_94>",
+     "<SPECIAL_95>",
+     "<SPECIAL_96>",
+     "<SPECIAL_97>",
+     "<SPECIAL_98>",
+     "<SPECIAL_99>",
+     "<SPECIAL_100>",
+     "<SPECIAL_101>",
+     "<SPECIAL_102>",
+     "<SPECIAL_103>",
+     "<SPECIAL_104>",
+     "<SPECIAL_105>",
+     "<SPECIAL_106>",
+     "<SPECIAL_107>",
+     "<SPECIAL_108>",
+     "<SPECIAL_109>",
+     "<SPECIAL_110>",
+     "<SPECIAL_111>",
+     "<SPECIAL_112>",
+     "<SPECIAL_113>",
+     "<SPECIAL_114>",
+     "<SPECIAL_115>",
+     "<SPECIAL_116>",
+     "<SPECIAL_117>",
+     "<SPECIAL_118>",
+     "<SPECIAL_119>",
+     "<SPECIAL_120>",
+     "<SPECIAL_121>",
+     "<SPECIAL_122>",
+     "<SPECIAL_123>",
+     "<SPECIAL_124>",
+     "<SPECIAL_125>",
+     "<SPECIAL_126>",
+     "<SPECIAL_127>",
+     "<SPECIAL_128>",
+     "<SPECIAL_129>",
+     "<SPECIAL_130>",
+     "<SPECIAL_131>",
+     "<SPECIAL_132>",
+     "<SPECIAL_133>",
+     "<SPECIAL_134>",
+     "<SPECIAL_135>",
+     "<SPECIAL_136>",
+     "<SPECIAL_137>",
+     "<SPECIAL_138>",
+     "<SPECIAL_139>",
+     "<SPECIAL_140>",
+     "<SPECIAL_141>",
+     "<SPECIAL_142>",
+     "<SPECIAL_143>",
+     "<SPECIAL_144>",
+     "<SPECIAL_145>",
+     "<SPECIAL_146>",
+     "<SPECIAL_147>",
+     "<SPECIAL_148>",
+     "<SPECIAL_149>",
+     "<SPECIAL_150>",
+     "<SPECIAL_151>",
+     "<SPECIAL_152>",
+     "<SPECIAL_153>",
+     "<SPECIAL_154>",
+     "<SPECIAL_155>",
+     "<SPECIAL_156>",
+     "<SPECIAL_157>",
+     "<SPECIAL_158>",
+     "<SPECIAL_159>",
+     "<SPECIAL_160>",
+     "<SPECIAL_161>",
+     "<SPECIAL_162>",
+     "<SPECIAL_163>",
+     "<SPECIAL_164>",
+     "<SPECIAL_165>",
+     "<SPECIAL_166>",
+     "<SPECIAL_167>",
+     "<SPECIAL_168>",
+     "<SPECIAL_169>",
+     "<SPECIAL_170>",
+     "<SPECIAL_171>",
+     "<SPECIAL_172>",
+     "<SPECIAL_173>",
+     "<SPECIAL_174>",
+     "<SPECIAL_175>",
+     "<SPECIAL_176>",
+     "<SPECIAL_177>",
+     "<SPECIAL_178>",
+     "<SPECIAL_179>",
+     "<SPECIAL_180>",
+     "<SPECIAL_181>",
+     "<SPECIAL_182>",
+     "<SPECIAL_183>",
+     "<SPECIAL_184>",
+     "<SPECIAL_185>",
+     "<SPECIAL_186>",
+     "<SPECIAL_187>",
+     "<SPECIAL_188>",
+     "<SPECIAL_189>",
+     "<SPECIAL_190>",
+     "<SPECIAL_191>",
+     "<SPECIAL_192>",
+     "<SPECIAL_193>",
+     "<SPECIAL_194>",
+     "<SPECIAL_195>",
+     "<SPECIAL_196>",
+     "<SPECIAL_197>",
+     "<SPECIAL_198>",
+     "<SPECIAL_199>",
+     "<SPECIAL_200>",
+     "<SPECIAL_201>",
+     "<SPECIAL_202>",
+     "<SPECIAL_203>",
+     "<SPECIAL_204>",
+     "<SPECIAL_205>",
+     "<SPECIAL_206>",
+     "<SPECIAL_207>",
+     "<SPECIAL_208>",
+     "<SPECIAL_209>",
+     "<SPECIAL_210>",
+     "<SPECIAL_211>",
+     "<SPECIAL_212>",
+     "<SPECIAL_213>",
+     "<SPECIAL_214>",
+     "<SPECIAL_215>",
+     "<SPECIAL_216>",
+     "<SPECIAL_217>",
+     "<SPECIAL_218>",
+     "<SPECIAL_219>",
+     "<SPECIAL_220>",
+     "<SPECIAL_221>",
+     "<SPECIAL_222>",
+     "<SPECIAL_223>",
+     "<SPECIAL_224>",
+     "<SPECIAL_225>",
+     "<SPECIAL_226>",
+     "<SPECIAL_227>",
+     "<SPECIAL_228>",
+     "<SPECIAL_229>",
+     "<SPECIAL_230>",
+     "<SPECIAL_231>",
+     "<SPECIAL_232>",
+     "<SPECIAL_233>",
+     "<SPECIAL_234>",
+     "<SPECIAL_235>",
+     "<SPECIAL_236>",
+     "<SPECIAL_237>",
+     "<SPECIAL_238>",
+     "<SPECIAL_239>",
+     "<SPECIAL_240>",
+     "<SPECIAL_241>",
+     "<SPECIAL_242>",
+     "<SPECIAL_243>",
+     "<SPECIAL_244>",
+     "<SPECIAL_245>",
+     "<SPECIAL_246>",
+     "<SPECIAL_247>",
+     "<SPECIAL_248>",
+     "<SPECIAL_249>",
+     "<SPECIAL_250>",
+     "<SPECIAL_251>",
+     "<SPECIAL_252>",
+     "<SPECIAL_253>",
+     "<SPECIAL_254>",
+     "<SPECIAL_255>",
+     "<SPECIAL_256>",
+     "<SPECIAL_257>",
+     "<SPECIAL_258>",
+     "<SPECIAL_259>",
+     "<SPECIAL_260>",
+     "<SPECIAL_261>",
+     "<SPECIAL_262>",
+     "<SPECIAL_263>",
+     "<SPECIAL_264>",
+     "<SPECIAL_265>",
+     "<SPECIAL_266>",
+     "<SPECIAL_267>",
+     "<SPECIAL_268>",
+     "<SPECIAL_269>",
+     "<SPECIAL_270>",
+     "<SPECIAL_271>",
+     "<SPECIAL_272>",
+     "<SPECIAL_273>",
+     "<SPECIAL_274>",
+     "<SPECIAL_275>",
+     "<SPECIAL_276>",
+     "<SPECIAL_277>",
+     "<SPECIAL_278>",
+     "<SPECIAL_279>",
+     "<SPECIAL_280>",
+     "<SPECIAL_281>",
+     "<SPECIAL_282>",
+     "<SPECIAL_283>",
+     "<SPECIAL_284>",
+     "<SPECIAL_285>",
+     "<SPECIAL_286>",
+     "<SPECIAL_287>",
+     "<SPECIAL_288>",
+     "<SPECIAL_289>",
+     "<SPECIAL_290>",
+     "<SPECIAL_291>",
+     "<SPECIAL_292>",
+     "<SPECIAL_293>",
+     "<SPECIAL_294>",
+     "<SPECIAL_295>",
+     "<SPECIAL_296>",
+     "<SPECIAL_297>",
+     "<SPECIAL_298>",
+     "<SPECIAL_299>",
+     "<SPECIAL_300>",
+     "<SPECIAL_301>",
+     "<SPECIAL_302>",
+     "<SPECIAL_303>",
+     "<SPECIAL_304>",
+     "<SPECIAL_305>",
+     "<SPECIAL_306>",
+     "<SPECIAL_307>",
+     "<SPECIAL_308>",
+     "<SPECIAL_309>",
+     "<SPECIAL_310>",
+     "<SPECIAL_311>",
+     "<SPECIAL_312>",
+     "<SPECIAL_313>",
+     "<SPECIAL_314>",
+     "<SPECIAL_315>",
+     "<SPECIAL_316>",
+     "<SPECIAL_317>",
+     "<SPECIAL_318>",
+     "<SPECIAL_319>",
+     "<SPECIAL_320>",
+     "<SPECIAL_321>",
+     "<SPECIAL_322>",
+     "<SPECIAL_323>",
+     "<SPECIAL_324>",
+     "<SPECIAL_325>",
+     "<SPECIAL_326>",
+     "<SPECIAL_327>",
+     "<SPECIAL_328>",
+     "<SPECIAL_329>",
+     "<SPECIAL_330>",
+     "<SPECIAL_331>",
+     "<SPECIAL_332>",
+     "<SPECIAL_333>",
+     "<SPECIAL_334>",
+     "<SPECIAL_335>",
+     "<SPECIAL_336>",
+     "<SPECIAL_337>",
+     "<SPECIAL_338>",
+     "<SPECIAL_339>",
+     "<SPECIAL_340>",
+     "<SPECIAL_341>",
+     "<SPECIAL_342>",
+     "<SPECIAL_343>",
+     "<SPECIAL_344>",
+     "<SPECIAL_345>",
+     "<SPECIAL_346>",
+     "<SPECIAL_347>",
+     "<SPECIAL_348>",
+     "<SPECIAL_349>",
+     "<SPECIAL_350>",
+     "<SPECIAL_351>",
+     "<SPECIAL_352>",
+     "<SPECIAL_353>",
+     "<SPECIAL_354>",
+     "<SPECIAL_355>",
+     "<SPECIAL_356>",
+     "<SPECIAL_357>",
+     "<SPECIAL_358>",
+     "<SPECIAL_359>",
+     "<SPECIAL_360>",
+     "<SPECIAL_361>",
+     "<SPECIAL_362>",
+     "<SPECIAL_363>",
+     "<SPECIAL_364>",
+     "<SPECIAL_365>",
+     "<SPECIAL_366>",
+     "<SPECIAL_367>",
+     "<SPECIAL_368>",
+     "<SPECIAL_369>",
+     "<SPECIAL_370>",
+     "<SPECIAL_371>",
+     "<SPECIAL_372>",
+     "<SPECIAL_373>",
+     "<SPECIAL_374>",
+     "<SPECIAL_375>",
+     "<SPECIAL_376>",
+     "<SPECIAL_377>",
+     "<SPECIAL_378>",
+     "<SPECIAL_379>",
+     "<SPECIAL_380>",
+     "<SPECIAL_381>",
+     "<SPECIAL_382>",
+     "<SPECIAL_383>",
+     "<SPECIAL_384>",
+     "<SPECIAL_385>",
+     "<SPECIAL_386>",
+     "<SPECIAL_387>",
+     "<SPECIAL_388>",
+     "<SPECIAL_389>",
+     "<SPECIAL_390>",
+     "<SPECIAL_391>",
+     "<SPECIAL_392>",
+     "<SPECIAL_393>",
+     "<SPECIAL_394>",
+     "<SPECIAL_395>",
+     "<SPECIAL_396>",
+     "<SPECIAL_397>",
+     "<SPECIAL_398>",
+     "<SPECIAL_399>",
+     "<SPECIAL_400>",
+     "<SPECIAL_401>",
+     "<SPECIAL_402>",
+     "<SPECIAL_403>",
+     "<SPECIAL_404>",
+     "<SPECIAL_405>",
+     "<SPECIAL_406>",
+     "<SPECIAL_407>",
+     "<SPECIAL_408>",
+     "<SPECIAL_409>",
+     "<SPECIAL_410>",
+     "<SPECIAL_411>",
+     "<SPECIAL_412>",
+     "<SPECIAL_413>",
+     "<SPECIAL_414>",
+     "<SPECIAL_415>",
+     "<SPECIAL_416>",
+     "<SPECIAL_417>",
+     "<SPECIAL_418>",
+     "<SPECIAL_419>",
+     "<SPECIAL_420>",
+     "<SPECIAL_421>",
+     "<SPECIAL_422>",
+     "<SPECIAL_423>",
+     "<SPECIAL_424>",
+     "<SPECIAL_425>",
+     "<SPECIAL_426>",
+     "<SPECIAL_427>",
+     "<SPECIAL_428>",
+     "<SPECIAL_429>",
+     "<SPECIAL_430>",
+     "<SPECIAL_431>",
+     "<SPECIAL_432>",
+     "<SPECIAL_433>",
+     "<SPECIAL_434>",
+     "<SPECIAL_435>",
+     "<SPECIAL_436>",
+     "<SPECIAL_437>",
+     "<SPECIAL_438>",
+     "<SPECIAL_439>",
+     "<SPECIAL_440>",
+     "<SPECIAL_441>",
+     "<SPECIAL_442>",
+     "<SPECIAL_443>",
+     "<SPECIAL_444>",
+     "<SPECIAL_445>",
+     "<SPECIAL_446>",
+     "<SPECIAL_447>",
+     "<SPECIAL_448>",
+     "<SPECIAL_449>",
+     "<SPECIAL_450>",
+     "<SPECIAL_451>",
+     "<SPECIAL_452>",
+     "<SPECIAL_453>",
+     "<SPECIAL_454>",
+     "<SPECIAL_455>",
+     "<SPECIAL_456>",
+     "<SPECIAL_457>",
+     "<SPECIAL_458>",
+     "<SPECIAL_459>",
+     "<SPECIAL_460>",
+     "<SPECIAL_461>",
+     "<SPECIAL_462>",
+     "<SPECIAL_463>",
+     "<SPECIAL_464>",
+     "<SPECIAL_465>",
+     "<SPECIAL_466>",
+     "<SPECIAL_467>",
+     "<SPECIAL_468>",
+     "<SPECIAL_469>",
+     "<SPECIAL_470>",
+     "<SPECIAL_471>",
+     "<SPECIAL_472>",
+     "<SPECIAL_473>",
+     "<SPECIAL_474>",
+     "<SPECIAL_475>",
+     "<SPECIAL_476>",
+     "<SPECIAL_477>",
+     "<SPECIAL_478>",
+     "<SPECIAL_479>",
+     "<SPECIAL_480>",
+     "<SPECIAL_481>",
+     "<SPECIAL_482>",
+     "<SPECIAL_483>",
+     "<SPECIAL_484>",
+     "<SPECIAL_485>",
+     "<SPECIAL_486>",
+     "<SPECIAL_487>",
+     "<SPECIAL_488>",
+     "<SPECIAL_489>",
+     "<SPECIAL_490>",
+     "<SPECIAL_491>",
+     "<SPECIAL_492>",
+     "<SPECIAL_493>",
+     "<SPECIAL_494>",
+     "<SPECIAL_495>",
+     "<SPECIAL_496>",
+     "<SPECIAL_497>",
+     "<SPECIAL_498>",
+     "<SPECIAL_499>",
+     "<SPECIAL_500>",
+     "<SPECIAL_501>",
+     "<SPECIAL_502>",
+     "<SPECIAL_503>",
+     "<SPECIAL_504>",
+     "<SPECIAL_505>",
+     "<SPECIAL_506>",
+     "<SPECIAL_507>",
+     "<SPECIAL_508>",
+     "<SPECIAL_509>",
+     "<SPECIAL_510>",
+     "<SPECIAL_511>",
+     "<SPECIAL_512>",
+     "<SPECIAL_513>",
+     "<SPECIAL_514>",
+     "<SPECIAL_515>",
+     "<SPECIAL_516>",
+     "<SPECIAL_517>",
+     "<SPECIAL_518>",
+     "<SPECIAL_519>",
+     "<SPECIAL_520>",
+     "<SPECIAL_521>",
+     "<SPECIAL_522>",
+     "<SPECIAL_523>",
+     "<SPECIAL_524>",
+     "<SPECIAL_525>",
+     "<SPECIAL_526>",
+     "<SPECIAL_527>",
+     "<SPECIAL_528>",
+     "<SPECIAL_529>",
+     "<SPECIAL_530>",
+     "<SPECIAL_531>",
+     "<SPECIAL_532>",
+     "<SPECIAL_533>",
+     "<SPECIAL_534>",
+     "<SPECIAL_535>",
+     "<SPECIAL_536>",
+     "<SPECIAL_537>",
+     "<SPECIAL_538>",
+     "<SPECIAL_539>",
+     "<SPECIAL_540>",
+     "<SPECIAL_541>",
+     "<SPECIAL_542>",
+     "<SPECIAL_543>",
+     "<SPECIAL_544>",
+     "<SPECIAL_545>",
+     "<SPECIAL_546>",
+     "<SPECIAL_547>",
+     "<SPECIAL_548>",
+     "<SPECIAL_549>",
+     "<SPECIAL_550>",
+     "<SPECIAL_551>",
+     "<SPECIAL_552>",
+     "<SPECIAL_553>",
+     "<SPECIAL_554>",
+     "<SPECIAL_555>",
+     "<SPECIAL_556>",
+     "<SPECIAL_557>",
+     "<SPECIAL_558>",
+     "<SPECIAL_559>",
+     "<SPECIAL_560>",
+     "<SPECIAL_561>",
+     "<SPECIAL_562>",
+     "<SPECIAL_563>",
+     "<SPECIAL_564>",
+     "<SPECIAL_565>",
+     "<SPECIAL_566>",
+     "<SPECIAL_567>",
+     "<SPECIAL_568>",
+     "<SPECIAL_569>",
+     "<SPECIAL_570>",
+     "<SPECIAL_571>",
+     "<SPECIAL_572>",
+     "<SPECIAL_573>",
+     "<SPECIAL_574>",
+     "<SPECIAL_575>",
+     "<SPECIAL_576>",
+     "<SPECIAL_577>",
+     "<SPECIAL_578>",
+     "<SPECIAL_579>",
+     "<SPECIAL_580>",
+     "<SPECIAL_581>",
+     "<SPECIAL_582>",
+     "<SPECIAL_583>",
+     "<SPECIAL_584>",
+     "<SPECIAL_585>",
+     "<SPECIAL_586>",
+     "<SPECIAL_587>",
+     "<SPECIAL_588>",
+     "<SPECIAL_589>",
+     "<SPECIAL_590>",
+     "<SPECIAL_591>",
+     "<SPECIAL_592>",
+     "<SPECIAL_593>",
+     "<SPECIAL_594>",
+     "<SPECIAL_595>",
+     "<SPECIAL_596>",
+     "<SPECIAL_597>",
+     "<SPECIAL_598>",
+     "<SPECIAL_599>",
+     "<SPECIAL_600>",
+     "<SPECIAL_601>",
+     "<SPECIAL_602>",
+     "<SPECIAL_603>",
+     "<SPECIAL_604>",
+     "<SPECIAL_605>",
+     "<SPECIAL_606>",
+     "<SPECIAL_607>",
+     "<SPECIAL_608>",
+     "<SPECIAL_609>",
+     "<SPECIAL_610>",
+     "<SPECIAL_611>",
+     "<SPECIAL_612>",
+     "<SPECIAL_613>",
+     "<SPECIAL_614>",
+     "<SPECIAL_615>",
+     "<SPECIAL_616>",
+     "<SPECIAL_617>",
+     "<SPECIAL_618>",
+     "<SPECIAL_619>",
+     "<SPECIAL_620>",
+     "<SPECIAL_621>",
+     "<SPECIAL_622>",
+     "<SPECIAL_623>",
+     "<SPECIAL_624>",
+     "<SPECIAL_625>",
+     "<SPECIAL_626>",
+     "<SPECIAL_627>",
+     "<SPECIAL_628>",
+     "<SPECIAL_629>",
+     "<SPECIAL_630>",
+     "<SPECIAL_631>",
+     "<SPECIAL_632>",
+     "<SPECIAL_633>",
+     "<SPECIAL_634>",
+     "<SPECIAL_635>",
+     "<SPECIAL_636>",
+     "<SPECIAL_637>",
+     "<SPECIAL_638>",
+     "<SPECIAL_639>",
+     "<SPECIAL_640>",
+     "<SPECIAL_641>",
+     "<SPECIAL_642>",
+     "<SPECIAL_643>",
+     "<SPECIAL_644>",
+     "<SPECIAL_645>",
+     "<SPECIAL_646>",
+     "<SPECIAL_647>",
+     "<SPECIAL_648>",
+     "<SPECIAL_649>",
+     "<SPECIAL_650>",
+     "<SPECIAL_651>",
+     "<SPECIAL_652>",
+     "<SPECIAL_653>",
+     "<SPECIAL_654>",
+     "<SPECIAL_655>",
+     "<SPECIAL_656>",
+     "<SPECIAL_657>",
+     "<SPECIAL_658>",
+     "<SPECIAL_659>",
+     "<SPECIAL_660>",
+     "<SPECIAL_661>",
+     "<SPECIAL_662>",
+     "<SPECIAL_663>",
+     "<SPECIAL_664>",
+     "<SPECIAL_665>",
+     "<SPECIAL_666>",
+     "<SPECIAL_667>",
+     "<SPECIAL_668>",
+     "<SPECIAL_669>",
+     "<SPECIAL_670>",
+     "<SPECIAL_671>",
+     "<SPECIAL_672>",
+     "<SPECIAL_673>",
+     "<SPECIAL_674>",
+     "<SPECIAL_675>",
+     "<SPECIAL_676>",
+     "<SPECIAL_677>",
+     "<SPECIAL_678>",
+     "<SPECIAL_679>",
+     "<SPECIAL_680>",
+     "<SPECIAL_681>",
+     "<SPECIAL_682>",
+     "<SPECIAL_683>",
+     "<SPECIAL_684>",
+     "<SPECIAL_685>",
+     "<SPECIAL_686>",
+     "<SPECIAL_687>",
+     "<SPECIAL_688>",
+     "<SPECIAL_689>",
+     "<SPECIAL_690>",
+     "<SPECIAL_691>",
+     "<SPECIAL_692>",
+     "<SPECIAL_693>",
+     "<SPECIAL_694>",
+     "<SPECIAL_695>",
+     "<SPECIAL_696>",
+     "<SPECIAL_697>",
+     "<SPECIAL_698>",
+     "<SPECIAL_699>",
+     "<SPECIAL_700>",
+     "<SPECIAL_701>",
+     "<SPECIAL_702>",
+     "<SPECIAL_703>",
+     "<SPECIAL_704>",
+     "<SPECIAL_705>",
+     "<SPECIAL_706>",
+     "<SPECIAL_707>",
+     "<SPECIAL_708>",
+     "<SPECIAL_709>",
+     "<SPECIAL_710>",
+     "<SPECIAL_711>",
+     "<SPECIAL_712>",
+     "<SPECIAL_713>",
+     "<SPECIAL_714>",
+     "<SPECIAL_715>",
+     "<SPECIAL_716>",
+     "<SPECIAL_717>",
+     "<SPECIAL_718>",
+     "<SPECIAL_719>",
+     "<SPECIAL_720>",
+     "<SPECIAL_721>",
+     "<SPECIAL_722>",
+     "<SPECIAL_723>",
+     "<SPECIAL_724>",
+     "<SPECIAL_725>",
+     "<SPECIAL_726>",
+     "<SPECIAL_727>",
+     "<SPECIAL_728>",
+     "<SPECIAL_729>",
+     "<SPECIAL_730>",
+     "<SPECIAL_731>",
+     "<SPECIAL_732>",
+     "<SPECIAL_733>",
+     "<SPECIAL_734>",
+     "<SPECIAL_735>",
+     "<SPECIAL_736>",
+     "<SPECIAL_737>",
+     "<SPECIAL_738>",
+     "<SPECIAL_739>",
+     "<SPECIAL_740>",
+     "<SPECIAL_741>",
+     "<SPECIAL_742>",
+     "<SPECIAL_743>",
+     "<SPECIAL_744>",
+     "<SPECIAL_745>",
+     "<SPECIAL_746>",
+     "<SPECIAL_747>",
+     "<SPECIAL_748>",
+     "<SPECIAL_749>",
+     "<SPECIAL_750>",
+     "<SPECIAL_751>",
+     "<SPECIAL_752>",
+     "<SPECIAL_753>",
+     "<SPECIAL_754>",
+     "<SPECIAL_755>",
+     "<SPECIAL_756>",
+     "<SPECIAL_757>",
+     "<SPECIAL_758>",
+     "<SPECIAL_759>",
+     "<SPECIAL_760>",
+     "<SPECIAL_761>",
+     "<SPECIAL_762>",
+     "<SPECIAL_763>",
+     "<SPECIAL_764>",
+     "<SPECIAL_765>",
+     "<SPECIAL_766>",
+     "<SPECIAL_767>",
+     "<SPECIAL_768>",
+     "<SPECIAL_769>",
+     "<SPECIAL_770>",
+     "<SPECIAL_771>",
+     "<SPECIAL_772>",
+     "<SPECIAL_773>",
+     "<SPECIAL_774>",
+     "<SPECIAL_775>",
+     "<SPECIAL_776>",
+     "<SPECIAL_777>",
+     "<SPECIAL_778>",
+     "<SPECIAL_779>",
+     "<SPECIAL_780>",
+     "<SPECIAL_781>",
+     "<SPECIAL_782>",
+     "<SPECIAL_783>",
+     "<SPECIAL_784>",
+     "<SPECIAL_785>",
+     "<SPECIAL_786>",
+     "<SPECIAL_787>",
+     "<SPECIAL_788>",
+     "<SPECIAL_789>",
+     "<SPECIAL_790>",
+     "<SPECIAL_791>",
+     "<SPECIAL_792>",
+     "<SPECIAL_793>",
+     "<SPECIAL_794>",
+     "<SPECIAL_795>",
+     "<SPECIAL_796>",
+     "<SPECIAL_797>",
+     "<SPECIAL_798>",
+     "<SPECIAL_799>",
+     "<SPECIAL_800>",
+     "<SPECIAL_801>",
+     "<SPECIAL_802>",
+     "<SPECIAL_803>",
+     "<SPECIAL_804>",
+     "<SPECIAL_805>",
+     "<SPECIAL_806>",
+     "<SPECIAL_807>",
+     "<SPECIAL_808>",
+     "<SPECIAL_809>",
+     "<SPECIAL_810>",
+     "<SPECIAL_811>",
+     "<SPECIAL_812>",
+     "<SPECIAL_813>",
+     "<SPECIAL_814>",
+     "<SPECIAL_815>",
+     "<SPECIAL_816>",
+     "<SPECIAL_817>",
+     "<SPECIAL_818>",
+     "<SPECIAL_819>",
+     "<SPECIAL_820>",
+     "<SPECIAL_821>",
+     "<SPECIAL_822>",
+     "<SPECIAL_823>",
+     "<SPECIAL_824>",
+     "<SPECIAL_825>",
+     "<SPECIAL_826>",
+     "<SPECIAL_827>",
+     "<SPECIAL_828>",
+     "<SPECIAL_829>",
+     "<SPECIAL_830>",
+     "<SPECIAL_831>",
+     "<SPECIAL_832>",
+     "<SPECIAL_833>",
+     "<SPECIAL_834>",
+     "<SPECIAL_835>",
+     "<SPECIAL_836>",
+     "<SPECIAL_837>",
+     "<SPECIAL_838>",
+     "<SPECIAL_839>",
+     "<SPECIAL_840>",
+     "<SPECIAL_841>",
+     "<SPECIAL_842>",
+     "<SPECIAL_843>",
+     "<SPECIAL_844>",
+     "<SPECIAL_845>",
+     "<SPECIAL_846>",
+     "<SPECIAL_847>",
+     "<SPECIAL_848>",
+     "<SPECIAL_849>",
+     "<SPECIAL_850>",
+     "<SPECIAL_851>",
+     "<SPECIAL_852>",
+     "<SPECIAL_853>",
+     "<SPECIAL_854>",
+     "<SPECIAL_855>",
+     "<SPECIAL_856>",
+     "<SPECIAL_857>",
+     "<SPECIAL_858>",
+     "<SPECIAL_859>",
+     "<SPECIAL_860>",
+     "<SPECIAL_861>",
+     "<SPECIAL_862>",
+     "<SPECIAL_863>",
+     "<SPECIAL_864>",
+     "<SPECIAL_865>",
+     "<SPECIAL_866>",
+     "<SPECIAL_867>",
+     "<SPECIAL_868>",
+     "<SPECIAL_869>",
+     "<SPECIAL_870>",
+     "<SPECIAL_871>",
+     "<SPECIAL_872>",
+     "<SPECIAL_873>",
+     "<SPECIAL_874>",
+     "<SPECIAL_875>",
+     "<SPECIAL_876>",
+     "<SPECIAL_877>",
+     "<SPECIAL_878>",
+     "<SPECIAL_879>",
+     "<SPECIAL_880>",
+     "<SPECIAL_881>",
+     "<SPECIAL_882>",
+     "<SPECIAL_883>",
+     "<SPECIAL_884>",
+     "<SPECIAL_885>",
+     "<SPECIAL_886>",
+     "<SPECIAL_887>",
+     "<SPECIAL_888>",
+     "<SPECIAL_889>",
+     "<SPECIAL_890>",
+     "<SPECIAL_891>",
+     "<SPECIAL_892>",
+     "<SPECIAL_893>",
+     "<SPECIAL_894>",
+     "<SPECIAL_895>",
+     "<SPECIAL_896>",
+     "<SPECIAL_897>",
+     "<SPECIAL_898>",
+     "<SPECIAL_899>",
+     "<SPECIAL_900>",
+     "<SPECIAL_901>",
+     "<SPECIAL_902>",
+     "<SPECIAL_903>",
+     "<SPECIAL_904>",
+     "<SPECIAL_905>",
+     "<SPECIAL_906>",
+     "<SPECIAL_907>",
+     "<SPECIAL_908>",
+     "<SPECIAL_909>",
+     "<SPECIAL_910>",
+     "<SPECIAL_911>",
+     "<SPECIAL_912>",
+     "<SPECIAL_913>",
+     "<SPECIAL_914>",
+     "<SPECIAL_915>",
+     "<SPECIAL_916>",
+     "<SPECIAL_917>",
+     "<SPECIAL_918>",
+     "<SPECIAL_919>",
+     "<SPECIAL_920>",
+     "<SPECIAL_921>",
+     "<SPECIAL_922>",
+     "<SPECIAL_923>",
+     "<SPECIAL_924>",
+     "<SPECIAL_925>",
+     "<SPECIAL_926>",
+     "<SPECIAL_927>",
+     "<SPECIAL_928>",
+     "<SPECIAL_929>",
+     "<SPECIAL_930>",
+     "<SPECIAL_931>",
+     "<SPECIAL_932>",
+     "<SPECIAL_933>",
+     "<SPECIAL_934>",
+     "<SPECIAL_935>",
+     "<SPECIAL_936>",
+     "<SPECIAL_937>",
+     "<SPECIAL_938>",
+     "<SPECIAL_939>",
+     "<SPECIAL_940>",
+     "<SPECIAL_941>",
+     "<SPECIAL_942>",
+     "<SPECIAL_943>",
+     "<SPECIAL_944>",
+     "<SPECIAL_945>",
+     "<SPECIAL_946>",
+     "<SPECIAL_947>",
+     "<SPECIAL_948>",
+     "<SPECIAL_949>",
+     "<SPECIAL_950>",
+     "<SPECIAL_951>",
+     "<SPECIAL_952>",
+     "<SPECIAL_953>",
+     "<SPECIAL_954>",
+     "<SPECIAL_955>",
+     "<SPECIAL_956>",
+     "<SPECIAL_957>",
+     "<SPECIAL_958>",
+     "<SPECIAL_959>",
+     "<SPECIAL_960>",
+     "<SPECIAL_961>",
+     "<SPECIAL_962>",
+     "<SPECIAL_963>",
+     "<SPECIAL_964>",
+     "<SPECIAL_965>",
+     "<SPECIAL_966>",
+     "<SPECIAL_967>",
+     "<SPECIAL_968>",
+     "<SPECIAL_969>",
+     "<SPECIAL_970>",
+     "<SPECIAL_971>",
+     "<SPECIAL_972>",
+     "<SPECIAL_973>",
+     "<SPECIAL_974>",
+     "<SPECIAL_975>",
+     "<SPECIAL_976>",
+     "<SPECIAL_977>",
+     "<SPECIAL_978>",
+     "<SPECIAL_979>",
+     "<SPECIAL_980>",
+     "<SPECIAL_981>",
+     "<SPECIAL_982>",
+     "<SPECIAL_983>",
+     "<SPECIAL_984>",
+     "<SPECIAL_985>",
+     "<SPECIAL_986>",
+     "<SPECIAL_987>",
+     "<SPECIAL_988>",
+     "<SPECIAL_989>",
+     "<SPECIAL_990>",
+     "<SPECIAL_991>",
+     "<SPECIAL_992>",
+     "<SPECIAL_993>",
+     "<SPECIAL_994>",
+     "<SPECIAL_995>",
+     "<SPECIAL_996>",
+     "<SPECIAL_997>",
+     "<SPECIAL_998>",
+     "<SPECIAL_999>"
+   ],
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "processor_class": "PixtralProcessor",
+   "tokenizer_class": "TokenizersBackend",
+   "unk_token": "<unk>"
+ }
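In this commit, the position of each entry in `extra_special_tokens` lines up with the token IDs stated in the other files: `bos_token_id` 1, `eos_token_id` 2, `image_token_id` 10, `pad_token_id` 11, `image_break_token_id` 12, and `image_end_token_id` 13. The sketch below checks that correspondence against the first 14 entries. It records an observation about the values shown here, not a documented guarantee of the tokenizer format, and `position_matches_id` is a hypothetical helper.

```python
# First 14 entries of "extra_special_tokens", in the order shown in the diff.
SPECIAL_TOKENS_HEAD = [
    "<unk>",
    "<s>",
    "</s>",
    "[INST]",
    "[/INST]",
    "[AVAILABLE_TOOLS]",
    "[/AVAILABLE_TOOLS]",
    "[TOOL_RESULTS]",
    "[/TOOL_RESULTS]",
    "[TOOL_CALLS]",
    "[IMG]",
    "<pad>",
    "[IMG_BREAK]",
    "[IMG_END]",
]

# Token IDs stated in generation_config.json and params.json in this commit.
EXPECTED_IDS = {
    "<s>": 1,
    "</s>": 2,
    "[IMG]": 10,
    "<pad>": 11,
    "[IMG_BREAK]": 12,
    "[IMG_END]": 13,
}


def position_matches_id(tokens, expected_ids):
    """Map each token to True if its list position equals its stated ID."""
    return {
        token: tokens.index(token) == token_id
        for token, token_id in expected_ids.items()
    }


if __name__ == "__main__":
    for token, ok in position_matches_id(SPECIAL_TOKENS_HEAD, EXPECTED_IDS).items():
        print(f"{token}: {'match' if ok else 'MISMATCH'}")
```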