Ba2han committed on
Commit 6e15619 · verified · 1 Parent(s): d1b9a0b

Training in progress, step 2074
README.md CHANGED
@@ -1,5 +1,4 @@
 ---
-base_model: unsloth/Ministral-3-3B-Instruct-2512
 library_name: transformers
 model_name: m_augment
 tags:
@@ -12,7 +11,7 @@ licence: license
 
 # Model Card for m_augment
 
-This model is a fine-tuned version of [unsloth/Ministral-3-3B-Instruct-2512](https://huggingface.co/unsloth/Ministral-3-3B-Instruct-2512).
+This model is a fine-tuned version of [None](https://huggingface.co/None).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
 ## Quick start
@@ -28,7 +27,7 @@ print(output["generated_text"])
 
 ## Training procedure
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/batuhan409/huggingface/runs/204ezq5u)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/batuhan409/huggingface/runs/z2vn1gzd)
 
 
 This model was trained with SFT.
@@ -36,7 +35,7 @@ This model was trained with SFT.
 
 ### Framework versions
 
 - TRL: 0.24.0
-- Transformers: 5.0.0.dev0
+- Transformers: 5.0.0rc0
 - Pytorch: 2.10.0
 - Datasets: 4.3.0
 - Tokenizers: 0.22.2
chat_template.jinja CHANGED
@@ -1,121 +1,4 @@
-{#- Unsloth template fixes #}
-{#- Default system message if no system prompt is passed. #}
-{%- set default_system_message = 'You are Ministral-3-3B-Instruct-2512, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\nThe current date is {today}.\n\nWhen you\'re not sure about some information or when the user\'s request requires up-to-date or specific data, you must use the available tools to fetch the information. Do not hesitate to use tools whenever they can provide a more accurate or complete response. If no relevant tools are available, then clearly state that you don\'t have the information and avoid making up anything.\nIf the user\'s question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. "What are some good restaurants around me?" => "Where are you?" or "When is the next flight to Tokyo" => "Where do you travel from?").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. "yesterday" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.\n\n# TOOL CALLING INSTRUCTIONS\n\nYou may have access to tools that you can use to fetch information or perform actions. You must use these tools in the following situations:\n\n1. When the request requires up-to-date information.\n2. When the request requires specific data that you do not have in your knowledge base.\n3. When the request involves actions that you cannot perform without tools.\n\nAlways prioritize using tools to provide the most accurate and helpful response. If tools are not available, inform the user that you cannot perform the requested action at the moment.' %}
-
-{#- Begin of sequence token. #}
-{{- bos_token }}
-
-{#- Handle system prompt if it exists. #}
-{#- System prompt supports text content or text chunks. #}
-{%- if messages[0]['role'] == 'system' %}
-{{- '[SYSTEM_PROMPT]' -}}
-{%- if messages[0]['content'] is string %}
-{{- messages[0]['content'] -}}
-{%- else %}
-{%- for block in messages[0]['content'] %}
-{%- if block['type'] == 'text' %}
-{{- block['text'] }}
-{%- else %}
-{{- raise_exception('Only text chunks are supported in system message contents.') }}
-{%- endif %}
-{%- endfor %}
-{%- endif %}
-{{- '[/SYSTEM_PROMPT]' -}}
-{%- set loop_messages = messages[1:] %}
-{%- else %}
-{%- set loop_messages = messages %}
-{%- if default_system_message != '' %}
-{{- '[SYSTEM_PROMPT]' + default_system_message + '[/SYSTEM_PROMPT]' }}
-{%- endif %}
-{%- endif %}
-
-
-{#- Tools definition #}
-{%- set tools_definition = '' %}
-{%- set has_tools = false %}
-{%- if tools is defined and tools is not none and tools|length > 0 %}
-{%- set has_tools = true %}
-{%- set tools_definition = '[AVAILABLE_TOOLS]' + (tools| tojson) + '[/AVAILABLE_TOOLS]' %}
-{{- tools_definition }}
-{%- endif %}
-
-{#- Checks for alternating user/assistant messages. #}
-{%- set ns = namespace(index=0) %}
-{%- for message in loop_messages %}
-{%- if message.role == 'user' or (message.role == 'assistant' and (message.tool_calls is not defined or message.tool_calls is none or message.tool_calls | length == 0)) %}
-{%- if (message['role'] == 'user') != (ns.index % 2 == 0) %}
-{{- raise_exception('After the optional system message, conversation roles must alternate user and assistant roles except for tool calls and results.') }}
-{%- endif %}
-{%- set ns.index = ns.index + 1 %}
-{%- endif %}
-{%- endfor %}
-
-{#- Handle conversation messages. #}
-{%- for message in loop_messages %}
-
-{#- User messages supports text content or text and image chunks. #}
-{%- if message['role'] == 'user' %}
-{%- if message['content'] is string %}
-{{- '[INST]' + message['content'] + '[/INST]' }}
-{%- elif message['content'] | length > 0 %}
-{{- '[INST]' }}
-{%- if message['content'] | length == 2 %}
-{%- set blocks = message['content'] | sort(attribute='type') %}
-{%- else %}
-{%- set blocks = message['content'] %}
-{%- endif %}
-{%- for block in blocks %}
-{%- if block['type'] == 'text' %}
-{{- block['text'] }}
-{%- elif block['type'] in ['image', 'image_url'] %}
-{{- '[IMG]' }}
-{%- else %}
-{{- raise_exception('Only text, image and image_url chunks are supported in user message content.') }}
-{%- endif %}
-{%- endfor %}
-{{- '[/INST]' }}
-{%- else %}
-{{- raise_exception('User message must have a string or a list of chunks in content') }}
-{%- endif %}
-
-{#- Assistant messages supports text content or text and image chunks. #}
-{%- elif message['role'] == 'assistant' %}
-
-{%- if message['content'] is string %}
-{{- message['content'] }}
-{%- elif message['content'] is iterable and message['content'] | length > 0 %}
-{%- for block in message['content'] %}
-{%- if block['type'] == 'text' %}
-{{- block['text'] }}
-{%- else %}
-{{- raise_exception('Only text chunks are supported in assistant message contents.') }}
-{%- endif %}
-{%- endfor %}
-{%- endif %}
-
-{%- if message['tool_calls'] is defined and message['tool_calls'] is not none and message['tool_calls']|length > 0 %}
-{%- for tool in message['tool_calls'] %}
-{%- set arguments = tool['function']['arguments'] %}
-{%- if arguments is not string %}
-{%- set arguments = arguments|tojson|safe %}
-{%- elif arguments == '' %}
-{%- set arguments = '{}' %}
-{%- endif %}
-{{- '[TOOL_CALLS]' + tool['function']['name'] + '[ARGS]' + arguments }}
-{%- endfor %}
-{%- endif %}
-
-{#- End of sequence token for each assistant messages. #}
-{{- eos_token }}
-
-{#- Tool messages only supports text content. #}
-{%- elif message['role'] == 'tool' %}
-{{- '[TOOL_RESULTS]' + message['content']|string + '[/TOOL_RESULTS]' }}
-
-{#- Raise exception for unsupported roles. #}
-{%- else %}
-{{- raise_exception('Only user, assistant and tool roles are supported, got ' + message['role']) }}
-{%- endif %}
-{%- endfor %}
-
-{#- Copyright 2025-present Unsloth. Apache 2.0 License. #}
+{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '
+' + message['content'] + '<|im_end|>' + '
+'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
+' }}{% endif %}
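The replacement template is the standard ChatML layout: each message becomes `<|im_start|>{role}\n{content}<|im_end|>\n`, with an open assistant turn appended when `add_generation_prompt` is set. A minimal plain-Python sketch of the string it renders (`render_chatml` is a hypothetical helper written for illustration, not part of this repo):

```python
# Plain-Python sketch of what the new ChatML-style template renders.
# Message dicts use the usual transformers "role"/"content" keys.

def render_chatml(messages, add_generation_prompt=False):
    """Render messages the way the replacement Jinja template does."""
    out = ""
    for m in messages:
        out += "<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>" + "\n"
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        out += "<|im_start|>assistant\n"
    return out

prompt = render_chatml(
    [{"role": "user", "content": "Hi"}],
    add_generation_prompt=True,
)
print(prompt)
```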
config.json CHANGED
@@ -1,72 +1,72 @@
 {
   "architectures": [
-    "Mistral3ForConditionalGeneration"
+    "Qwen3ForCausalLM"
   ],
-  "bos_token_id": 1,
+  "attention_bias": false,
+  "attention_dropout": 0.0,
   "dtype": "bfloat16",
-  "eos_token_id": 2,
-  "image_token_index": 10,
-  "model_type": "mistral3",
-  "multimodal_projector_bias": false,
-  "pad_token_id": 11,
-  "projector_hidden_act": "gelu",
-  "spatial_merge_size": 2,
-  "text_config": {
-    "attention_dropout": 0.0,
-    "bos_token_id": 1,
-    "dtype": "bfloat16",
-    "eos_token_id": 2,
-    "head_dim": 128,
-    "hidden_act": "silu",
-    "hidden_size": 3072,
-    "initializer_range": 0.02,
-    "intermediate_size": 9216,
-    "max_position_embeddings": 262144,
-    "model_type": "ministral3",
-    "num_attention_heads": 32,
-    "num_hidden_layers": 26,
-    "num_key_value_heads": 8,
-    "pad_token_id": 11,
-    "rms_norm_eps": 1e-05,
-    "rope_parameters": {
-      "beta_fast": 32.0,
-      "beta_slow": 1.0,
-      "factor": 16.0,
-      "llama_4_scaling_beta": 0.1,
-      "mscale": 1.0,
-      "mscale_all_dim": 1.0,
-      "original_max_position_embeddings": 16384,
-      "rope_theta": 1000000.0,
-      "rope_type": "yarn",
-      "type": "yarn"
-    },
-    "sliding_window": null,
-    "tie_word_embeddings": true,
-    "use_cache": true,
-    "vocab_size": 131072
+  "eos_token_id": 151643,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 2560,
+  "initializer_range": 0.02,
+  "intermediate_size": 9728,
+  "layer_types": [
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention",
+    "full_attention"
+  ],
+  "max_position_embeddings": 32768,
+  "max_window_layers": 36,
+  "model_type": "qwen3",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 36,
+  "num_key_value_heads": 8,
+  "pad_token_id": 151654,
+  "rms_norm_eps": 1e-06,
+  "rope_parameters": {
+    "rope_theta": 1000000,
+    "rope_type": "default"
   },
-  "transformers_version": "5.0.0.dev0",
+  "sliding_window": null,
+  "tie_word_embeddings": true,
+  "transformers_version": "5.0.0rc0",
   "unsloth_fixed": true,
   "unsloth_version": "2026.1.4",
   "use_cache": false,
-  "vision_config": {
-    "attention_dropout": 0.0,
-    "dtype": "bfloat16",
-    "head_dim": 64,
-    "hidden_act": "silu",
-    "hidden_size": 1024,
-    "image_size": 1540,
-    "initializer_range": 0.02,
-    "intermediate_size": 4096,
-    "model_type": "pixtral",
-    "num_attention_heads": 16,
-    "num_channels": 3,
-    "num_hidden_layers": 24,
-    "patch_size": 14,
-    "rope_parameters": {
-      "rope_theta": 10000.0,
-      "rope_type": "default"
-    }
-  },
-  "vision_feature_layer": -1
+  "use_sliding_window": false,
+  "vocab_size": 151936
 }
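The new config describes a 36-layer Qwen3 text model with grouped-query attention: 32 query heads but only 8 key/value heads, and an explicitly set `head_dim` of 128 (so `head_dim` is not `hidden_size // num_attention_heads` here, since 2560 / 32 = 80). A rough sketch of the per-token KV-cache cost these values imply, assuming a bfloat16 cache:

```python
# Back-of-the-envelope KV-cache cost per token, using values copied
# from the new config.json above.
num_hidden_layers = 36
num_key_value_heads = 8   # GQA: fewer KV heads than the 32 query heads
head_dim = 128            # set explicitly in the config
bytes_per_value = 2       # bfloat16

# Keys + values (factor of 2), for every layer, for one token.
kv_bytes_per_token = (
    2 * num_hidden_layers * num_key_value_heads * head_dim * bytes_per_value
)
print(kv_bytes_per_token)  # 147456 bytes, i.e. 144 KiB per cached token
```

With the full 8 KV heads replaced by 32 (no GQA), the same arithmetic would give 4x the cache size, which is the usual motivation for grouped-query attention at long contexts.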
generation_config.json CHANGED
@@ -1,9 +1,9 @@
 {
-  "bos_token_id": 1,
   "eos_token_id": [
-    2
+    151643
   ],
-  "max_length": 262144,
-  "pad_token_id": 11,
-  "transformers_version": "5.0.0.dev0"
+  "max_length": 32768,
+  "max_new_tokens": 2048,
+  "pad_token_id": 151654,
+  "transformers_version": "5.0.0rc0"
 }
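A sketch of how these fields are consumed at decode time: generation appends tokens until `eos_token_id` appears or `max_new_tokens` is reached. The `next_token` callable below is a stand-in for a real model forward pass, shown only to illustrate the stopping rules, not the actual transformers implementation:

```python
# Toy greedy-decoding loop honoring the generation_config fields above.
EOS_TOKEN_ID = 151643     # from the new generation_config.json
MAX_NEW_TOKENS = 2048

def generate(next_token, prompt_ids):
    """Append tokens until EOS is produced or the budget is exhausted."""
    ids = list(prompt_ids)
    for _ in range(MAX_NEW_TOKENS):
        tok = next_token(ids)
        ids.append(tok)
        if tok == EOS_TOKEN_ID:
            break
    return ids

# Stand-in "model": emits two tokens, then EOS.
stream = iter([7, 7, EOS_TOKEN_ID])
out = generate(lambda ids: next(stream), [1, 2])
print(out)  # [1, 2, 7, 7, 151643]
```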
model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:562dc6fdffc591f34a5316eed069f65e9b0e3e5f1fd740c3d8685d88c4a1af59
+size 4990818672
model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e9ab7efa4932f00eeee5395b3f9d73afeb505aa06b2f1d6a71ab351a88d3b84
+size 3054163328
model.safetensors.index.json ADDED
@@ -0,0 +1,406 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "metadata": {
3
+ "total_parameters": 4022468096,
4
+ "total_size": 8044936192
5
+ },
6
+ "weight_map": {
7
+ "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
8
+ "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
9
+ "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
10
+ "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
11
+ "model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
12
+ "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
13
+ "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
14
+ "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
15
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
16
+ "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
17
+ "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
18
+ "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
19
+ "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
20
+ "model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
21
+ "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
22
+ "model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
23
+ "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
24
+ "model.layers.1.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
25
+ "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
26
+ "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
27
+ "model.layers.1.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
28
+ "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
29
+ "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
30
+ "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
31
+ "model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
32
+ "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
33
+ "model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
34
+ "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
35
+ "model.layers.10.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
36
+ "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
37
+ "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
38
+ "model.layers.10.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
39
+ "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
40
+ "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
41
+ "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
42
+ "model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
43
+ "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
44
+ "model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
45
+ "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
46
+ "model.layers.11.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
47
+ "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
48
+ "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
49
+ "model.layers.11.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
50
+ "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
51
+ "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
52
+ "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
53
+ "model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
54
+ "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
55
+ "model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
56
+ "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
57
+ "model.layers.12.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
58
+ "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
59
+ "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
60
+ "model.layers.12.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
61
+ "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
62
+ "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
63
+ "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
64
+ "model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
65
+ "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
66
+ "model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
67
+ "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
68
+ "model.layers.13.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
69
+ "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
70
+ "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
71
+ "model.layers.13.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
72
+ "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
73
+ "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
74
+ "model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
75
+ "model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
76
+ "model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
77
+ "model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
78
+ "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
79
+ "model.layers.14.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
80
+ "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
81
+ "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
82
+ "model.layers.14.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
83
+ "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
84
+ "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
85
+ "model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
86
+ "model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
87
+ "model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
88
+ "model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
89
+ "model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
90
+ "model.layers.15.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
91
+ "model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
92
+ "model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
93
+ "model.layers.15.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
94
+ "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
95
+ "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
96
+ "model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
97
+ "model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
98
+ "model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
99
+ "model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
100
+ "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
101
+ "model.layers.16.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
102
+ "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
103
+ "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
104
+ "model.layers.16.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
105
+ "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
106
+ "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
107
+ "model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
108
+ "model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
109
+ "model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
110
+ "model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
111
+ "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
112
+ "model.layers.17.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
113
+ "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
114
+ "model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
115
+ "model.layers.17.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
116
+ "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
117
+ "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
118
+ "model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
119
+ "model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
120
+ "model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
121
+ "model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
122
+ "model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
123
+ "model.layers.18.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
124
+ "model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
125
+ "model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
126
+ "model.layers.18.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
127
+ "model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
128
+ "model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
129
+ "model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
130
+ "model.layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
131
+ "model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
132
+ "model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
133
+ "model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
134
+ "model.layers.19.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
135
+ "model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
136
+ "model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
137
+ "model.layers.19.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
138
+ "model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
139
+ "model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
140
+ "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
141
+ "model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
142
+ "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
143
+ "model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
144
+ "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
145
+ "model.layers.2.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
146
+ "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
147
+ "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
148
+ "model.layers.2.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
149
+ "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
150
+ "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
151
+ "model.layers.20.input_layernorm.weight": "model-00001-of-00002.safetensors",
152
+ "model.layers.20.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
153
+ "model.layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
154
+ "model.layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
155
+ "model.layers.20.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
156
+ "model.layers.20.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
157
+ "model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
158
+ "model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
159
+ "model.layers.20.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
160
+ "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
161
+ "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
162
+ "model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
163
+ "model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
164
+ "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
165
+ "model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
166
+ "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
167
+ "model.layers.21.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
168
+ "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
169
+ "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
170
+ "model.layers.21.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
171
+ "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
172
+ "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
173
+ "model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
174
+ "model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
175
+ "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
176
+ "model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
177
+ "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
178
+ "model.layers.22.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
179
+ "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
180
+ "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
181
+ "model.layers.22.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
182
+ "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
183
+ "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
184
+ "model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
185
+ "model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
186
+ "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
187
+ "model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
188
+ "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
189
+ "model.layers.23.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
190
+ "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
191
+ "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
192
+ "model.layers.23.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
193
+ "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
194
+ "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
195
+ "model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
196
+ "model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
197
+ "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
198
+ "model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
199
+ "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
200
+ "model.layers.24.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
201
+ "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
202
+ "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
203
+ "model.layers.24.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
204
+ "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
205
+ "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
206
+ "model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
207
+ "model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
208
+ "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
209
+ "model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
210
+ "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
211
+ "model.layers.25.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
212
+ "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
213
+ "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
214
+ "model.layers.25.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
215
+ "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
216
+ "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
217
+ "model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
218
+ "model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
219
+ "model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
220
+ "model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
221
+ "model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
222
+ "model.layers.26.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
223
+ "model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
224
+ "model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
225
+ "model.layers.26.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
226
+ "model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
227
+ "model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
228
+ "model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
229
+ "model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
230
+ "model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
231
+ "model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
232
+ "model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
233
+ "model.layers.27.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
234
+ "model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
235
+ "model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
236
+ "model.layers.27.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
237
+ "model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
238
+ "model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
239
+ "model.layers.28.input_layernorm.weight": "model-00002-of-00002.safetensors",
240
+ "model.layers.28.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
241
+ "model.layers.28.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
242
+ "model.layers.28.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
243
+ "model.layers.28.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
244
+ "model.layers.28.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
245
+ "model.layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
246
+ "model.layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
247
+ "model.layers.28.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
248
+ "model.layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
249
+ "model.layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
250
+ "model.layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors",
251
+ "model.layers.29.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
252
+ "model.layers.29.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
253
+ "model.layers.29.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
254
+ "model.layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
255
+ "model.layers.29.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
256
+ "model.layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
257
+ "model.layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
258
+ "model.layers.29.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
259
+ "model.layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
260
+ "model.layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
261
+ "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
262
+ "model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
263
+ "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
264
+ "model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
265
+ "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
266
+ "model.layers.3.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
267
+ "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
268
+ "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
269
+ "model.layers.3.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
270
+ "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
271
+ "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
272
+ "model.layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors",
273
+ "model.layers.30.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
274
+ "model.layers.30.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
275
+ "model.layers.30.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
276
+ "model.layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
277
+ "model.layers.30.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
278
+ "model.layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
279
+ "model.layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
280
+ "model.layers.30.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
281
+ "model.layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
282
+ "model.layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
283
+ "model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
284
+ "model.layers.31.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
285
+ "model.layers.31.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
286
+ "model.layers.31.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
287
+ "model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
288
+ "model.layers.31.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
289
+ "model.layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
290
+ "model.layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
291
+ "model.layers.31.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
292
+ "model.layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
293
+ "model.layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
294
+ "model.layers.32.input_layernorm.weight": "model-00002-of-00002.safetensors",
295
+ "model.layers.32.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
296
+ "model.layers.32.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
297
+ "model.layers.32.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
298
+ "model.layers.32.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
299
+ "model.layers.32.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
300
+ "model.layers.32.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
301
+ "model.layers.32.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
302
+ "model.layers.32.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
303
+ "model.layers.32.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
304
+ "model.layers.32.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
305
+ "model.layers.33.input_layernorm.weight": "model-00002-of-00002.safetensors",
306
+ "model.layers.33.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
307
+ "model.layers.33.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
308
+ "model.layers.33.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
309
+ "model.layers.33.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
310
+ "model.layers.33.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
311
+ "model.layers.33.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
312
+ "model.layers.33.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
313
+ "model.layers.33.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
314
+ "model.layers.33.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
315
+ "model.layers.33.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
316
+ "model.layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors",
317
+ "model.layers.34.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
318
+ "model.layers.34.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
319
+ "model.layers.34.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
320
+ "model.layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
321
+ "model.layers.34.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
322
+ "model.layers.34.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
323
+ "model.layers.34.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
324
+ "model.layers.34.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
325
+ "model.layers.34.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
326
+ "model.layers.34.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
327
+ "model.layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors",
328
+ "model.layers.35.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
329
+ "model.layers.35.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
330
+ "model.layers.35.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
331
+ "model.layers.35.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
332
+ "model.layers.35.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
333
+ "model.layers.35.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
334
+ "model.layers.35.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
335
+ "model.layers.35.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
336
+ "model.layers.35.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
337
+ "model.layers.35.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
338
+ "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
339
+ "model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
340
+ "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
341
+ "model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
342
+ "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
343
+ "model.layers.4.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
344
+ "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
345
+ "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
346
+ "model.layers.4.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
347
+ "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
348
+ "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
349
+ "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
350
+ "model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
351
+ "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
352
+ "model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
353
+ "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
354
+ "model.layers.5.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
355
+ "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
356
+ "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
357
+ "model.layers.5.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
358
+ "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
359
+ "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
360
+ "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
361
+ "model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
362
+ "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
363
+ "model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
364
+ "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
365
+ "model.layers.6.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
366
+ "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
367
+ "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
368
+ "model.layers.6.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
369
+ "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
370
+ "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
371
+ "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
372
+ "model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
373
+ "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
374
+ "model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
375
+ "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
376
+ "model.layers.7.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
377
+ "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
378
+ "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
379
+ "model.layers.7.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
380
+ "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
381
+ "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
382
+ "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
383
+ "model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
384
+ "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
385
+ "model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
386
+ "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
387
+ "model.layers.8.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
388
+ "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
389
+ "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
390
+ "model.layers.8.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
391
+ "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
392
+ "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
393
+ "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
394
+ "model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
395
+ "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
396
+ "model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
397
+ "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
398
+ "model.layers.9.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
399
+ "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
400
+ "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
401
+ "model.layers.9.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
402
+ "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
403
+ "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
404
+ "model.norm.weight": "model-00002-of-00002.safetensors"
405
+ }
406
+ }
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:577575622324b2e099e2648be26bdeb5e5815ffe66d7004e9e3ddbf421db6bf1
- size 17078110
+ oid sha256:be75606093db2094d7cd20f3c2f385c212750648bd6ea4fb2bf507a6a4c55506
+ size 11422650
tokenizer_config.json CHANGED
@@ -1,1018 +1,32 @@
  {
- "add_prefix_space": null,
  "backend": "tokenizers",
- "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
- "eos_token": "</s>",
  "extra_special_tokens": [
- "<unk>",
- "<s>",
- "</s>",
- "[INST]",
- "[/INST]",
- "[AVAILABLE_TOOLS]",
- "[/AVAILABLE_TOOLS]",
- "[TOOL_RESULTS]",
- "[/TOOL_RESULTS]",
- "[TOOL_CALLS]",
- "[IMG]",
- "<pad>",
- "[IMG_BREAK]",
- "[IMG_END]",
- "[PREFIX]",
- "[MIDDLE]",
- "[SUFFIX]",
- "[SYSTEM_PROMPT]",
- "[/SYSTEM_PROMPT]",
- "[TOOL_CONTENT]",
- "<SPECIAL_20>",
- "<SPECIAL_21>",
- "<SPECIAL_22>",
- "<SPECIAL_23>",
- "[AUDIO]",
- "[BEGIN_AUDIO]",
- "<SPECIAL_26>",
- "<SPECIAL_27>",
- "<SPECIAL_28>",
- "<SPECIAL_29>",
- "<SPECIAL_30>",
- "<SPECIAL_31>",
- "[ARGS]",
- "[CALL_ID]",
- "[THINK]",
- "[/THINK]",
- "<SPECIAL_36>",
- "<SPECIAL_37>",
- "<SPECIAL_38>",
- "<SPECIAL_39>",
- "<SPECIAL_40>",
- "<SPECIAL_41>",
- "<SPECIAL_42>",
- "<SPECIAL_43>",
- "<SPECIAL_44>",
- "<SPECIAL_45>",
- "<SPECIAL_46>",
- "<SPECIAL_47>",
- "<SPECIAL_48>",
- "<SPECIAL_49>",
- "<SPECIAL_50>",
- "<SPECIAL_51>",
- "<SPECIAL_52>",
- "<SPECIAL_53>",
- "<SPECIAL_54>",
- "<SPECIAL_55>",
- "<SPECIAL_56>",
- "<SPECIAL_57>",
- "<SPECIAL_58>",
- "<SPECIAL_59>",
- "<SPECIAL_60>",
- "<SPECIAL_61>",
- "<SPECIAL_62>",
- "<SPECIAL_63>",
- "<SPECIAL_64>",
- "<SPECIAL_65>",
- "<SPECIAL_66>",
- "<SPECIAL_67>",
- "<SPECIAL_68>",
- "<SPECIAL_69>",
- "<SPECIAL_70>",
- "<SPECIAL_71>",
- "<SPECIAL_72>",
- "<SPECIAL_73>",
- "<SPECIAL_74>",
- "<SPECIAL_75>",
- "<SPECIAL_76>",
- "<SPECIAL_77>",
- "<SPECIAL_78>",
- "<SPECIAL_79>",
- "<SPECIAL_80>",
- "<SPECIAL_81>",
- "<SPECIAL_82>",
- "<SPECIAL_83>",
- "<SPECIAL_84>",
- "<SPECIAL_85>",
- "<SPECIAL_86>",
- "<SPECIAL_87>",
- "<SPECIAL_88>",
- "<SPECIAL_89>",
- "<SPECIAL_90>",
- "<SPECIAL_91>",
- "<SPECIAL_92>",
- "<SPECIAL_93>",
- "<SPECIAL_94>",
- "<SPECIAL_95>",
- "<SPECIAL_96>",
- "<SPECIAL_97>",
- "<SPECIAL_98>",
- "<SPECIAL_99>",
- "<SPECIAL_100>",
- "<SPECIAL_101>",
- "<SPECIAL_102>",
- "<SPECIAL_103>",
- "<SPECIAL_104>",
- "<SPECIAL_105>",
- "<SPECIAL_106>",
- "<SPECIAL_107>",
- "<SPECIAL_108>",
- "<SPECIAL_109>",
- "<SPECIAL_110>",
- "<SPECIAL_111>",
- "<SPECIAL_112>",
- "<SPECIAL_113>",
- "<SPECIAL_114>",
- "<SPECIAL_115>",
- "<SPECIAL_116>",
- "<SPECIAL_117>",
- "<SPECIAL_118>",
- "<SPECIAL_119>",
- "<SPECIAL_120>",
- "<SPECIAL_121>",
- "<SPECIAL_122>",
- "<SPECIAL_123>",
- "<SPECIAL_124>",
- "<SPECIAL_125>",
- "<SPECIAL_126>",
- "<SPECIAL_127>",
- "<SPECIAL_128>",
- "<SPECIAL_129>",
- "<SPECIAL_130>",
- "<SPECIAL_131>",
- "<SPECIAL_132>",
- "<SPECIAL_133>",
- "<SPECIAL_134>",
- "<SPECIAL_135>",
- "<SPECIAL_136>",
- "<SPECIAL_137>",
- "<SPECIAL_138>",
- "<SPECIAL_139>",
- "<SPECIAL_140>",
- "<SPECIAL_141>",
- "<SPECIAL_142>",
- "<SPECIAL_143>",
- "<SPECIAL_144>",
- "<SPECIAL_145>",
- "<SPECIAL_146>",
- "<SPECIAL_147>",
- "<SPECIAL_148>",
- "<SPECIAL_149>",
- "<SPECIAL_150>",
- "<SPECIAL_151>",
- "<SPECIAL_152>",
- "<SPECIAL_153>",
- "<SPECIAL_154>",
- "<SPECIAL_155>",
- "<SPECIAL_156>",
- "<SPECIAL_157>",
- "<SPECIAL_158>",
- "<SPECIAL_159>",
- "<SPECIAL_160>",
- "<SPECIAL_161>",
- "<SPECIAL_162>",
- "<SPECIAL_163>",
- "<SPECIAL_164>",
- "<SPECIAL_165>",
- "<SPECIAL_166>",
- "<SPECIAL_167>",
- "<SPECIAL_168>",
- "<SPECIAL_169>",
- "<SPECIAL_170>",
- "<SPECIAL_171>",
- "<SPECIAL_172>",
- "<SPECIAL_173>",
- "<SPECIAL_174>",
- "<SPECIAL_175>",
- "<SPECIAL_176>",
- "<SPECIAL_177>",
- "<SPECIAL_178>",
- "<SPECIAL_179>",
- "<SPECIAL_180>",
- "<SPECIAL_181>",
- "<SPECIAL_182>",
- "<SPECIAL_183>",
- "<SPECIAL_184>",
- "<SPECIAL_185>",
- "<SPECIAL_186>",
- "<SPECIAL_187>",
- "<SPECIAL_188>",
- "<SPECIAL_189>",
- "<SPECIAL_190>",
- "<SPECIAL_191>",
- "<SPECIAL_192>",
- "<SPECIAL_193>",
- "<SPECIAL_194>",
- "<SPECIAL_195>",
- "<SPECIAL_196>",
- "<SPECIAL_197>",
- "<SPECIAL_198>",
- "<SPECIAL_199>",
- "<SPECIAL_200>",
- "<SPECIAL_201>",
- "<SPECIAL_202>",
- "<SPECIAL_203>",
- "<SPECIAL_204>",
- "<SPECIAL_205>",
- "<SPECIAL_206>",
- "<SPECIAL_207>",
- "<SPECIAL_208>",
- "<SPECIAL_209>",
- "<SPECIAL_210>",
- "<SPECIAL_211>",
- "<SPECIAL_212>",
- "<SPECIAL_213>",
- "<SPECIAL_214>",
- "<SPECIAL_215>",
- "<SPECIAL_216>",
- "<SPECIAL_217>",
- "<SPECIAL_218>",
- "<SPECIAL_219>",
- "<SPECIAL_220>",
- "<SPECIAL_221>",
- "<SPECIAL_222>",
- "<SPECIAL_223>",
- "<SPECIAL_224>",
- "<SPECIAL_225>",
- "<SPECIAL_226>",
- "<SPECIAL_227>",
- "<SPECIAL_228>",
- "<SPECIAL_229>",
- "<SPECIAL_230>",
- "<SPECIAL_231>",
- "<SPECIAL_232>",
- "<SPECIAL_233>",
- "<SPECIAL_234>",
- "<SPECIAL_235>",
- "<SPECIAL_236>",
- "<SPECIAL_237>",
- "<SPECIAL_238>",
- "<SPECIAL_239>",
- "<SPECIAL_240>",
- "<SPECIAL_241>",
- "<SPECIAL_242>",
- "<SPECIAL_243>",
- "<SPECIAL_244>",
- "<SPECIAL_245>",
- "<SPECIAL_246>",
- "<SPECIAL_247>",
- "<SPECIAL_248>",
- "<SPECIAL_249>",
- "<SPECIAL_250>",
- "<SPECIAL_251>",
- "<SPECIAL_252>",
- "<SPECIAL_253>",
- "<SPECIAL_254>",
- "<SPECIAL_255>",
- "<SPECIAL_256>",
- "<SPECIAL_257>",
- "<SPECIAL_258>",
- "<SPECIAL_259>",
- "<SPECIAL_260>",
- "<SPECIAL_261>",
- "<SPECIAL_262>",
- "<SPECIAL_263>",
- "<SPECIAL_264>",
- "<SPECIAL_265>",
- "<SPECIAL_266>",
- "<SPECIAL_267>",
- "<SPECIAL_268>",
- "<SPECIAL_269>",
- "<SPECIAL_270>",
- "<SPECIAL_271>",
- "<SPECIAL_272>",
- "<SPECIAL_273>",
- "<SPECIAL_274>",
- "<SPECIAL_275>",
- "<SPECIAL_276>",
- "<SPECIAL_277>",
- "<SPECIAL_278>",
- "<SPECIAL_279>",
- "<SPECIAL_280>",
- "<SPECIAL_281>",
- "<SPECIAL_282>",
- "<SPECIAL_283>",
- "<SPECIAL_284>",
- "<SPECIAL_285>",
- "<SPECIAL_286>",
- "<SPECIAL_287>",
- "<SPECIAL_288>",
- "<SPECIAL_289>",
- "<SPECIAL_290>",
- "<SPECIAL_291>",
- "<SPECIAL_292>",
- "<SPECIAL_293>",
- "<SPECIAL_294>",
- "<SPECIAL_295>",
- "<SPECIAL_296>",
- "<SPECIAL_297>",
- "<SPECIAL_298>",
- "<SPECIAL_299>",
- "<SPECIAL_300>",
- "<SPECIAL_301>",
- "<SPECIAL_302>",
- "<SPECIAL_303>",
- "<SPECIAL_304>",
- "<SPECIAL_305>",
- "<SPECIAL_306>",
- "<SPECIAL_307>",
- "<SPECIAL_308>",
- "<SPECIAL_309>",
- "<SPECIAL_310>",
- "<SPECIAL_311>",
- "<SPECIAL_312>",
- "<SPECIAL_313>",
- "<SPECIAL_314>",
- "<SPECIAL_315>",
- "<SPECIAL_316>",
- "<SPECIAL_317>",
- "<SPECIAL_318>",
- "<SPECIAL_319>",
- "<SPECIAL_320>",
- "<SPECIAL_321>",
- "<SPECIAL_322>",
- "<SPECIAL_323>",
- "<SPECIAL_324>",
- "<SPECIAL_325>",
- "<SPECIAL_326>",
- "<SPECIAL_327>",
- "<SPECIAL_328>",
- "<SPECIAL_329>",
- "<SPECIAL_330>",
- "<SPECIAL_331>",
- "<SPECIAL_332>",
- "<SPECIAL_333>",
- "<SPECIAL_334>",
- "<SPECIAL_335>",
- "<SPECIAL_336>",
- "<SPECIAL_337>",
- "<SPECIAL_338>",
- "<SPECIAL_339>",
- "<SPECIAL_340>",
- "<SPECIAL_341>",
- "<SPECIAL_342>",
- "<SPECIAL_343>",
- "<SPECIAL_344>",
- "<SPECIAL_345>",
- "<SPECIAL_346>",
- "<SPECIAL_347>",
- "<SPECIAL_348>",
- "<SPECIAL_349>",
- "<SPECIAL_350>",
- "<SPECIAL_351>",
- "<SPECIAL_352>",
- "<SPECIAL_353>",
- "<SPECIAL_354>",
- "<SPECIAL_355>",
- "<SPECIAL_356>",
- "<SPECIAL_357>",
- "<SPECIAL_358>",
- "<SPECIAL_359>",
- "<SPECIAL_360>",
- "<SPECIAL_361>",
- "<SPECIAL_362>",
- "<SPECIAL_363>",
- "<SPECIAL_364>",
- "<SPECIAL_365>",
- "<SPECIAL_366>",
- "<SPECIAL_367>",
- "<SPECIAL_368>",
- "<SPECIAL_369>",
- "<SPECIAL_370>",
- "<SPECIAL_371>",
- "<SPECIAL_372>",
- "<SPECIAL_373>",
- "<SPECIAL_374>",
- "<SPECIAL_375>",
- "<SPECIAL_376>",
- "<SPECIAL_377>",
- "<SPECIAL_378>",
- "<SPECIAL_379>",
- "<SPECIAL_380>",
- "<SPECIAL_381>",
- "<SPECIAL_382>",
- "<SPECIAL_383>",
- "<SPECIAL_384>",
- "<SPECIAL_385>",
- "<SPECIAL_386>",
- "<SPECIAL_387>",
- "<SPECIAL_388>",
- "<SPECIAL_389>",
- "<SPECIAL_390>",
- "<SPECIAL_391>",
- "<SPECIAL_392>",
- "<SPECIAL_393>",
- "<SPECIAL_394>",
- "<SPECIAL_395>",
- "<SPECIAL_396>",
- "<SPECIAL_397>",
- "<SPECIAL_398>",
- "<SPECIAL_399>",
- "<SPECIAL_400>",
- "<SPECIAL_401>",
- "<SPECIAL_402>",
- "<SPECIAL_403>",
- "<SPECIAL_404>",
- "<SPECIAL_405>",
- "<SPECIAL_406>",
- "<SPECIAL_407>",
- "<SPECIAL_408>",
- "<SPECIAL_409>",
- "<SPECIAL_410>",
- "<SPECIAL_411>",
- "<SPECIAL_412>",
- "<SPECIAL_413>",
- "<SPECIAL_414>",
- "<SPECIAL_415>",
- "<SPECIAL_416>",
- "<SPECIAL_417>",
- "<SPECIAL_418>",
- "<SPECIAL_419>",
- "<SPECIAL_420>",
- "<SPECIAL_421>",
- "<SPECIAL_422>",
- "<SPECIAL_423>",
- "<SPECIAL_424>",
- "<SPECIAL_425>",
- "<SPECIAL_426>",
- "<SPECIAL_427>",
- "<SPECIAL_428>",
- "<SPECIAL_429>",
- "<SPECIAL_430>",
- "<SPECIAL_431>",
- "<SPECIAL_432>",
- "<SPECIAL_433>",
- "<SPECIAL_434>",
- "<SPECIAL_435>",
- "<SPECIAL_436>",
- "<SPECIAL_437>",
- "<SPECIAL_438>",
- "<SPECIAL_439>",
- "<SPECIAL_440>",
- "<SPECIAL_441>",
- "<SPECIAL_442>",
- "<SPECIAL_443>",
- "<SPECIAL_444>",
- "<SPECIAL_445>",
- "<SPECIAL_446>",
- "<SPECIAL_447>",
- "<SPECIAL_448>",
- "<SPECIAL_449>",
- "<SPECIAL_450>",
- "<SPECIAL_451>",
- "<SPECIAL_452>",
- "<SPECIAL_453>",
- "<SPECIAL_454>",
- "<SPECIAL_455>",
- "<SPECIAL_456>",
- "<SPECIAL_457>",
- "<SPECIAL_458>",
- "<SPECIAL_459>",
- "<SPECIAL_460>",
- "<SPECIAL_461>",
- "<SPECIAL_462>",
- "<SPECIAL_463>",
- "<SPECIAL_464>",
- "<SPECIAL_465>",
- "<SPECIAL_466>",
- "<SPECIAL_467>",
- "<SPECIAL_468>",
- "<SPECIAL_469>",
- "<SPECIAL_470>",
- "<SPECIAL_471>",
- "<SPECIAL_472>",
- "<SPECIAL_473>",
- "<SPECIAL_474>",
- "<SPECIAL_475>",
- "<SPECIAL_476>",
- "<SPECIAL_477>",
- "<SPECIAL_478>",
- "<SPECIAL_479>",
- "<SPECIAL_480>",
- "<SPECIAL_481>",
- "<SPECIAL_482>",
- "<SPECIAL_483>",
- "<SPECIAL_484>",
- "<SPECIAL_485>",
- "<SPECIAL_486>",
- "<SPECIAL_487>",
- "<SPECIAL_488>",
- "<SPECIAL_489>",
- "<SPECIAL_490>",
- "<SPECIAL_491>",
- "<SPECIAL_492>",
- "<SPECIAL_493>",
- "<SPECIAL_494>",
- "<SPECIAL_495>",
- "<SPECIAL_496>",
- "<SPECIAL_497>",
- "<SPECIAL_498>",
- "<SPECIAL_499>",
- "<SPECIAL_500>",
- "<SPECIAL_501>",
- "<SPECIAL_502>",
- "<SPECIAL_503>",
- "<SPECIAL_504>",
- "<SPECIAL_505>",
- "<SPECIAL_506>",
- "<SPECIAL_507>",
- "<SPECIAL_508>",
- "<SPECIAL_509>",
- "<SPECIAL_510>",
- "<SPECIAL_511>",
- "<SPECIAL_512>",
- "<SPECIAL_513>",
- "<SPECIAL_514>",
- "<SPECIAL_515>",
- "<SPECIAL_516>",
- "<SPECIAL_517>",
- "<SPECIAL_518>",
- "<SPECIAL_519>",
- "<SPECIAL_520>",
- "<SPECIAL_521>",
- "<SPECIAL_522>",
- "<SPECIAL_523>",
- "<SPECIAL_524>",
- "<SPECIAL_525>",
- "<SPECIAL_526>",
- "<SPECIAL_527>",
- "<SPECIAL_528>",
- "<SPECIAL_529>",
- "<SPECIAL_530>",
- "<SPECIAL_531>",
- "<SPECIAL_532>",
- "<SPECIAL_533>",
- "<SPECIAL_534>",
- "<SPECIAL_535>",
- "<SPECIAL_536>",
- "<SPECIAL_537>",
- "<SPECIAL_538>",
- "<SPECIAL_539>",
- "<SPECIAL_540>",
- "<SPECIAL_541>",
- "<SPECIAL_542>",
- "<SPECIAL_543>",
- "<SPECIAL_544>",
- "<SPECIAL_545>",
- "<SPECIAL_546>",
- "<SPECIAL_547>",
- "<SPECIAL_548>",
- "<SPECIAL_549>",
- "<SPECIAL_550>",
- "<SPECIAL_551>",
- "<SPECIAL_552>",
- "<SPECIAL_553>",
- "<SPECIAL_554>",
- "<SPECIAL_555>",
- "<SPECIAL_556>",
- "<SPECIAL_557>",
- "<SPECIAL_558>",
- "<SPECIAL_559>",
- "<SPECIAL_560>",
- "<SPECIAL_561>",
- "<SPECIAL_562>",
- "<SPECIAL_563>",
- "<SPECIAL_564>",
- "<SPECIAL_565>",
- "<SPECIAL_566>",
- "<SPECIAL_567>",
- "<SPECIAL_568>",
- "<SPECIAL_569>",
- "<SPECIAL_570>",
- "<SPECIAL_571>",
- "<SPECIAL_572>",
- "<SPECIAL_573>",
- "<SPECIAL_574>",
- "<SPECIAL_575>",
- "<SPECIAL_576>",
- "<SPECIAL_577>",
- "<SPECIAL_578>",
- "<SPECIAL_579>",
- "<SPECIAL_580>",
- "<SPECIAL_581>",
- "<SPECIAL_582>",
- "<SPECIAL_583>",
- "<SPECIAL_584>",
- "<SPECIAL_585>",
- "<SPECIAL_586>",
- "<SPECIAL_587>",
- "<SPECIAL_588>",
- "<SPECIAL_589>",
- "<SPECIAL_590>",
- "<SPECIAL_591>",
- "<SPECIAL_592>",
- "<SPECIAL_593>",
- "<SPECIAL_594>",
- "<SPECIAL_595>",
- "<SPECIAL_596>",
- "<SPECIAL_597>",
- "<SPECIAL_598>",
- "<SPECIAL_599>",
- "<SPECIAL_600>",
- "<SPECIAL_601>",
- "<SPECIAL_602>",
- "<SPECIAL_603>",
- "<SPECIAL_604>",
- "<SPECIAL_605>",
- "<SPECIAL_606>",
- "<SPECIAL_607>",
- "<SPECIAL_608>",
- "<SPECIAL_609>",
- "<SPECIAL_610>",
- "<SPECIAL_611>",
- "<SPECIAL_612>",
- "<SPECIAL_613>",
- "<SPECIAL_614>",
- "<SPECIAL_615>",
- "<SPECIAL_616>",
- "<SPECIAL_617>",
- "<SPECIAL_618>",
- "<SPECIAL_619>",
- "<SPECIAL_620>",
- "<SPECIAL_621>",
- "<SPECIAL_622>",
- "<SPECIAL_623>",
- "<SPECIAL_624>",
- "<SPECIAL_625>",
- "<SPECIAL_626>",
- "<SPECIAL_627>",
- "<SPECIAL_628>",
- "<SPECIAL_629>",
- "<SPECIAL_630>",
- "<SPECIAL_631>",
- "<SPECIAL_632>",
- "<SPECIAL_633>",
- "<SPECIAL_634>",
- "<SPECIAL_635>",
- "<SPECIAL_636>",
- "<SPECIAL_637>",
- "<SPECIAL_638>",
- "<SPECIAL_639>",
- "<SPECIAL_640>",
- "<SPECIAL_641>",
- "<SPECIAL_642>",
- "<SPECIAL_643>",
- "<SPECIAL_644>",
- "<SPECIAL_645>",
- "<SPECIAL_646>",
- "<SPECIAL_647>",
- "<SPECIAL_648>",
- "<SPECIAL_649>",
- "<SPECIAL_650>",
- "<SPECIAL_651>",
- "<SPECIAL_652>",
- "<SPECIAL_653>",
- "<SPECIAL_654>",
- "<SPECIAL_655>",
- "<SPECIAL_656>",
- "<SPECIAL_657>",
- "<SPECIAL_658>",
- "<SPECIAL_659>",
- "<SPECIAL_660>",
- "<SPECIAL_661>",
- "<SPECIAL_662>",
- "<SPECIAL_663>",
- "<SPECIAL_664>",
- "<SPECIAL_665>",
- "<SPECIAL_666>",
- "<SPECIAL_667>",
- "<SPECIAL_668>",
- "<SPECIAL_669>",
- "<SPECIAL_670>",
- "<SPECIAL_671>",
- "<SPECIAL_672>",
- "<SPECIAL_673>",
- "<SPECIAL_674>",
- "<SPECIAL_675>",
- "<SPECIAL_676>",
- "<SPECIAL_677>",
- "<SPECIAL_678>",
- "<SPECIAL_679>",
- "<SPECIAL_680>",
- "<SPECIAL_681>",
- "<SPECIAL_682>",
- "<SPECIAL_683>",
- "<SPECIAL_684>",
- "<SPECIAL_685>",
- "<SPECIAL_686>",
- "<SPECIAL_687>",
- "<SPECIAL_688>",
- "<SPECIAL_689>",
- "<SPECIAL_690>",
- "<SPECIAL_691>",
- "<SPECIAL_692>",
- "<SPECIAL_693>",
- "<SPECIAL_694>",
- "<SPECIAL_695>",
- "<SPECIAL_696>",
- "<SPECIAL_697>",
- "<SPECIAL_698>",
- "<SPECIAL_699>",
- "<SPECIAL_700>",
- "<SPECIAL_701>",
- "<SPECIAL_702>",
- "<SPECIAL_703>",
- "<SPECIAL_704>",
- "<SPECIAL_705>",
- "<SPECIAL_706>",
- "<SPECIAL_707>",
- "<SPECIAL_708>",
- "<SPECIAL_709>",
- "<SPECIAL_710>",
- "<SPECIAL_711>",
- "<SPECIAL_712>",
- "<SPECIAL_713>",
- "<SPECIAL_714>",
- "<SPECIAL_715>",
- "<SPECIAL_716>",
- "<SPECIAL_717>",
- "<SPECIAL_718>",
- "<SPECIAL_719>",
- "<SPECIAL_720>",
- "<SPECIAL_721>",
- "<SPECIAL_722>",
- "<SPECIAL_723>",
- "<SPECIAL_724>",
- "<SPECIAL_725>",
- "<SPECIAL_726>",
- "<SPECIAL_727>",
- "<SPECIAL_728>",
- "<SPECIAL_729>",
- "<SPECIAL_730>",
- "<SPECIAL_731>",
- "<SPECIAL_732>",
- "<SPECIAL_733>",
- "<SPECIAL_734>",
- "<SPECIAL_735>",
- "<SPECIAL_736>",
745
- "<SPECIAL_737>",
746
- "<SPECIAL_738>",
747
- "<SPECIAL_739>",
748
- "<SPECIAL_740>",
749
- "<SPECIAL_741>",
750
- "<SPECIAL_742>",
751
- "<SPECIAL_743>",
752
- "<SPECIAL_744>",
753
- "<SPECIAL_745>",
754
- "<SPECIAL_746>",
755
- "<SPECIAL_747>",
756
- "<SPECIAL_748>",
757
- "<SPECIAL_749>",
758
- "<SPECIAL_750>",
759
- "<SPECIAL_751>",
760
- "<SPECIAL_752>",
761
- "<SPECIAL_753>",
762
- "<SPECIAL_754>",
763
- "<SPECIAL_755>",
764
- "<SPECIAL_756>",
765
- "<SPECIAL_757>",
766
- "<SPECIAL_758>",
767
- "<SPECIAL_759>",
768
- "<SPECIAL_760>",
769
- "<SPECIAL_761>",
770
- "<SPECIAL_762>",
771
- "<SPECIAL_763>",
772
- "<SPECIAL_764>",
773
- "<SPECIAL_765>",
774
- "<SPECIAL_766>",
775
- "<SPECIAL_767>",
776
- "<SPECIAL_768>",
777
- "<SPECIAL_769>",
778
- "<SPECIAL_770>",
779
- "<SPECIAL_771>",
780
- "<SPECIAL_772>",
781
- "<SPECIAL_773>",
782
- "<SPECIAL_774>",
783
- "<SPECIAL_775>",
784
- "<SPECIAL_776>",
785
- "<SPECIAL_777>",
786
- "<SPECIAL_778>",
787
- "<SPECIAL_779>",
788
- "<SPECIAL_780>",
789
- "<SPECIAL_781>",
790
- "<SPECIAL_782>",
791
- "<SPECIAL_783>",
792
- "<SPECIAL_784>",
793
- "<SPECIAL_785>",
794
- "<SPECIAL_786>",
795
- "<SPECIAL_787>",
796
- "<SPECIAL_788>",
797
- "<SPECIAL_789>",
798
- "<SPECIAL_790>",
799
- "<SPECIAL_791>",
800
- "<SPECIAL_792>",
801
- "<SPECIAL_793>",
802
- "<SPECIAL_794>",
803
- "<SPECIAL_795>",
804
- "<SPECIAL_796>",
805
- "<SPECIAL_797>",
806
- "<SPECIAL_798>",
807
- "<SPECIAL_799>",
808
- "<SPECIAL_800>",
809
- "<SPECIAL_801>",
810
- "<SPECIAL_802>",
811
- "<SPECIAL_803>",
812
- "<SPECIAL_804>",
813
- "<SPECIAL_805>",
814
- "<SPECIAL_806>",
815
- "<SPECIAL_807>",
816
- "<SPECIAL_808>",
817
- "<SPECIAL_809>",
818
- "<SPECIAL_810>",
819
- "<SPECIAL_811>",
820
- "<SPECIAL_812>",
821
- "<SPECIAL_813>",
822
- "<SPECIAL_814>",
823
- "<SPECIAL_815>",
824
- "<SPECIAL_816>",
825
- "<SPECIAL_817>",
826
- "<SPECIAL_818>",
827
- "<SPECIAL_819>",
828
- "<SPECIAL_820>",
829
- "<SPECIAL_821>",
830
- "<SPECIAL_822>",
831
- "<SPECIAL_823>",
832
- "<SPECIAL_824>",
833
- "<SPECIAL_825>",
834
- "<SPECIAL_826>",
835
- "<SPECIAL_827>",
836
- "<SPECIAL_828>",
837
- "<SPECIAL_829>",
838
- "<SPECIAL_830>",
839
- "<SPECIAL_831>",
840
- "<SPECIAL_832>",
841
- "<SPECIAL_833>",
842
- "<SPECIAL_834>",
843
- "<SPECIAL_835>",
844
- "<SPECIAL_836>",
845
- "<SPECIAL_837>",
846
- "<SPECIAL_838>",
847
- "<SPECIAL_839>",
848
- "<SPECIAL_840>",
849
- "<SPECIAL_841>",
850
- "<SPECIAL_842>",
851
- "<SPECIAL_843>",
852
- "<SPECIAL_844>",
853
- "<SPECIAL_845>",
854
- "<SPECIAL_846>",
855
- "<SPECIAL_847>",
856
- "<SPECIAL_848>",
857
- "<SPECIAL_849>",
858
- "<SPECIAL_850>",
859
- "<SPECIAL_851>",
860
- "<SPECIAL_852>",
861
- "<SPECIAL_853>",
862
- "<SPECIAL_854>",
863
- "<SPECIAL_855>",
864
- "<SPECIAL_856>",
865
- "<SPECIAL_857>",
866
- "<SPECIAL_858>",
867
- "<SPECIAL_859>",
868
- "<SPECIAL_860>",
869
- "<SPECIAL_861>",
870
- "<SPECIAL_862>",
871
- "<SPECIAL_863>",
872
- "<SPECIAL_864>",
873
- "<SPECIAL_865>",
874
- "<SPECIAL_866>",
875
- "<SPECIAL_867>",
876
- "<SPECIAL_868>",
877
- "<SPECIAL_869>",
878
- "<SPECIAL_870>",
879
- "<SPECIAL_871>",
880
- "<SPECIAL_872>",
881
- "<SPECIAL_873>",
882
- "<SPECIAL_874>",
883
- "<SPECIAL_875>",
884
- "<SPECIAL_876>",
885
- "<SPECIAL_877>",
886
- "<SPECIAL_878>",
887
- "<SPECIAL_879>",
888
- "<SPECIAL_880>",
889
- "<SPECIAL_881>",
890
- "<SPECIAL_882>",
891
- "<SPECIAL_883>",
892
- "<SPECIAL_884>",
893
- "<SPECIAL_885>",
894
- "<SPECIAL_886>",
895
- "<SPECIAL_887>",
896
- "<SPECIAL_888>",
897
- "<SPECIAL_889>",
898
- "<SPECIAL_890>",
899
- "<SPECIAL_891>",
900
- "<SPECIAL_892>",
901
- "<SPECIAL_893>",
902
- "<SPECIAL_894>",
903
- "<SPECIAL_895>",
904
- "<SPECIAL_896>",
905
- "<SPECIAL_897>",
906
- "<SPECIAL_898>",
907
- "<SPECIAL_899>",
908
- "<SPECIAL_900>",
909
- "<SPECIAL_901>",
910
- "<SPECIAL_902>",
911
- "<SPECIAL_903>",
912
- "<SPECIAL_904>",
913
- "<SPECIAL_905>",
914
- "<SPECIAL_906>",
915
- "<SPECIAL_907>",
916
- "<SPECIAL_908>",
917
- "<SPECIAL_909>",
918
- "<SPECIAL_910>",
919
- "<SPECIAL_911>",
920
- "<SPECIAL_912>",
921
- "<SPECIAL_913>",
922
- "<SPECIAL_914>",
923
- "<SPECIAL_915>",
924
- "<SPECIAL_916>",
925
- "<SPECIAL_917>",
926
- "<SPECIAL_918>",
927
- "<SPECIAL_919>",
928
- "<SPECIAL_920>",
929
- "<SPECIAL_921>",
930
- "<SPECIAL_922>",
931
- "<SPECIAL_923>",
932
- "<SPECIAL_924>",
933
- "<SPECIAL_925>",
934
- "<SPECIAL_926>",
935
- "<SPECIAL_927>",
936
- "<SPECIAL_928>",
937
- "<SPECIAL_929>",
938
- "<SPECIAL_930>",
939
- "<SPECIAL_931>",
940
- "<SPECIAL_932>",
941
- "<SPECIAL_933>",
942
- "<SPECIAL_934>",
943
- "<SPECIAL_935>",
944
- "<SPECIAL_936>",
945
- "<SPECIAL_937>",
946
- "<SPECIAL_938>",
947
- "<SPECIAL_939>",
948
- "<SPECIAL_940>",
949
- "<SPECIAL_941>",
950
- "<SPECIAL_942>",
951
- "<SPECIAL_943>",
952
- "<SPECIAL_944>",
953
- "<SPECIAL_945>",
954
- "<SPECIAL_946>",
955
- "<SPECIAL_947>",
956
- "<SPECIAL_948>",
957
- "<SPECIAL_949>",
958
- "<SPECIAL_950>",
959
- "<SPECIAL_951>",
960
- "<SPECIAL_952>",
961
- "<SPECIAL_953>",
962
- "<SPECIAL_954>",
963
- "<SPECIAL_955>",
964
- "<SPECIAL_956>",
965
- "<SPECIAL_957>",
966
- "<SPECIAL_958>",
967
- "<SPECIAL_959>",
968
- "<SPECIAL_960>",
969
- "<SPECIAL_961>",
970
- "<SPECIAL_962>",
971
- "<SPECIAL_963>",
972
- "<SPECIAL_964>",
973
- "<SPECIAL_965>",
974
- "<SPECIAL_966>",
975
- "<SPECIAL_967>",
976
- "<SPECIAL_968>",
977
- "<SPECIAL_969>",
978
- "<SPECIAL_970>",
979
- "<SPECIAL_971>",
980
- "<SPECIAL_972>",
981
- "<SPECIAL_973>",
982
- "<SPECIAL_974>",
983
- "<SPECIAL_975>",
984
- "<SPECIAL_976>",
985
- "<SPECIAL_977>",
986
- "<SPECIAL_978>",
987
- "<SPECIAL_979>",
988
- "<SPECIAL_980>",
989
- "<SPECIAL_981>",
990
- "<SPECIAL_982>",
991
- "<SPECIAL_983>",
992
- "<SPECIAL_984>",
993
- "<SPECIAL_985>",
994
- "<SPECIAL_986>",
995
- "<SPECIAL_987>",
996
- "<SPECIAL_988>",
997
- "<SPECIAL_989>",
998
- "<SPECIAL_990>",
999
- "<SPECIAL_991>",
1000
- "<SPECIAL_992>",
1001
- "<SPECIAL_993>",
1002
- "<SPECIAL_994>",
1003
- "<SPECIAL_995>",
1004
- "<SPECIAL_996>",
1005
- "<SPECIAL_997>",
1006
- "<SPECIAL_998>",
1007
- "<SPECIAL_999>"
1008
  ],
1009
- "is_local": false,
1010
- "model_max_length": 262144,
1011
  "model_specific_special_tokens": {},
1012
- "pad_token": "<pad>",
1013
  "padding_side": "right",
1014
- "processor_class": "PixtralProcessor",
1015
- "tokenizer_class": "TokenizersBackend",
1016
- "unk_token": "<unk>",
1017
- "use_default_system_prompt": false
1018
  }
 
  {
+ "add_prefix_space": false,
+ "additional_special_tokens": null,
  "backend": "tokenizers",
+ "bos_token": null,
  "clean_up_tokenization_spaces": false,
+ "eos_token": "<|endoftext|>",
+ "errors": "replace",
  "extra_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
  ],
+ "is_local": true,
+ "model_max_length": 32768,
  "model_specific_special_tokens": {},
+ "pad_token": "<|vision_pad|>",
  "padding_side": "right",
+ "split_special_tokens": false,
+ "tokenizer_class": "Qwen2Tokenizer",
+ "unk_token": null
  }
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ba6238a3a64655887dcb961bda9bb85062dfb0e72c2733bd6f8c8557449df766
+ oid sha256:25a337fecbc72e6682bf3825f64d837b02914079469b6ef65ca41926f41735f3
  size 5713