danielcherubini committed
Commit 2ae9f36 · verified · 1 Parent(s): 0bebe4e

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,139 +1,209 @@
  ---
- license: apache-2.0
  base_model: Qwen/Qwen3.5-9B
- tags:
- - qwen3.5
- - code
- - tool-calling
- - lora
- - sft
- - unsloth
- - reasoning
- - chain-of-thought
- datasets:
- - nohurry/Opus-4.6-Reasoning-3000x-filtered
- - Roman1111111/claude-opus-4.6-10000x
- - TeichAI/claude-4.5-opus-high-reasoning-250x
- - Jackrong/Qwen3.5-reasoning-700x
- - togethercomputer/CoderForge-Preview
- language:
- - en
  pipeline_tag: text-generation
  ---

- # Qwen3.5-DeltaCoder-9B

- A LoRA fine-tune of [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) trained to improve structured tool-call generation (JSON formatting) for use in coding agents like OpenCode, Pi, and Cline.

- The fine-tune builds on top of [Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2](https://huggingface.co/Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2), a reasoning distillation of Qwen3.5-9B trained on Claude 4.6 Opus reasoning traces. All datasets used across the full training lineage are listed above.

- ## Training Lineage

- ```
- Qwen/Qwen3.5-9B-Base
- └─ Qwen/Qwen3.5-9B (instruction tuned)
-    └─ Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2
-       (SFT on Claude 4.6 Opus reasoning traces for efficient chain-of-thought)
-       Datasets: nohurry/Opus-4.6-Reasoning-3000x-filtered,
-                 Roman1111111/claude-opus-4.6-10000x,
-                 TeichAI/claude-4.5-opus-high-reasoning-250x,
-                 Jackrong/Qwen3.5-reasoning-700x
-       └─ danielcherubini/Qwen3.5-DeltaCoder-9B ← this model
-          (LoRA SFT on CoderForge-Preview for tool-call reliability)
-          Dataset: togethercomputer/CoderForge-Preview
- ```

  ## Training Details

- | Parameter | Value |
- |-----------|-------|
- | Base model | Qwen3.5-9B (hybrid GDN architecture) |
- | Method | LoRA (r=64, alpha=32) |
- | Dataset | CoderForge-Preview `filtered_reward1` (50K subset) |
- | Sequence length | 4096 |
- | Batch size | 2 (effective 16 with grad accum 8) |
- | Learning rate | 1e-4 (cosine schedule) |
- | Epochs | 1 |
- | Optimizer | AdamW |
- | Precision | BF16 |
- | Hardware | NVIDIA H200 140GB |
- | Training time | ~10 hours |
- | Framework | Unsloth 2026.3.10 + HuggingFace Transformers 5.3.0 |

- ### LoRA Target Modules

- All major weight matrices are adapted:
- - **Full Attention** (8/32 layers): `q_proj`, `k_proj`, `v_proj`, `o_proj`
- - **Gated Delta Net** (24/32 layers): `in_proj_qkv`, `in_proj_z`, `in_proj_b`, `in_proj_a`, `out_proj`
- - **MLP** (all 32 layers): `gate_proj`, `up_proj`, `down_proj`

- ### Training Loss

- Final training loss: ~0.94 (average: 1.268), decreasing steadily over training.

- ## Recommended Sampling Settings

- Validated through testing with [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp) and [Kronk](https://github.com/danielcherubini/kronk) on an RTX 3080 10GB.

- | Profile | temperature | top_k | top_p | min_p | presence_penalty |
- |---------|-------------|-------|-------|-------|-----------------|
- | **Coding** | 0.6 | 20 | 0.95 | 0.0 | 0.0 |
- | **Chat** | 1.0 | 20 | 0.95 | 0.0 | 1.5 |

- > [!WARNING]
- > **Do not use temperature below 0.5** — low temperatures (e.g., 0.3) cause deterministic looping in multi-turn agentic use, where the model repeats the same tool call indefinitely.

- ### KV Cache Quantization

- For VRAM-constrained GPUs, use quantized KV cache keys/values:

- | Context Length | KV Cache | VRAM (Q4_K_M) | Generation Speed |
- |---------------|----------|---------------|-----------------|
- | 102,400 | f16/q4_0 | ~8.5 GB | ~111 tok/s |
- | 131,072 | f16/q4_0 | ~9.1 GB | ~110 tok/s |

- ```bash
- # llama.cpp / ik_llama.cpp flags
- -ctk f16 -ctv q4_0
- ```

- ## Usage

- ### With PEFT

- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- from peft import PeftModel
-
- base_model = AutoModelForCausalLM.from_pretrained(
-     "Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2",
-     trust_remote_code=True,
- )
- model = PeftModel.from_pretrained(base_model, "danielcherubini/Qwen3.5-DeltaCoder-9B")
- tokenizer = AutoTokenizer.from_pretrained("danielcherubini/Qwen3.5-DeltaCoder-9B")
- ```

- ### GGUF (Ollama / llama.cpp / LM Studio)

- Pre-quantized GGUF files available at [danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF](https://huggingface.co/danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF).

- ## Benchmarks

- | Model | HumanEval | HumanEval+ |
- |-------|-----------|------------|
- | Jackrong v2 (base) | 53.7% | — |
- | **DeltaCoder-9B** (temp=0.6) | **50.6%** | **49.4%** |
- | DeltaCoder-9B (greedy) | 43.9% | 42.1% |

- Terminal-Bench easy tasks: **2/4 (50%)** — use recommended sampling settings (temp=0.6).

- ## Intended Use

- This model is designed for AI coding agents that rely on structured tool calls (JSON function calling). It improves the base model's ability to generate well-formed tool-call responses in multi-turn agent trajectories.

- ## Acknowledgements

- - [Unsloth](https://unsloth.ai) for Qwen3.5 training support
- - [Together AI](https://together.ai) for the CoderForge dataset
- - [Jackrong](https://huggingface.co/Jackrong) for the reasoning distillation
- - [Qwen](https://huggingface.co/Qwen) for the base model
  ---
  base_model: Qwen/Qwen3.5-9B
+ library_name: peft
  pipeline_tag: text-generation
+ tags:
+ - base_model:adapter:Qwen/Qwen3.5-9B
+ - dpo
+ - lora
+ - transformers
+ - trl
  ---

+ # Model Card for Model ID

+ <!-- Provide a quick summary of what the model is/does. -->

+ ## Model Details

+ ### Model Description

+ <!-- Provide a longer summary of what this model is. -->

+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]

+ ### Model Sources [optional]

+ <!-- Provide the basic links for the model. -->

+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]

+ ## Uses

+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

+ ### Direct Use

+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

+ [More Information Needed]

+ ### Downstream Use [optional]

+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

+ [More Information Needed]

+ ### Out-of-Scope Use

+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

+ [More Information Needed]

+ ## Bias, Risks, and Limitations

+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->

+ [More Information Needed]

+ ### Recommendations

+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

+ ## How to Get Started with the Model

+ Use the code below to get started with the model.

+ [More Information Needed]

  ## Training Details

+ ### Training Data

+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

+ [More Information Needed]

+ ### Training Procedure

+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

+ #### Preprocessing [optional]

+ [More Information Needed]

+ #### Training Hyperparameters

+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

+ #### Speeds, Sizes, Times [optional]

+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

+ [More Information Needed]

+ ## Evaluation

+ <!-- This section describes the evaluation protocols and provides the results. -->

+ ### Testing Data, Factors & Metrics

+ #### Testing Data

+ <!-- This should link to a Dataset Card if possible. -->

+ [More Information Needed]

+ #### Factors

+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

+ [More Information Needed]

+ #### Metrics

+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->

+ [More Information Needed]

+ ### Results

+ [More Information Needed]

+ #### Summary

+ ## Model Examination [optional]

+ <!-- Relevant interpretability work for the model goes here -->

+ [More Information Needed]

+ ## Environmental Impact

+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]

+ ## Technical Specifications [optional]

+ ### Model Architecture and Objective

+ [More Information Needed]

+ ### Compute Infrastructure

+ [More Information Needed]

+ #### Hardware

+ [More Information Needed]

+ #### Software

+ [More Information Needed]

+ ## Citation [optional]

+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

+ **BibTeX:**

+ [More Information Needed]

+ **APA:**

+ [More Information Needed]

+ ## Glossary [optional]

+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

+ [More Information Needed]

+ ## More Information [optional]

+ [More Information Needed]

+ ## Model Card Authors [optional]

+ [More Information Needed]

+ ## Model Card Contact

+ [More Information Needed]
+ ### Framework versions

+ - PEFT 0.18.1
 
 
 
adapter_config.json CHANGED
@@ -2,12 +2,8 @@
   "alora_invocation_tokens": null,
   "alpha_pattern": {},
   "arrow_config": null,
-  "auto_mapping": {
-    "base_model_class": "Qwen3_5ForConditionalGeneration",
-    "parent_library": "transformers.models.qwen3_5.modeling_qwen3_5",
-    "unsloth_fixed": true
-  },
-  "base_model_name_or_path": "Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2",
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen3.5-9B",
   "bias": "none",
   "corda_config": null,
   "ensure_weight_tying": false,
@@ -29,22 +25,22 @@
   "peft_type": "LORA",
   "peft_version": "0.18.1",
   "qalora_group_size": 16,
-  "r": 64,
+  "r": 32,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "in_proj_qkv",
+    "in_proj_z",
+    "q_proj",
+    "k_proj",
     "in_proj_a",
     "up_proj",
-    "k_proj",
-    "in_proj_z",
-    "gate_proj",
-    "down_proj",
     "in_proj_b",
-    "o_proj",
-    "q_proj",
-    "in_proj_qkv",
+    "out_proj",
     "v_proj",
-    "out_proj"
+    "o_proj",
+    "gate_proj",
+    "down_proj"
   ],
   "target_parameters": null,
   "task_type": "CAUSAL_LM",
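The adapter_config.json diff halves the LoRA rank from `r: 64` to `r: 32` while keeping `lora_alpha: 32`. Since a LoRA pair for one weight matrix adds `r * d_in + d_out * r` parameters, halving `r` halves the adapter's parameter count. A minimal sketch of that scaling, using illustrative dimensions (not Qwen3.5-9B's actual layer shapes):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r

# Hypothetical square projection, for illustration only.
d_in = d_out = 4096

print(lora_params(d_in, d_out, 64))  # 524288
print(lora_params(d_in, d_out, 32))  # 262144, exactly half
```

The same linear scaling holds for every adapted matrix, so the whole adapter shrinks by the same factor regardless of the individual layer shapes.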
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:87de215f5f35a302a4cad1e4db1f2971a2363b54c1d78b1da645cd1593ac8b9e
- size 692529024
+ oid sha256:8ecb70be8d733f91125d4dffdb5747cdca9425915f270e7851a5d8c6f6a2b2fa
+ size 346294736
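The new adapter weights file is almost exactly half the old size, which is consistent with the rank halving (LoRA parameter count scales linearly with `r`). A quick check using the two LFS pointer sizes from this diff:

```python
# Sizes taken from the LFS pointers in this commit's diff.
old_size = 692_529_024  # adapter trained with r=64
new_size = 346_294_736  # adapter trained with r=32

ratio = new_size / old_size
print(f"{ratio:.4f}")  # 0.5000
```

The small deviation from exactly 0.5 is expected: safetensors headers and any non-LoRA tensors in the file do not scale with the rank.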
chat_template.jinja CHANGED
@@ -1,34 +1,90 @@
- {%- if tools %}
- {{- '<|im_start|>system\n' }}
- {%- if messages[0].role == 'system' %}
- {{- messages[0].content + '\n\n' }}
  {%- endif %}
- {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
  {%- for tool in tools %}
  {{- "\n" }}
  {{- tool | tojson }}
  {%- endfor %}
- {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
  {%- else %}
  {%- if messages[0].role == 'system' %}
- {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
  {%- endif %}
  {%- endif %}
  {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
  {%- for message in messages[::-1] %}
  {%- set index = (messages|length - 1) - loop.index0 %}
- {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
- {%- set ns.multi_step_tool = false %}
- {%- set ns.last_query_index = index %}
  {%- endif %}
  {%- endfor %}
  {%- for message in messages %}
- {%- if message.content is string %}
- {%- set content = message.content %}
- {%- else %}
- {%- set content = '' %}
- {%- endif %}
- {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
  {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
  {%- elif message.role == "assistant" %}
  {%- set reasoning_content = '' %}
@@ -40,49 +96,59 @@
  {%- set content = content.split('</think>')[-1].lstrip('\n') %}
  {%- endif %}
  {%- endif %}
  {%- if loop.index0 > ns.last_query_index %}
- {%- if loop.last or (not loop.last and reasoning_content) %}
- {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
- {%- else %}
- {{- '<|im_start|>' + message.role + '\n' + content }}
- {%- endif %}
  {%- else %}
  {{- '<|im_start|>' + message.role + '\n' + content }}
  {%- endif %}
- {%- if message.tool_calls %}
  {%- for tool_call in message.tool_calls %}
- {%- if (loop.first and content) or (not loop.first) %}
- {{- '\n' }}
- {%- endif %}
- {%- if tool_call.function %}
  {%- set tool_call = tool_call.function %}
  {%- endif %}
- {{- '<tool_call>\n{"name": "' }}
- {{- tool_call.name }}
- {{- '", "arguments": ' }}
- {%- if tool_call.arguments is string %}
- {{- tool_call.arguments }}
  {%- else %}
- {{- tool_call.arguments | tojson }}
  {%- endif %}
- {{- '}\n</tool_call>' }}
  {%- endfor %}
  {%- endif %}
  {{- '<|im_end|>\n' }}
  {%- elif message.role == "tool" %}
- {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
  {{- '<|im_start|>user' }}
  {%- endif %}
  {{- '\n<tool_response>\n' }}
  {{- content }}
  {{- '\n</tool_response>' }}
- {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
  {{- '<|im_end|>\n' }}
  {%- endif %}
  {%- endif %}
  {%- endfor %}
  {%- if add_generation_prompt %}
- {{- '<|im_start|>assistant
- <think>
- ' }}
  {%- endif %}

+ {%- set image_count = namespace(value=0) %}
+ {%- set video_count = namespace(value=0) %}
+ {%- macro render_content(content, do_vision_count, is_system_content=false) %}
+ {%- if content is string %}
+ {{- content }}
+ {%- elif content is iterable and content is not mapping %}
+ {%- for item in content %}
+ {%- if 'image' in item or 'image_url' in item or item.type == 'image' %}
+ {%- if is_system_content %}
+ {{- raise_exception('System message cannot contain images.') }}
+ {%- endif %}
+ {%- if do_vision_count %}
+ {%- set image_count.value = image_count.value + 1 %}
+ {%- endif %}
+ {%- if add_vision_id %}
+ {{- 'Picture ' ~ image_count.value ~ ': ' }}
+ {%- endif %}
+ {{- '<|vision_start|><|image_pad|><|vision_end|>' }}
+ {%- elif 'video' in item or item.type == 'video' %}
+ {%- if is_system_content %}
+ {{- raise_exception('System message cannot contain videos.') }}
+ {%- endif %}
+ {%- if do_vision_count %}
+ {%- set video_count.value = video_count.value + 1 %}
+ {%- endif %}
+ {%- if add_vision_id %}
+ {{- 'Video ' ~ video_count.value ~ ': ' }}
+ {%- endif %}
+ {{- '<|vision_start|><|video_pad|><|vision_end|>' }}
+ {%- elif 'text' in item %}
+ {{- item.text }}
+ {%- else %}
+ {{- raise_exception('Unexpected item type in content.') }}
+ {%- endif %}
+ {%- endfor %}
+ {%- elif content is none or content is undefined %}
+ {{- '' }}
+ {%- else %}
+ {{- raise_exception('Unexpected content type.') }}
  {%- endif %}
+ {%- endmacro %}
+ {%- if not messages %}
+ {{- raise_exception('No messages provided.') }}
+ {%- endif %}
+ {%- if tools and tools is iterable and tools is not mapping %}
+ {{- '<|im_start|>system\n' }}
+ {{- "# Tools\n\nYou have access to the following functions:\n\n<tools>" }}
  {%- for tool in tools %}
  {{- "\n" }}
  {{- tool | tojson }}
  {%- endfor %}
+ {{- "\n</tools>" }}
+ {{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
+ {%- if messages[0].role == 'system' %}
+ {%- set content = render_content(messages[0].content, false, true)|trim %}
+ {%- if content %}
+ {{- '\n\n' + content }}
+ {%- endif %}
+ {%- endif %}
+ {{- '<|im_end|>\n' }}
  {%- else %}
  {%- if messages[0].role == 'system' %}
+ {%- set content = render_content(messages[0].content, false, true)|trim %}
+ {{- '<|im_start|>system\n' + content + '<|im_end|>\n' }}
  {%- endif %}
  {%- endif %}
  {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
  {%- for message in messages[::-1] %}
  {%- set index = (messages|length - 1) - loop.index0 %}
+ {%- if ns.multi_step_tool and message.role == "user" %}
+ {%- set content = render_content(message.content, false)|trim %}
+ {%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %}
+ {%- set ns.multi_step_tool = false %}
+ {%- set ns.last_query_index = index %}
+ {%- endif %}
  {%- endif %}
  {%- endfor %}
+ {%- if ns.multi_step_tool %}
+ {{- raise_exception('No user query found in messages.') }}
+ {%- endif %}
  {%- for message in messages %}
+ {%- set content = render_content(message.content, true)|trim %}
+ {%- if message.role == "system" %}
+ {%- if not loop.first %}
+ {{- raise_exception('System message must be at the beginning.') }}
+ {%- endif %}
+ {%- elif message.role == "user" %}
  {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
  {%- elif message.role == "assistant" %}
  {%- set reasoning_content = '' %}
  {%- set content = content.split('</think>')[-1].lstrip('\n') %}
  {%- endif %}
  {%- endif %}
+ {%- set reasoning_content = reasoning_content|trim %}
  {%- if loop.index0 > ns.last_query_index %}
+ {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
  {%- else %}
  {{- '<|im_start|>' + message.role + '\n' + content }}
  {%- endif %}
+ {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %}
  {%- for tool_call in message.tool_calls %}
+ {%- if tool_call.function is defined %}
  {%- set tool_call = tool_call.function %}
  {%- endif %}
+ {%- if loop.first %}
+ {%- if content|trim %}
+ {{- '\n\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
+ {%- else %}
+ {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
+ {%- endif %}
  {%- else %}
+ {{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
  {%- endif %}
+ {%- if tool_call.arguments is defined %}
+ {%- for args_name, args_value in tool_call.arguments|items %}
+ {{- '<parameter=' + args_name + '>\n' }}
+ {%- set args_value = args_value | tojson | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
+ {{- args_value }}
+ {{- '\n</parameter>\n' }}
+ {%- endfor %}
+ {%- endif %}
+ {{- '</function>\n</tool_call>' }}
  {%- endfor %}
  {%- endif %}
  {{- '<|im_end|>\n' }}
  {%- elif message.role == "tool" %}
+ {%- if loop.previtem and loop.previtem.role != "tool" %}
  {{- '<|im_start|>user' }}
  {%- endif %}
  {{- '\n<tool_response>\n' }}
  {{- content }}
  {{- '\n</tool_response>' }}
+ {%- if not loop.last and loop.nextitem.role != "tool" %}
+ {{- '<|im_end|>\n' }}
+ {%- elif loop.last %}
  {{- '<|im_end|>\n' }}
  {%- endif %}
+ {%- else %}
+ {{- raise_exception('Unexpected message role.') }}
  {%- endif %}
  {%- endfor %}
  {%- if add_generation_prompt %}
+ {{- '<|im_start|>assistant\n' }}
+ {%- if enable_thinking is defined and enable_thinking is false %}
+ {{- '<think>\n\n</think>\n\n' }}
+ {%- else %}
+ {{- '<think>\n' }}
+ {%- endif %}
  {%- endif %}
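The chat template change replaces the old JSON tool-call serialization (`{"name": ..., "arguments": ...}`) with an XML-style `<function=...>` / `<parameter=...>` layout. A minimal Python sketch of that serialization logic, mirroring what the new Jinja template emits for a single call (the function name and arguments here are illustrative, not part of the template):

```python
import json


def render_tool_call(name: str, arguments: dict) -> str:
    """Sketch of the new tool-call format: one <parameter> block per argument,
    JSON-encoding only non-string container values, as the template does."""
    parts = [f"<tool_call>\n<function={name}>\n"]
    for arg_name, arg_value in arguments.items():
        if isinstance(arg_value, (dict, list)):
            arg_value = json.dumps(arg_value)  # mirrors the template's `tojson`
        else:
            arg_value = str(arg_value)
        parts.append(f"<parameter={arg_name}>\n{arg_value}\n</parameter>\n")
    parts.append("</function>\n</tool_call>")
    return "".join(parts)


print(render_tool_call("read_file", {"path": "main.py", "limit": 40}))
```

Unlike the old format, parameter values are emitted as raw text between tags, so multi-line strings need no JSON escaping; only dicts and lists are JSON-encoded.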
ref/adapter_config.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "alora_invocation_tokens": null,
+   "alpha_pattern": {},
+   "arrow_config": null,
+   "auto_mapping": null,
+   "base_model_name_or_path": "Qwen/Qwen3.5-9B",
+   "bias": "none",
+   "corda_config": null,
+   "ensure_weight_tying": false,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "peft_version": "0.18.1",
+   "qalora_group_size": 16,
+   "r": 32,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "in_proj_qkv",
+     "in_proj_z",
+     "q_proj",
+     "k_proj",
+     "in_proj_a",
+     "up_proj",
+     "in_proj_b",
+     "out_proj",
+     "v_proj",
+     "o_proj",
+     "gate_proj",
+     "down_proj"
+   ],
+   "target_parameters": null,
+   "task_type": "CAUSAL_LM",
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_qalora": false,
+   "use_rslora": false
+ }
ref/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:70ffdd72bab7a4b71e214dade0a0e64ab3b2f01b34649c3fffc11ecf367e6e22
+ size 173181568
tokenizer_config.json CHANGED
@@ -20,10 +20,8 @@
   "vision_bos_token": "<|vision_start|>",
   "vision_eos_token": "<|vision_end|>"
  },
-  "pad_token": "<|vision_pad|>",
-  "padding_side": "right",
+  "pad_token": "<|endoftext|>",
   "pretokenize_regex": "(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?[\\p{L}\\p{M}]+|\\p{N}| ?[^\\s\\p{L}\\p{M}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+",
-  "processor_class": "Qwen3VLProcessor",
   "split_special_tokens": false,
   "tokenizer_class": "TokenizersBackend",
   "unk_token": null,
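The tokenizer_config.json change swaps the pad token from `<|vision_pad|>` to `<|endoftext|>` and drops the `padding_side` and `processor_class` entries. A minimal sketch of applying that patch to a loaded config dict (the dict below is a stub of the relevant fields, not the full file):

```python
import json

# Stub of the affected tokenizer_config.json fields.
config = {
    "pad_token": "<|vision_pad|>",
    "padding_side": "right",
    "processor_class": "Qwen3VLProcessor",
    "tokenizer_class": "TokenizersBackend",
}

# Apply the commit's changes: new pad token, two keys removed.
config["pad_token"] = "<|endoftext|>"
for key in ("padding_side", "processor_class"):
    config.pop(key, None)

print(json.dumps(config, indent=2))
```

Removing `processor_class` means the tokenizer no longer advertises the vision processor, consistent with the pad token no longer being `<|vision_pad|>`.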