danielcherubini committed
Commit 0bebe4e · verified · 1 Parent(s): 3516ea8

Update model card: set base_model to Qwen/Qwen3.5-9B, add full dataset lineage
Files changed (1): README.md (+39 −4)

README.md CHANGED
@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model: Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2
+base_model: Qwen/Qwen3.5-9B
 tags:
 - qwen3.5
 - code
@@ -8,7 +8,13 @@ tags:
 - lora
 - sft
 - unsloth
+- reasoning
+- chain-of-thought
 datasets:
+- nohurry/Opus-4.6-Reasoning-3000x-filtered
+- Roman1111111/claude-opus-4.6-10000x
+- TeichAI/claude-4.5-opus-high-reasoning-250x
+- Jackrong/Qwen3.5-reasoning-700x
 - togethercomputer/CoderForge-Preview
 language:
 - en
@@ -17,7 +23,25 @@ pipeline_tag: text-generation
 
 # Qwen3.5-DeltaCoder-9B
 
-A LoRA fine-tune of [Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2](https://huggingface.co/Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2) trained to improve structured tool-call generation (JSON formatting) for use in coding agents like OpenCode, Pi, and Cline.
+A LoRA fine-tune of [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) trained to improve structured tool-call generation (JSON formatting) for use in coding agents like OpenCode, Pi, and Cline.
+
+The fine-tune builds on top of [Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2](https://huggingface.co/Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2), a reasoning distillation of Qwen3.5-9B trained on Claude 4.6 Opus reasoning traces. All datasets used across the full training lineage are listed above.
+
+## Training Lineage
+
+```
+Qwen/Qwen3.5-9B-Base
+└─ Qwen/Qwen3.5-9B (instruction tuned)
+   └─ Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2
+      (SFT on Claude 4.6 Opus reasoning traces for efficient chain-of-thought)
+      Datasets: nohurry/Opus-4.6-Reasoning-3000x-filtered,
+                Roman1111111/claude-opus-4.6-10000x,
+                TeichAI/claude-4.5-opus-high-reasoning-250x,
+                Jackrong/Qwen3.5-reasoning-700x
+      └─ danielcherubini/Qwen3.5-DeltaCoder-9B ← this model
+         (LoRA SFT on CoderForge-Preview for tool-call reliability)
+         Dataset: togethercomputer/CoderForge-Preview
+```
 
 ## Training Details
 
@@ -49,7 +73,7 @@ Final training loss: ~0.94 (average: 1.268), decreasing steadily over training.
 
 ## Recommended Sampling Settings
 
-These settings were validated through testing with [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp) and [Kronk](https://github.com/danielcherubini/kronk) on an RTX 3080 10GB.
+Validated through testing with [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp) and [Kronk](https://github.com/danielcherubini/kronk) on an RTX 3080 10GB.
 
 | Profile | temperature | top_k | top_p | min_p | presence_penalty |
 |---------|-------------|-------|-------|-------|------------------|
@@ -93,6 +117,16 @@ tokenizer = AutoTokenizer.from_pretrained("danielcherubini/Qwen3.5-DeltaCoder-9B
 
 Pre-quantized GGUF files available at [danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF](https://huggingface.co/danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF).
 
+## Benchmarks
+
+| Model | HumanEval | HumanEval+ |
+|-------|-----------|------------|
+| Jackrong v2 (base) | 53.7% | — |
+| **DeltaCoder-9B** (temp=0.6) | **50.6%** | **49.4%** |
+| DeltaCoder-9B (greedy) | 43.9% | 42.1% |
+
+Terminal-Bench easy tasks: **2/4 (50%)** — use recommended sampling settings (temp=0.6).
+
 ## Intended Use
 
 This model is designed for AI coding agents that rely on structured tool calls (JSON function calling). It improves the base model's ability to generate well-formed tool-call responses in multi-turn agent trajectories.
@@ -101,4 +135,5 @@ This model is designed for AI coding agents that rely on structured tool calls (
 
 - [Unsloth](https://unsloth.ai) for Qwen3.5 training support
 - [Together AI](https://together.ai) for the CoderForge dataset
-- [Jackrong](https://huggingface.co/Jackrong) for the base model
+- [Jackrong](https://huggingface.co/Jackrong) for the reasoning distillation
+- [Qwen](https://huggingface.co/Qwen) for the base model
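The card's stated goal is well-formed JSON tool calls in multi-turn agent loops. As a rough illustration of what that reliability buys on the agent side, here is a minimal validator for one common tool-call shape; the `{"name": ..., "arguments": {...}}` schema and field names are an assumption for this sketch, not something the model card defines, and real agents (OpenCode, Pi, Cline) each use their own schemas.

```python
import json

def parse_tool_call(raw: str) -> dict:
    """Parse and minimally validate a model-emitted tool call.

    Assumes the common {"name": str, "arguments": dict} convention;
    this shape is illustrative, not the card's official schema.
    """
    call = json.loads(raw)  # json.JSONDecodeError (a ValueError) on malformed JSON
    if not isinstance(call.get("name"), str):
        raise ValueError("tool call missing string 'name' field")
    if not isinstance(call.get("arguments"), dict):
        raise ValueError("tool call missing object 'arguments' field")
    return call

# A well-formed call parses cleanly:
call = parse_tool_call('{"name": "read_file", "arguments": {"path": "README.md"}}')

# Truncated or malformed output (the failure mode the fine-tune targets)
# is rejected before the agent tries to execute anything:
try:
    parse_tool_call('{"name": "read_file", "arguments": ')
except ValueError:
    pass
```

A fine-tune that emits this shape consistently lets the agent skip retry loops on parse failures, which matters most in long multi-turn trajectories.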