Update model card: set base_model to Qwen/Qwen3.5-9B, add full dataset lineage

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model:
+base_model: Qwen/Qwen3.5-9B
 tags:
 - qwen3.5
 - code
@@ -8,7 +8,13 @@ tags:
 - lora
 - sft
 - unsloth
+- reasoning
+- chain-of-thought
 datasets:
+- nohurry/Opus-4.6-Reasoning-3000x-filtered
+- Roman1111111/claude-opus-4.6-10000x
+- TeichAI/claude-4.5-opus-high-reasoning-250x
+- Jackrong/Qwen3.5-reasoning-700x
 - togethercomputer/CoderForge-Preview
 language:
 - en
@@ -17,7 +23,25 @@ pipeline_tag: text-generation
 
 # Qwen3.5-DeltaCoder-9B
 
-A LoRA fine-tune of [Qwen3.5-9B
+A LoRA fine-tune of [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) trained to improve structured tool-call generation (JSON formatting) for use in coding agents like OpenCode, Pi, and Cline.
+
+The fine-tune builds on top of [Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2](https://huggingface.co/Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2), a reasoning distillation of Qwen3.5-9B trained on Claude 4.6 Opus reasoning traces. All datasets used across the full training lineage are listed above.
+
+## Training Lineage
+
+```
+Qwen/Qwen3.5-9B-Base
+└─ Qwen/Qwen3.5-9B (instruction tuned)
+   └─ Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2
+      (SFT on Claude 4.6 Opus reasoning traces for efficient chain-of-thought)
+      Datasets: nohurry/Opus-4.6-Reasoning-3000x-filtered,
+                Roman1111111/claude-opus-4.6-10000x,
+                TeichAI/claude-4.5-opus-high-reasoning-250x,
+                Jackrong/Qwen3.5-reasoning-700x
+      └─ danielcherubini/Qwen3.5-DeltaCoder-9B ← this model
+         (LoRA SFT on CoderForge-Preview for tool-call reliability)
+         Dataset: togethercomputer/CoderForge-Preview
+```
 
 ## Training Details
 
@@ -49,7 +73,7 @@ Final training loss: ~0.94 (average: 1.268), decreasing steadily over training.
 
 ## Recommended Sampling Settings
 
-
+Validated through testing with [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp) and [Kronk](https://github.com/danielcherubini/kronk) on an RTX 3080 10GB.
 
 | Profile | temperature | top_k | top_p | min_p | presence_penalty |
 |---------|-------------|-------|-------|-------|-----------------|
@@ -93,6 +117,16 @@ tokenizer = AutoTokenizer.from_pretrained("danielcherubini/Qwen3.5-DeltaCoder-9B
 
 Pre-quantized GGUF files available at [danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF](https://huggingface.co/danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF).
 
+## Benchmarks
+
+| Model | HumanEval | HumanEval+ |
+|-------|-----------|------------|
+| Jackrong v2 (base) | 53.7% | — |
+| **DeltaCoder-9B** (temp=0.6) | **50.6%** | **49.4%** |
+| DeltaCoder-9B (greedy) | 43.9% | 42.1% |
+
+Terminal-Bench easy tasks: **2/4 (50%)** — use recommended sampling settings (temp=0.6).
+
 ## Intended Use
 
 This model is designed for AI coding agents that rely on structured tool calls (JSON function calling). It improves the base model's ability to generate well-formed tool-call responses in multi-turn agent trajectories.
@@ -101,4 +135,5 @@ This model is designed for AI coding agents that rely on structured tool calls (
 
 - [Unsloth](https://unsloth.ai) for Qwen3.5 training support
 - [Together AI](https://together.ai) for the CoderForge dataset
-- [Jackrong](https://huggingface.co/Jackrong) for the
+- [Jackrong](https://huggingface.co/Jackrong) for the reasoning distillation
+- [Qwen](https://huggingface.co/Qwen) for the base model