Update model card: set base_model to Qwen/Qwen3.5-9B, add full dataset lineage

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model:
+base_model: Qwen/Qwen3.5-9B
 tags:
 - qwen3.5
 - code
@@ -8,7 +8,13 @@ tags:
 - lora
 - sft
 - unsloth
+- reasoning
+- chain-of-thought
 datasets:
+- nohurry/Opus-4.6-Reasoning-3000x-filtered
+- Roman1111111/claude-opus-4.6-10000x
+- TeichAI/claude-4.5-opus-high-reasoning-250x
+- Jackrong/Qwen3.5-reasoning-700x
 - togethercomputer/CoderForge-Preview
 language:
 - en
@@ -17,7 +23,25 @@ pipeline_tag: text-generation
 
 # Qwen3.5-DeltaCoder-9B
 
-A LoRA fine-tune of [Qwen3.5-9B
+A LoRA fine-tune of [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) trained to improve structured tool-call generation (JSON formatting) for use in coding agents like OpenCode, Pi, and Cline.
+
+The fine-tune builds on top of [Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2](https://huggingface.co/Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2), a reasoning distillation of Qwen3.5-9B trained on Claude 4.6 Opus reasoning traces. All datasets used across the full training lineage are listed above.
+
+## Training Lineage
+
+```
+Qwen/Qwen3.5-9B-Base
+└─ Qwen/Qwen3.5-9B (instruction tuned)
+   └─ Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2
+      (SFT on Claude 4.6 Opus reasoning traces for efficient chain-of-thought)
+      Datasets: nohurry/Opus-4.6-Reasoning-3000x-filtered,
+                Roman1111111/claude-opus-4.6-10000x,
+                TeichAI/claude-4.5-opus-high-reasoning-250x,
+                Jackrong/Qwen3.5-reasoning-700x
+      └─ danielcherubini/Qwen3.5-DeltaCoder-9B ← this model
+         (LoRA SFT on CoderForge-Preview for tool-call reliability)
+         Dataset: togethercomputer/CoderForge-Preview
+```
 
 ## Training Details
 
@@ -49,7 +73,7 @@ Final training loss: ~0.94 (average: 1.268), decreasing steadily over training.
 
 ## Recommended Sampling Settings
 
-
+Validated through testing with [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp) and [Kronk](https://github.com/danielcherubini/kronk) on an RTX 3080 10GB.
 
 | Profile | temperature | top_k | top_p | min_p | presence_penalty |
 |---------|-------------|-------|-------|-------|-----------------|
@@ -93,6 +117,16 @@ tokenizer = AutoTokenizer.from_pretrained("danielcherubini/Qwen3.5-DeltaCoder-9B
 
 Pre-quantized GGUF files available at [danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF](https://huggingface.co/danielcherubini/Qwen3.5-DeltaCoder-9B-GGUF).
 
+## Benchmarks
+
+| Model | HumanEval | HumanEval+ |
+|-------|-----------|------------|
+| Jackrong v2 (base) | 53.7% | — |
+| **DeltaCoder-9B** (temp=0.6) | **50.6%** | **49.4%** |
+| DeltaCoder-9B (greedy) | 43.9% | 42.1% |
+
+Terminal-Bench easy tasks: **2/4 (50%)** — use recommended sampling settings (temp=0.6).
+
 ## Intended Use
 
 This model is designed for AI coding agents that rely on structured tool calls (JSON function calling). It improves the base model's ability to generate well-formed tool-call responses in multi-turn agent trajectories.
@@ -101,4 +135,5 @@ This model is designed for AI coding agents that rely on structured tool calls (
 
 - [Unsloth](https://unsloth.ai) for Qwen3.5 training support
 - [Together AI](https://together.ai) for the CoderForge dataset
-- [Jackrong](https://huggingface.co/Jackrong) for the
+- [Jackrong](https://huggingface.co/Jackrong) for the reasoning distillation
+- [Qwen](https://huggingface.co/Qwen) for the base model