---
license: apache-2.0
base_model: Qwen/Qwen3.5-4B-Base
library_name: peft
pipeline_tag: text-generation
model_name: Graphite 1.0 4B
language:
- en
- ru
tags:
- qwen
- qwen3.5
- peft
- lora
- unsloth
- trl
- sft
- code
- reasoning
- bilingual
- obsidian
- graphite
---
# Graphite 1.0 4B
`Graphite 1.0 4B` is the first public LoRA adapter from the Graphite / Obsidian-Critic training stream. It is built on top of [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base) and tuned for strict, grounded, low-noise responses across:
- repo repair and debugging
- agent tool-use formatting
- technical writing and Markdown workflows
- code review and integration tasks
- logic and factual precision
- bilingual Russian / English instruction following
## What This Repository Contains
This repo contains a **LoRA adapter**, not merged base weights.
- Base model: `Qwen/Qwen3.5-4B-Base`
- Adapter type: `LoRA`
- Rank: `r=16`
- Alpha: `16`
- Dropout: `0.0`
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
Files of interest:
- `adapter_model.safetensors`: LoRA weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer assets
- `run_summary.json`: public training run summary
- `length_stats.json`: length filtering summary
- `masking_sanity.json`: formatting sanity check
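The settings above can be confirmed locally by reading `adapter_config.json` through `peft`, without downloading the base weights; a minimal sketch (the repo id matches the Quick Start section below):

```python
from peft import PeftConfig

# Loads only adapter_config.json; peft dispatches to LoraConfig here.
config = PeftConfig.from_pretrained("Starred09/obsidian-critic-qwen35-4b-base-lora")

print(config.base_model_name_or_path)                    # Qwen/Qwen3.5-4B-Base
print(config.r, config.lora_alpha, config.lora_dropout)  # 16 16 0.0
print(sorted(config.target_modules))
```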
## Training Lineage
This adapter corresponds to the **first public full-length Graphite 1.0 4B training stream** (a LoRA run, as opposed to the earlier smoke test).
- dataset family: **`obsidian-critic-broad-mix-20260321`**
- training stack: **Unsloth + TRL + torchrun DDP**
- base model: **`Qwen/Qwen3.5-4B-Base`**
Notebook lineage used for this stream:
- `obsidian_critic_qwen35_t4x2_unsloth_kaggle.ipynb`: smoke-test notebook for the broad mix
- `obsidian_critic_qwen35_t4x2_unsloth_kaggle_full.ipynb`: full-length training notebook used to produce the public LoRA run
## Dataset Provenance
The training data for this first public stream comes from the mixed dataset:
- dataset name: `obsidian-critic-broad-mix-20260321`
- examples in mixed dataset: `37,008`
- approximate token volume: `6,885,960`
- exact duplicate `(user, assistant)` pairs removed during mix build: `3,469`
- normalized near-duplicates removed from wave backfill rows: `201`
- dataset SHA-256: `5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a`
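The checksum can be re-verified before training; a minimal sketch, where the artifact filename is an assumption (the mix may ship as one or more files):

```python
import hashlib

EXPECTED = "5ba1924b46d08a8ab8ad7ed5e1f74b13cc3e847b3a04b714934953975fd9300a"

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    # Stream in 1 MiB chunks so large dataset files never load fully into memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical filename; point this at the actual mixed-dataset artifact.
assert sha256_of("obsidian-critic-broad-mix-20260321.jsonl") == EXPECTED
```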
The public training run then created a deterministic train / validation split and applied sequence-length filtering (see the pipeline sketch after this list):
- train rows before filter: `36,638`
- validation rows before filter: `370`
- train rows after filter: `36,081`
- validation rows after filter: `363`
- removed for length filtering: `564`
- minimum kept sequence length: `48`
- maximum kept sequence length: `2048`
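The exact preprocessing script is not published in this repo, but the counts above are consistent with a standard `datasets` pipeline; a minimal sketch, where the seed, file path, and `text` column name are assumptions:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-4B-Base")

# Hypothetical local path; the real mix is obsidian-critic-broad-mix-20260321.
ds = load_dataset("json", data_files="broad_mix.jsonl", split="train")

# Deterministic split: 370 of the 37,008 rows go to validation.
split = ds.train_test_split(test_size=370, shuffle=True, seed=42)  # seed assumed
train, val = split["train"], split["test"]

def keep(example):
    # Keep rows whose rendered conversation tokenizes to 48..2048 tokens.
    n = len(tokenizer(example["text"]).input_ids)
    return 48 <= n <= 2048

train = train.filter(keep)  # 36,638 -> 36,081 in the public run
val = val.filter(keep)      # 370 -> 363
```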
### Mix Roles
| Role | Examples | Approx. tokens |
| --- | ---: | ---: |
| `repair` | 5,353 | 983,890 |
| `tool_use` | 4,682 | 455,600 |
| `core_real` | 4,200 | 1,043,187 |
| `robustness` | 3,600 | 397,399 |
| `agent_core` | 3,200 | 645,426 |
| `logic` | 3,031 | 297,812 |
| `factual` | 2,960 | 142,787 |
| `obsidian_docs` | 2,740 | 490,850 |
| `reasoning` | 2,200 | 655,624 |
| `greenfield` | 1,488 | 563,331 |
| `integration` | 1,473 | 241,331 |
| `review` | 1,327 | 343,007 |
| `regularizer` | 500 | 55,840 |
| `wave_backfill` | 230 | 218,922 |
| `long_context` | 24 | 350,954 |
### Source Dataset Table
| Dataset | Role | Examples | Approx. tokens |
| --- | --- | ---: | ---: |
| `real-world-grounded-topup-sft-20260320` | `core_real` | 3,000 | 804,280 |
| `robustness-noise-traps-sft-20260320` | `robustness` | 3,200 | 363,139 |
| `factual-erudition-sft-20260319` | `factual` | 2,960 | 142,787 |
| `agent-gap-fixes-sft-20260320` | `agent_core` | 2,500 | 555,785 |
| `code-fix-critical-topup-sft-20260321` | `repair` | 2,500 | 440,355 |
| `code-agent-tooluse-sft-20260319` | `tool_use` | 2,400 | 208,335 |
| `docs-engineering-review-topup-sft-20260320` | `obsidian_docs` | 1,600 | 206,794 |
| `format-tool-discipline-sft-20260319` | `tool_use` | 1,582 | 179,009 |
| `multi-step-debug-sft-20260319` | `reasoning` | 1,200 | 345,084 |
| `real-world-seed-expansion-sft-20260321` | `core_real` | 1,200 | 238,907 |
| `runtime-debug-grounded-sft-20260319` | `repair` | 1,193 | 218,845 |
| `logic-core-sft-20260319` | `logic` | 1,131 | 139,188 |
| `code-architecture-sft-20260319` | `greenfield` | 1,100 | 308,380 |
| `tdd-test-first-sft-20260319` | `reasoning` | 1,000 | 310,540 |
| `logic-sanity-sft-20260319` | `logic` | 996 | 69,260 |
| `code-repair-patch-sft-20260319` | `repair` | 955 | 145,544 |
| `logic-precision-ru-sft-20260319` | `logic` | 904 | 89,364 |
| `security-repair-review-sft-20260319` | `review` | 893 | 159,086 |
| `db-and-migrations-sft-20260319` | `integration` | 867 | 119,740 |
| `agent-gap-fixes-ru-topup-sft-20260320` | `agent_core` | 700 | 89,641 |
| `code-agent-tooluse-ru-topup-sft-20260320` | `tool_use` | 700 | 68,256 |
| `multi-file-repo-repair-sft-20260319` | `repair` | 705 | 179,146 |
| `backend-frontend-ops-sft-20260319` | `integration` | 606 | 121,591 |
| `docs-topup-sft-20260320` | `obsidian_docs` | 600 | 125,125 |
| `anti-overthinking-pack-sft-20260321` | `regularizer` | 500 | 55,840 |
| `docs-markdown-sft-20260318-v3` | `obsidian_docs` | 440 | 149,852 |
| `ts-rust-code-review-sft-20260318-v3` | `review` | 434 | 183,921 |
| `robustness-noise-traps-ru-topup-sft-20260320` | `robustness` | 400 | 34,260 |
| `ts-rust-coding-sft-20260318-v3` | `greenfield` | 388 | 254,951 |
| `wave-03-growth-sft-20260320` | `wave_backfill` | 230 | 218,922 |
| `docs-topup-ru-sft-20260320` | `obsidian_docs` | 100 | 9,079 |
| `long-context-memory-topup-sft-20260321` | `long_context` | 24 | 350,954 |
## Representative Training Examples
These are short excerpts from the actual broad-mix and component datasets used in the first public stream.
### Broad mix: grounded repo repair
**System**
```text
Ground the answer in one active boundary, not in general advice.
```
**User**
```text
First fix failed. Repo: packages/notes-index
Patch 1 removed one fallback in `app/config.py`.
New evidence: `docker compose: service worker exited with code 1`.
Need the second minimal fix.
```
**Assistant**
```text
The first patch hit the wrong seam. The new signal points back to `app/config.py`.
```
### Broad mix: Russian ambiguity tightening
**System**
```text
Write like a strict Russian-language logic editor: concise, clear, no philology for its own sake.
```
**User**
```text
Evaluate this wording from an instruction document:
`Разрешён доступ аналитики и владельцы без резервной копией.`
```
**Assistant**
```text
Yes. It is unclear whether `без резервной копией` applies to both groups or only to the second.
Tighter: `И аналитики, и владельцы получают доступ без резервной копией.`
```
### Tool-use source: strict JSON action selection
**System**
```text
You are a code-workspace agent. Return only valid JSON.
```
**User**
```text
Fix this broken tool call and return only valid JSON.
{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}
```
**Assistant**
```json
{"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}
```
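The correction illustrates the trained contract: the key is `arguments`, not `args`, and off-schema keys such as the string-typed `limit` are dropped. A minimal validator sketch, with the allowed-key set inferred from this one excerpt rather than from a published schema:

```python
import json

# Inferred from the excerpt above; the real action schema is not published.
ALLOWED_KEYS = {"query", "path"}

def normalize_tool_call(raw: str) -> str:
    call = json.loads(raw)
    # Accept the common "args" misnaming and fold it into "arguments".
    arguments = call.pop("args", None) or call.get("arguments", {})
    # Drop anything outside the schema, e.g. the string-typed "limit" key.
    call["arguments"] = {k: v for k, v in arguments.items() if k in ALLOWED_KEYS}
    return json.dumps(call, ensure_ascii=False, separators=(",", ":"))

broken = '{"tool":"search_codebase","args":{"query":"normalize_note_path","path":"src","limit":"4"}}'
print(normalize_tool_call(broken))
# -> {"tool":"search_codebase","arguments":{"query":"normalize_note_path","path":"src"}}
```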
### Docs source: structured README generation
**System**
```text
You are a technical writer for engineering workflows. Useful structure only.
```
**User**
```text
Write a complete README.md for `md-audit`.
Context: scan Markdown repositories for broken links, empty headings, duplicate slugs, and metadata regressions.
```
### Logic source: contradiction checking
**System**
```text
Be careful with conditions and quantifiers. Prefer precision over flashiness.
```
**User**
```text
Check this set of statements for consistency.
1. `Борис` is a guest -> cannot publish.
2. `Борис` is a guest.
3. `Борис` can publish.
```
**Assistant**
```text
No. The conclusion from 1+2 contradicts statement 3.
```
## Training Recipe
The public run used the following settings (a code sketch follows the list):
- distributed setup: **`torchrun` DDP**
- training framework: **Unsloth + TRL**
- base model loading: **4-bit**
- LoRA rank: **16**
- LoRA alpha: **16**
- LoRA dropout: **0.0**
- max sequence length: **2048**
- per-device train batch size: **1**
- gradient accumulation steps: **8**
- effective global batch size: **16** examples / optimization step
- epochs: **1**
- optimizer: **`adamw_8bit`**
- scheduler: **cosine**
- learning rate: **1e-4**
- warmup steps: **5**
- gradient checkpointing: **enabled**
- FP16: **forced**
- packing: **disabled**
- completion-only loss: **disabled**
- public run total steps: **2256**
- logging / eval / save cadence: **50 / 125 / 250**
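A sketch of the same recipe in plain TRL + PEFT terms, assuming the `train` / `val` datasets from the provenance section; argument names follow recent TRL releases and may shift between versions, and the Unsloth / `torchrun` specifics are omitted:

```python
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

lora = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="outputs",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # 1 x 8 x 2 GPUs = 16 effective
    num_train_epochs=1,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_steps=5,
    optim="adamw_8bit",
    fp16=True,
    gradient_checkpointing=True,
    packing=False,
    max_seq_length=2048,            # renamed max_length in newer TRL
    logging_steps=50,
    eval_strategy="steps",          # evaluation_strategy in older transformers
    eval_steps=125,
    save_steps=250,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3.5-4B-Base",   # a preloaded (e.g. 4-bit) model also works
    args=args,
    train_dataset=train,
    eval_dataset=val,
    peft_config=lora,
)
trainer.train()
```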
## Prompt Style
This adapter was trained on a simple, explicit prompt layout:
```text
System:
<system prompt>

User:
<user prompt>

Assistant:
```
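A tiny helper that reproduces this layout (the function name is illustrative, not part of the repo):

```python
def build_prompt(system: str, user: str) -> str:
    # Labeled blocks separated by a blank line, ending with an open
    # "Assistant:" header for the model to complete.
    return f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"
```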
For best results, keep prompts concise, grounded, and task-shaped. The adapter responds best to:
- repo repair tasks with concrete evidence
- exact wording / logic cleanup tasks
- tool-call selection with explicit schemas
- technical writing with clear requested sections
- review / integration prompts that specify files, symptoms, and expected outcomes
## Quick Start
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-4B-Base"
adapter_id = "Starred09/obsidian-critic-qwen35-4b-base-lora"

# Tokenizer assets are shipped with the adapter repo.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)

system = "Return the smallest useful answer. Do not invent missing evidence."
user = "Repo: apps/desktop-shell. Build fails with ENOENT on dist/server.js. Point to the first file to inspect."

# Build the training-time layout with real newlines between labeled blocks.
prompt = f"System:\n{system}\n\nUser:\n{user}\n\nAssistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=160)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
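Because this repository ships only the adapter, a standalone checkpoint can be exported locally by merging the LoRA weights into the base model. A minimal sketch continuing from the Quick Start objects above (the output directory name is illustrative; the base was loaded unquantized there, which keeps the merge clean):

```python
# Fold the LoRA deltas into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("graphite-1.0-4b-merged")
tokenizer.save_pretrained("graphite-1.0-4b-merged")
```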
## Intended Use
Graphite 1.0 4B is intended for:
- coding assistants
- repo triage and patch-planning copilots
- Markdown / docs tooling assistants
- logic and wording critique
- bilingual technical task routing
It is especially useful when you want **short, grounded, non-theatrical outputs** instead of generic assistant prose.
## Limitations
- This is an **adapter**, not a standalone merged model.
- It is tuned for **structured technical work**, not general consumer chat.
- It inherits both strengths and weaknesses from `Qwen/Qwen3.5-4B-Base`.
- The broad mix is intentionally heavy on repair, tool-use, and reasoning, so purely creative behavior is not a target.
## License
This repository is released under **Apache License 2.0**. See [`LICENSE`](./LICENSE).
Please also review the license and usage terms of the base model:
- [`Qwen/Qwen3.5-4B-Base`](https://huggingface.co/Qwen/Qwen3.5-4B-Base)
## Acknowledgements
- Alibaba Qwen team for the base model
- Unsloth for the efficient LoRA training stack
- TRL / Transformers / PEFT / PyTorch maintainers